I am bulk inserting a csv file into SQL Server 2012. The data is currently | pipe delimited as one long string for each row. I'd like to separate the data into the different columns at each pipe.
Here is how the data looks as its imported:
ID|ID2|Person|Person2|City|State
"1"|"ABC"|"Joe"|"Ben"|"Boston"|"MA"
"2"|"ABD"|"Jack"|"Tim"|"Nashua"|"NH"
"3"|"ADC"|"John"|"Mark"|"Hartford"|"CT"
I'd liek to separate the data into the columns at each pipe:
ID ID2 Person Person2 City State
1 ABC Joe Ben Boston MA
2 ABD Jack Tim Nashua NH
3 AFC John Mark Hartford CT
I'm finding it difficult to use charindex and substring functions because of the number of columns of the data also I've tried to use ParseName since that is a 2012 function but thats not working either as all the columns come out as NULL with ParseName
The file contains about 300k rows and I've found a solution using xmlname but it is very slow. ie: takes a minute to separate the data.
Here's the slow xml solution:
CREATE TABLE #tbl(iddata varchar(200))
DECLARE @i int = 0
WHILE @i < 100000
BEGIN
SET @i = @i + 1
INSERT INTO #tbl(iddata)
SELECT '"1"|"ABC"|"Joe"|"Ben"|"Boston"|"MA"'
UNION ALL
SELECT '"2"|"ABD"|"Jack"|"Tim"|"Nashua"|"NH"'
UNION ALL
SELECT '"3"|"AFC"|"John"|"Mark"|"Hartford"|"CT"'
END
;WITH XMLData
AS
(
SELECT idData,
CONVERT(XML,'<IDs><id>'
+ REPLACE(iddata,'|', '</id><id>') + '</id></IDs>') AS xmlname
FROM (
SELECT REPLACE(iddata,'"','') as iddata
FROM #tbl
)x
)
SELECT xmlname.value('/IDs[1]/id[1]','varchar(100)') AS ID,
xmlname.value('/IDs[1]/id[2]','varchar(100)') AS ID2,
xmlname.value('/IDs[1]/id[3]','varchar(100)') AS Person,
xmlname.value('/IDs[1]/id[4]','varchar(100)') AS Person2,
xmlname.value('/IDs[1]/id[5]','varchar(100)') AS City,
xmlname.value('/IDs[1]/id[6]','varchar(100)') AS State
FROM XMLData
This will do it for you.
CREATE TABLE #Import (
ID NVARCHAR(MAX),
ID2 NVARCHAR(MAX),
Person NVARCHAR(MAX),
Person2 NVARCHAR(MAX),
City NVARCHAR(MAX),
State NVARCHAR(MAX))
SET QUOTED_IDENTIFIER OFF
BULK INSERT #Import
FROM 'C:\MyFile.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = '|',
ROWTERMINATOR = '\n',
ERRORFILE = 'C:\myRubbishData.log'
)
select * from #Import
DROP TABLE #Import
Unfortunately using BULK INSERT will not deal with text qualifiers, so you will end up with "ABC" rather than ABC.
Either remove the text qualifiers from the csv file, or run a replace on your table once the data has been imported.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With