Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Concatenate Values from multiple rows in SSIS

Tags:

ssis

I have a file source where the data looks like this:

ID BarcodeNumber
------------------------
1  123456789    
1  33    
2  987654321    
2  44

I would like to get the data as follows:

ID BarcodeNumber
------------------------
1  12345678933    
2  98765432144

Any suggestions on how to accomplish this?

EDIT: I would like to do this in the data flow as this could potentially have billions of rows.

like image 908
user1389739 Avatar asked Oct 25 '25 05:10

user1389739


2 Answers

The desired state is that for each ID, all of the barcodes will be concatenated together.

To accomplish this, your data must be sorted by ID. If the file isn't sorted already, then you'll need to run it through the Sort task.

Once you have sorted data, then you will need to add a Script Transformation in Asynchronous mode. It'll be async since the number of rows in will not be the same number as rows out.

The psuedo code will be something like

If Row.ID <> LastID
    If LastID <> ""
        Output0Buffer.ID = LastID
        Output0Buffer.Barcodes = Barcodes
    LastID = Row.ID
    Barcodes = ""

Barcodes = Barcodes + Row.Barcodes

Caveats

  • The above code is neither C# or VB.NET but the logic should be sound
  • If you have billions of rows, you are going to need some serious RAM on this machine as SSIS is an in-memory transformation system.
  • The Sort operation and this script task are going to be slow and memory intensive as they cannot re-use memory space
like image 101
billinkc Avatar answered Oct 27 '25 00:10

billinkc


Data flow

Source File:

ID,RowNum,Barcode   
1,1,123456789  
1,2,33    
2,1,987654321    
2,2,44

Split the rows of the file. For the simple case as assumed in the source file here, we can use the conditional split. for more involved situation, we can use Script component as the splitter. From here on, the solution would be pretty standard wrt sort and merge. Once merged, add a derived column: TRIM(Barcode1) + TRIM(Barcode2) where Barcode1 is coming from the "left side" which has rows with RowNum 1, and Barcode2 from the "right side". If you are concerned about the performance of sort component, put the split rows into two tables, and compare the performance.

Cond. Split: RowNum1 (RowNum == "1"), RowNum2 (RowNum == "2")
Sort: By ID; Pass through: Barcode
Merge Join: Left outer join, Join key: ID; Output: ID, Barcode1 and Barcode2
Derived column: TRIM(Barcode1) + TRIM(Barcode2)
like image 40
Anoop Verma Avatar answered Oct 26 '25 23:10

Anoop Verma



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!