Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Import Excel (.xlsx) with SSIS and Excel source with worksheet containing multiple subsections

Tags:

import

excel

ssis

I need your help.

I have an .xlsx file which looks like this:

enter image description here

My goal is to create a SSIS package which pushes this data into a DB table.

Now, col1 to col5 is ok, but each section has a name on top of it and that is supposed to be column 6 in the table. So the final destination table looks like:

col1 | col2 | col3 | col4 | col5 | col6 (Firstname Lastname from the top)

So far I have tried:

  1. Creating a recordset out of the excel sheet
  2. read recordset row by row using Ado foreach enumerator
  3. within the for each enumerator I have a set of variables that will represent the columns
  4. these variables I am flushing out to a data flow task which converts variables to columns using derived column and pushes it into the ODBC destination

Obviously this did not work out for me, I always get the message "0 rows inserted in the ODBC destination" when I run the package.

To be honest I am not really sure how to solve this problem.

Any help is highly appreciated!!!

Thanks in Advance!!!

Edit:

PS: I can not use any one time or Power BI / Query tricks here. It has to be pure SSIS.

like image 502
SutharMonil Avatar asked Sep 08 '25 06:09

SutharMonil


1 Answers

The solution I can think of, is through script task, in these steps

  1. Read the Excel file from row 4
  2. Script task to add name to the data rows
  3. Remove column headings via conditional split
  • for the first step open Excel source properties, unselect column names in the first row, and set the OpenRowset to read from row four (Sheet1$A4:E) :

enter image description here

  • for the second step create a script component transformation. Select the 5 columns as input and create a new output variable (Name, in this case)

enter image description here

The script itself:

public class ScriptMain : UserComponent
{


    string keepname;
    
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        if (Row.F5_IsNull)
            { keepname = Row.F1;
            
        }
        else
        {
            if (Row.F5 != "col5")
            {
                Row.Name = keepname;
            }

        }
    }
    
}

Explanation: checks if it is a row containing name (col5 is null) and it so, it saves the name to variable keepname. if is not that, and it is not a heading (col1='col1') then update the variable Name.

  • Last step is just a clean up via conditional split

enter image description here

What this does, is split the rows that are 'name rows' or headings. The you just need to carry one with default output.

Test:

enter image description here

like image 142
Jayvee Avatar answered Sep 10 '25 22:09

Jayvee