
Reading excel files in pandas

I am trying to read an Excel file using pandas, but I am not sure whether I can read it the way I need.

My file is like this:

[image of the Excel sheet, not included]

I am reading the file like this:

import pandas as pd

excel_file = pd.ExcelFile('MY_FILE')
df = excel_file.parse(sheet_name=0, header=1)

This reads the file, but I cannot tell which group each variable belongs to. For each column, I need to know which group it comes from. Is there any way to do this?

Thank you!

briba asked Nov 24 '25 02:11


2 Answers

In read_excel you can pass the first and second rows to the header parameter to get a MultiIndex in the columns, and use index_col to build the index from the first column:

df = pd.read_excel('file.xlsx', header=[0,1], index_col=[0], sheet_name=0)

Your own code works with the same parameters:

excel_file = pd.ExcelFile('file.xlsx')
df = excel_file.parse(header=[0,1], index_col=[0], sheet_name=0)

print (df)
CUSTOM NAME   g1      g2          
NAME           A    B  A    B    C
NAME 1       1.0  NaN  1  NaN  1.0
NAME 1       NaN  1.0  1  1.0  NaN

print (df.columns)
MultiIndex(levels=[['g1', 'g2'], ['A', 'B', 'C']],
           codes=[[0, 0, 1, 1, 1], [0, 1, 0, 1, 2]],
           names=['CUSTOM NAME', 'NAME'])

print (df.index)
Index(['NAME 1', 'NAME 1'], dtype='object')

Filtering works by using tuples to select columns of the MultiIndex:

print (df[df[('g1', 'A')] == 1])
CUSTOM NAME   g1     g2         
NAME           A   B  A   B    C
NAME 1       1.0 NaN  1 NaN  1.0

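For more information, see "Select rows in pandas MultiIndex DataFrame". The examples there use loc because the MultiIndex is in the rows; here the MultiIndex is in the columns, so drop loc and index the DataFrame directly.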
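To answer the original question directly (which group does each column come from), the group labels can be read off the first level of the column MultiIndex. The sketch below is only an illustration: it builds a small DataFrame by hand as a stand-in for the parsed sheet, so the values and the second row label are assumptions, not the asker's real file.

```python
import pandas as pd

# Hypothetical stand-in for the parsed sheet: two header levels (group, variable).
columns = pd.MultiIndex.from_tuples(
    [("g1", "A"), ("g1", "B"), ("g2", "A"), ("g2", "B"), ("g2", "C")],
    names=["CUSTOM NAME", "NAME"],
)
df = pd.DataFrame(
    [[1.0, None, 1, None, 1.0], [None, 1.0, 1, 1.0, None]],
    index=["NAME 1", "NAME 2"],
    columns=columns,
)

# Group label of every column, in column order.
groups = df.columns.get_level_values("CUSTOM NAME").tolist()

# Variables collected per group; each column label is a (group, variable) tuple.
variables_by_group = {}
for group, variable in df.columns:
    variables_by_group.setdefault(group, []).append(variable)

# Selecting a top-level label returns only that group's columns.
g1 = df["g1"]
```

Here groups holds one group label per column, variables_by_group maps each group to its variables, and g1 is a DataFrame whose columns are the variables of group g1 only.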

jezrael answered Nov 25 '25 15:11


You get MultiIndex columns if you pass a list of integers to header:

excel_file = pd.ExcelFile('example.xlsx')
df = excel_file.parse(sheet_name=0, header=[0,1])

The resulting DataFrame:

CUSTOM NAME     GROUP 1     GROUP 2
NAME            A   B       A   B   C
NAME 1          1.0 NaN     1   NaN 1.0
NAME 2          NaN 1.0     1   1.0 NaN

Documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html
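If single-level column names are easier to work with downstream, the two header levels can be joined so that each name still carries its group. This is a minimal sketch on a hand-built frame that mimics the output above; the underscore separator and the name flat are arbitrary choices for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame parsed with header=[0, 1].
columns = pd.MultiIndex.from_tuples(
    [("GROUP 1", "A"), ("GROUP 1", "B"),
     ("GROUP 2", "A"), ("GROUP 2", "B"), ("GROUP 2", "C")]
)
df = pd.DataFrame(
    [[1.0, None, 1, None, 1.0], [None, 1.0, 1, 1.0, None]],
    columns=columns,
)

# Join (group, variable) into one string per column, on a copy of the frame.
flat = df.copy()
flat.columns = [f"{group}_{variable}" for group, variable in df.columns]
flat_names = list(flat.columns)
```

The original df keeps its MultiIndex, so both views remain available.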

PythonSherpa answered Nov 25 '25 15:11


