I have an excel sheet which contains data generated by running a program with different configurations. Two sample outputs are as follows. My first problem is pd.read_excel() introduces incremental digit suffices to the column names after the first group is read. For example the first dataframe has the correct column names "A,B,C,D" while the second dataframe has "A.1,B.1,C.1,D.1" and the third "A.2,B.2,C.2,D.2" and so on. How do I make sure that all have the same column names?
My data on excel

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
file_locn = ''r'C:\Users\me\Desktop\output.xlsx'''
df = pd.read_excel(file_locn, sheet_name='1', skiprows=1)
print(df)
df_256 = df[df.columns[0:5]]
df_128 = df[df.columns[6:11]]
df_64 = df[df.columns[12:17]]
print(df_256)
print(df_128)
print(df_64)
Output:
print(df)

print(df_256)

print(df_128)

If duplicated columns names pandas add .Number for create unique columns names.
Here is possible create MultiIndex in columns like:
df = pd.read_excel(file_locn, sheet_name='1', header=[0,1])
And then use DataFrame.xs, if first level is parsed like strings use:
df_256 = df.xs('256', axis=1, level=0)
df_128 = df.xs('128', axis=1, level=0)
df_64 = df.xs('64', axis=1, level=0)
If parsed like integers:
df_256 = df.xs(256, axis=1, level=0)
df_128 = df.xs(128, axis=1, level=0)
df_64 = df.xs(64, axis=1, level=0)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With