Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Add header with merged cells from one excel and insert to another excel Pandas

I have been searching over on how to append/insert/concat a row from one excel to another but with merged cells. I was not able to find what I am looking for.

What I need to get is this: enter image description here

and append to the very first row of this: enter image description here

I tried using pandas append() but it destroyed the arrangement of columns.

df = pd.DataFrame()

for f in ['merge1.xlsx', 'test1.xlsx']:
    data = pd.read_excel(f, 'Sheet1')
    df = df.append(data)

df.to_excel('test3.xlsx')

Is there way pandas could do it? I just need to literally insert the header to the top row.

Although I am still trying to find a way, it would actually be fine to me if this question had a duplicate as long as I can find answers or advice.

like image 460
Ricky Aguilar Avatar asked Oct 26 '25 03:10

Ricky Aguilar


1 Answers

You can use pd.read_excel to read in the workbook with the data you want, in your case that is 'test1.xlsx'. You could then utilize openpyxl.load_workbook() to open an existing workbook with the header, in your case that is 'merge1.xlsx'. Finally you could save the new workbbok by a new name ('test3.xlsx') without changing the two existing workbooks.

Below I've provided a fully reproducible example of how you can do this. To make this example fully reproducible, I create 'merge1.xlsx' and 'test1.xlsx'.

Please note that if in your 'merge1.xlsx', if you only have the header that you want and nothing else in the file, you can make use of the two lines I've left commented out below. This would just append your data from 'test1.xlsx' to the header in 'merge1.xlsx'. If this is the case then you can get rid of the two for llops at the end. Otherwise as in my example it's a bit more complicated.

In creating 'test3.xlsx', we loop through each row and we determine how many columns there are using len(df3.columns). In my example this is equal to two but this code would also work for a greater number of columns.

import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

df1 = pd.DataFrame()
writer = pd.ExcelWriter('merge1.xlsx') #xlsxwriter engine
df1.to_excel(writer, sheet_name='Sheet1')
ws = writer.sheets['Sheet1']
ws.merge_range('A1:C1', 'This is a merged cell')
ws.write('A3', 'some string I might not want in other workbooks')
writer.save()

df2 = pd.DataFrame({'col_1': [1,2,3,4,5,6], 'col_2': ['A','B','C','D','E','F']})
writer = pd.ExcelWriter('test1.xlsx')
df2.to_excel(writer, sheet_name='Sheet1')
writer.save()

df3 = pd.read_excel('test1.xlsx')
wb = load_workbook('merge1.xlsx')
ws = wb['Sheet1']
#for row in dataframe_to_rows(df3):
#    ws.append(row)
column = 2
for item in list(df3.columns.values):
    ws.cell(2, column=column).value = str(item)
    column = column + 1
for row_index, row in df3.iterrows():
    ws.cell(row=row_index+3, column=1).value = row_index #comment out to remove index
    for i in range(0, len(df3.columns)):
        ws.cell(row=row_index+3, column=i+2).value = row[i]

wb.save("test3.xlsx")

Expected Output of the 3 Workbooks:

Expected Output

like image 108
patrickjlong1 Avatar answered Oct 28 '25 16:10

patrickjlong1



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!