Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas to_excel as variable (without destination file) [duplicate]

I recently had to take a dataframe and prepare it to output to an Excel file. However, I didn't want to save it to the local system, but rather pass the prepared data to a separate function that saves to the cloud based on a URI. After searching through a number of ExcelWriter examples, I couldn't find what I was looking for.

The goal is to take the dataframe, e.g.:

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6})

And temporarily store it as bytes in a variable, e.g.:

processed_data = <bytes representing the excel output>

The solution I came up with is provided in the answers and hopefully will help someone else. Would love to see others' solutions as well!

Update #2 - Example Use Case

In my case, I created an io module that allows you to use URIs to specify different cloud destinations. For example, "paths" starting with gs:// get sent to Google Storage (using gsutils-like syntax). I process the data as my first step, and then pass that processed data to a "save" function, which itself filters to determine the right path.

df.to_csv() actually works with no path and automatically returns a string (at least in recent versions), so this is my solution to allow to_excel() to do the same.

like image 263
ZaxR Avatar asked Dec 01 '25 06:12

ZaxR


1 Answers

Works like the common examples, but instead of specifying the file in ExcelWriter, it uses the standard library's BytesIO to store in a variable (processed_data):

from io import BytesIO

import pandas as pd


df = pd.DataFrame({
    "a": [1, 2, 3], 
    "b": [4, 5, 6]
})

output = BytesIO()
writer = pd.ExcelWriter(output)
df.to_excel(writer)  # plus any **kwargs
writer.save()
processed_data = output.getvalue()
like image 76
ZaxR Avatar answered Dec 03 '25 19:12

ZaxR



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!