I'm looking for something like this:
Save Dataframe to csv directly to s3 Python
The API shows these arguments: https://pola-rs.github.io/polars/py-polars/html/reference/api/polars.DataFrame.write_parquet.html
but I'm not sure how to convert the df into a stream...
Untested, since I don't have an AWS account
You could use s3fs.S3File like this:
import polars as pl
import s3fs
fs = s3fs.S3FileSystem(anon=False)  # picks up default credentials (anon=True would skip them)
df = pl.DataFrame(
    {
        "foo": [1, 2, 3, 4, 5],
        "bar": [6, 7, 8, 9, 10],
        "ham": ["a", "b", "c", "d", "e"],
    }
)
with fs.open('my-bucket/dataframe-dump.parquet', mode='wb') as f:
    df.write_parquet(f)
Basically, s3fs gives you an fsspec-conformant file object, which polars knows how to use because write_parquet accepts any regular file or stream.
If you want to manage your S3 connection more granularly, you can construct an S3File object from the botocore connection (see the docs linked above).