
How to write a pandas dataframe to_json() to s3 in json format

I have an AWS Lambda function that creates a pandas DataFrame, and I need to write it to an S3 bucket.

import datetime  # needed for the timestamped filename below
import io

import pandas as pd
import boto3

# code to get the df

destination = "output_" + datetime.datetime.now().strftime('%Y_%m_%d_%H_%M_%S') + '.json'

df.to_json(destination)  # this file should be written to the S3 bucket

asked Oct 17 '25 by mellifluous


1 Answer

The following code runs in AWS Lambda and uploads the JSON to S3.

The Lambda execution role needs write permission on the target bucket (e.g. `s3:PutObject`).
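A minimal sketch of such an IAM policy, assuming a placeholder bucket name (`my-bucket-name` is not from the original question):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-bucket-name/*"
    }
  ]
}
```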

import datetime  # needed for the timestamped filename below
import io

import pandas as pd
import boto3

# code to get the df

destination = "output_" + datetime.datetime.now().strftime('%Y_%m_%d_%H_%M_%S') + '.json'

# Write the JSON to an in-memory buffer instead of the local filesystem
json_buffer = io.StringIO()
df.to_json(json_buffer)

# Upload the buffer's contents to the bucket under the timestamped key
s3 = boto3.resource('s3')
my_bucket = s3.Bucket('my-bucket-name')

my_bucket.put_object(Key=destination, Body=json_buffer.getvalue())
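The `Body` passed to `put_object` is just the string `to_json` wrote into the buffer, so the `orient` parameter controls the JSON layout of the uploaded file. A small local sketch with no S3 involved (the sample frame is a stand-in for the real `df`, and `orient="records"` is an assumed choice):

```python
import io

import pandas as pd

# Sample frame standing in for the real df
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

# Serialize into an in-memory buffer, one JSON object per row
json_buffer = io.StringIO()
df.to_json(json_buffer, orient="records")

print(json_buffer.getvalue())  # [{"id":1,"name":"a"},{"id":2,"name":"b"}]
```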


answered Oct 20 '25 by mellifluous