
How to send/copy/upload file from AWS S3 to Google GCS using Python

I'm looking for a Pythonic way to copy a file from AWS S3 to GCS.

I do not want to open/read the file and then use the blob.upload_from_string() method. I want to transfer it as-is.

I cannot use gsutil. The libraries I'm working with are gcloud and boto3 (I also experimented with s3fs).

Here is a simple example (that seems to work) using the blob.upload_from_string() method, which I'm trying to avoid because I don't want to open/read the file. I fail to make it work with the blob.upload_from_file() method because the GCS API requires an accessible, readable, file-like object, which I fail to provide properly (a sketch of that route follows the snippet below).

What am I missing? Suggestions?

import boto3
from gcloud import storage
from oauth2client.service_account import ServiceAccountCredentials

GSC_Token_File = 'path/to/GSC_token'

# Running from AWS Lambda, so the S3 client picks up credentials from the execution role
s3 = boto3.client('s3', region_name='MyRegion')

# The token variable holds a file path, so from_json_keyfile_name() (not from_json_keyfile_dict()) is needed
gcs_credentials = ServiceAccountCredentials.from_json_keyfile_name(GSC_Token_File)
gcs_storage_client = storage.Client(credentials=gcs_credentials, project='MyGCP_project')
gcs_bucket = gcs_storage_client.get_bucket('MyGCS_bucket')

# This reads the whole object into memory and decodes it -- exactly what I want to avoid
s3_file_to_load = s3.get_object(Bucket='MyS3_bucket', Key='path/to/file_to_copy.txt')['Body'].read().decode('utf-8')
blob = gcs_bucket.blob('file_to_copy.txt')

blob.upload_from_string(s3_file_to_load)
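
For reference, the closest I could get with blob.upload_from_file() is a minimal sketch like the following (reusing the s3 client and gcs_bucket from above): wrapping the S3 StreamingBody in io.BytesIO gives the GCS client the seekable, file-like object it wants, but it still reads the whole object into memory, so it is no real improvement over upload_from_string().

import io

# Sketch only: buffer the S3 StreamingBody into a seekable in-memory file
# so that blob.upload_from_file() accepts it. The whole object still ends
# up in memory, which is what I am trying to avoid.
s3_object = s3.get_object(Bucket='MyS3_bucket', Key='path/to/file_to_copy.txt')
buffered_body = io.BytesIO(s3_object['Body'].read())

blob = gcs_bucket.blob('file_to_copy.txt')
blob.upload_from_file(buffered_body, rewind=True)  # rewind=True seeks back to the start before uploading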


1 Answer

So I poked around a bit more and came across this article, which eventually led me to this solution. Apparently the GCS API can be called using the AWS boto3 SDK, via GCS's S3-compatible XML API.

Please mind the HMAC key prerequisite; a key can easily be created by following these instructions.
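
If you prefer to stay in Python for that step too, here is a minimal sketch of creating an HMAC key programmatically. It assumes the newer google-cloud-storage package (the old gcloud package used in the question has no HMAC helpers), and the service account email is a placeholder:

from google.cloud import storage

client = storage.Client(project='MyGCP_project')

# 'my-copier@MyGCP_project.iam.gserviceaccount.com' is a placeholder service account
metadata, secret = client.create_hmac_key(
    service_account_email='my-copier@MyGCP_project.iam.gserviceaccount.com'
)

print('Access key:', metadata.access_id)  # goes into aws_access_key_id below
print('Secret:', secret)                  # goes into aws_secret_access_key below; it is shown only once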

import boto3

# I'm using a GCP service account, so my HMAC key was created for that account.
# An HMAC key for a user account can be created just as well.

service_Access_key = 'YourAccessKey'
service_Secret = 'YourSecretKey'

# Reminder: I am copying from S3 to GCS
s3_client = boto3.client('s3', region_name='MyRegion')
gcs_client = boto3.client(
    's3',  # yes, literally 's3' -- boto3 talks to GCS through its S3-compatible XML API
    region_name='auto',
    endpoint_url='https://storage.googleapis.com',
    aws_access_key_id=service_Access_key,
    aws_secret_access_key=service_Secret,
)

# Stream the S3 object body straight into GCS without saving it to disk
file_to_transfer = s3_client.get_object(Bucket='MyS3_bucket', Key='path/to/file_to_copy.txt')
gcs_client.upload_fileobj(file_to_transfer['Body'], 'MyGCS_bucket', 'file_to_copy.txt')
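
As an optional sanity check (not part of the original flow), the same interoperability client can confirm the object landed in GCS; head_object is plain boto3 and is sent to the storage.googleapis.com endpoint configured above:

# Optional: verify the copy by asking GCS (via the XML API) for the object's metadata
response = gcs_client.head_object(Bucket='MyGCS_bucket', Key='file_to_copy.txt')
print('Copied object size (bytes):', response['ContentLength'])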



