
Key given by lambda S3 event cannot be used when containing non-ASCII characters

I have a Python Lambda function that shrinks images as they are uploaded to S3. When the uploaded filename contains non-ASCII characters (Hebrew in my case), I cannot get the object: the request fails with Forbidden, as if the file doesn't exist.

Here's (some of) my code:

import boto3

s3_client = boto3.client('s3')

def handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Key taken as-is from the S3 event notification
        key = record['s3']['object']['key']
        s3_client.download_file(bucket, key, "/tmp/somefile")

This raises ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden. I also see in the log that the key contains sequences like %D7%92.

After searching the web, I also tried unquoting the key, as some sources suggest (http://blog.rackspace.com/the-devnull-s3-bucket-hacking-with-aws-lambda-and-python/), with no luck:

key = urllib.unquote_plus(record['s3']['object']['key'])

Same error, although this time the log states that I'm trying to retrieve a key with characters like this: פ×קס×.

Note that this script is verified to work on English keys, and the tests were done on keys with no spaces.

Oded Niv asked Nov 24 '25 10:11


2 Answers

# This worked for me (Python 3)
import urllib.parse

encoded_str = 'My+name+is+Tarak'
print(urllib.parse.unquote_plus(encoded_str))  # prints: My name is Tarak
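
Applied to the handler in the question, the same call might look like the sketch below. It assumes Python 3, where unquote_plus decodes percent-escapes as UTF-8 by default. The helper name decode_s3_key and the sample record are made up for illustration and are not part of boto3.

```python
import urllib.parse


def decode_s3_key(record):
    """Return the object key from one S3 event record, URL-decoded.

    decode_s3_key is a hypothetical helper, not part of boto3.
    """
    raw_key = record['s3']['object']['key']
    # unquote_plus turns '+' into a space and %XX escapes into UTF-8 text.
    return urllib.parse.unquote_plus(raw_key)


# Made-up record shaped like the ones the question's handler iterates over.
sample_record = {'s3': {'bucket': {'name': 'example-bucket'},
                        'object': {'key': '%D7%92+1.jpg'}}}
decoded_key = decode_s3_key(sample_record)
```

The decoded value is what would then be passed to s3_client.download_file in place of the raw key.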
Tarak answered Nov 25 '25 22:11


I had a similar problem. I solved it by adding an encode before the unquote:

key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key'].encode("utf8"))
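
A likely explanation for why the encode helps: in Python 2 the event key is a unicode string, and unquoting a unicode string turns each %XX escape into its own Latin-1 character instead of a UTF-8 byte, which would give garbled keys like the one in the question's log. The sketch below reproduces that effect in Python 3 by forcing a Latin-1 decode. It is only an illustration under that assumption, and the variable names are made up.

```python
import urllib.parse

raw_key = '%D7%92'  # percent-encoded UTF-8 bytes of one Hebrew letter

# Treating each escaped byte as its own character gives mojibake.
garbled = urllib.parse.unquote_plus(raw_key, encoding='latin-1')

# Decoding the same bytes as UTF-8 gives the real one-character key.
correct = urllib.parse.unquote_plus(raw_key, encoding='utf-8')

# The garbled text still carries the right bytes, so it can be repaired.
repaired = garbled.encode('latin-1').decode('utf-8')
```

In Python 3 the default encoding of unquote_plus is already UTF-8, so no extra encode step should be needed there.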
Luis Govea answered Nov 25 '25 23:11