I am using Python in an AWS Lambda function to list keys in an S3 bucket that begin with a specific ID:
import os

for obj in mybucket.objects.all():
    file_name = os.path.basename(obj.key)   # e.g. '012345_abc_happy.jpg'
    match_id = file_name.split('_', 1)[0]   # e.g. '012345'
The problem is that if the S3 bucket has several thousand objects, the iteration is very inefficient and the Lambda function sometimes times out.

Here is an example file name:
https://s3.console.aws.amazon.com/s3/object/bucket-name/012345_abc_happy.jpg
I want to iterate only over objects whose key begins with "012345". Any good suggestions on how I can accomplish that?
Here is how you can solve it.

S3 stores everything as objects; there are no real folders or filenames. Those are just conventions for user convenience.
aws s3 ls s3://bucket/folder1/folder2/filenamepart --recursive

will list all S3 object keys that match that prefix. The same server-side prefix filtering is available in boto3:
import boto3

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('bucketname')

# The Prefix filter is applied server-side, so only matching
# keys are returned instead of every object in the bucket.
for obj in my_bucket.objects.filter(Prefix='012345'):
    print(obj.key)
To speed up the listing further, you can run several of these prefix scans in parallel.
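As a sketch of that parallel idea: if the Lambda needs to look up several IDs at once, each ID's prefix scan can run on its own thread. The helper names, the thread count, and the per-thread session are my own assumptions, not part of the original answer:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def match_id(key):
    # '012345_abc_happy.jpg' -> '012345'
    return os.path.basename(key).split('_', 1)[0]

def list_keys(bucket_name, prefix):
    # Imported per call so each worker thread builds its own session;
    # boto3 sessions are not guaranteed to be thread-safe when shared.
    import boto3
    bucket = boto3.session.Session().resource("s3").Bucket(bucket_name)
    return [obj.key for obj in bucket.objects.filter(Prefix=prefix)]

def list_keys_by_id(bucket_name, ids, workers=8):
    # One server-side prefix scan per ID, run concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(ids, pool.map(lambda i: list_keys(bucket_name, i), ids)))
```

Note that each scan already filters server-side, so parallelism only buys anything when there are multiple prefixes to cover; a single prefix is one API call per 1,000 matching keys either way.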
Hope it helps.