
AWS SDK v2 for Java: S3 ListObjects does not return more than 1000 objects

I am using AWS SDK version 2.16.78, but ListObjectsRequest does not fetch more than 1000 objects.

I went through the documentation but could not find how to set the continuation token. I am using the code snippet below:

 try {
        ListObjectsRequest listObjects = ListObjectsRequest
                .builder()
                .bucket(bucketName)
                .build();

        ListObjectsResponse res = s3.listObjects(listObjects);
        List<S3Object> objects = res.contents();

        for (ListIterator iterVals = objects.listIterator(); iterVals.hasNext(); ) {
            S3Object myValue = (S3Object) iterVals.next();
            System.out.print("\n The name of the key is " + myValue.key());
         }

    } catch (S3Exception e) {
        System.err.println(e.awsErrorDetails().errorMessage());
        System.exit(1);
    }

The above code fetches only the first 1000 S3 objects.

Rushikesh Sabde asked Sep 08 '25 05:09



1 Answer

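As you indicated, a single list request returns at most 1000 of the objects in a bucket. The documentation says:

"Returns some or all (up to 1,000) of the objects in a bucket."

Amazon S3 lists keys in lexicographic (UTF-8 binary) order, which is not quite alphabetical: uppercase letters sort before lowercase ones, for example. You can take advantage of this ordering and pass a marker, that is, the last key you have already received, so that the next request starts listing after that key: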

import software.amazon.awssdk.services.s3.model.ListObjectsRequest;
import software.amazon.awssdk.services.s3.model.ListObjectsResponse;
import software.amazon.awssdk.services.s3.model.S3Exception;
import software.amazon.awssdk.services.s3.model.S3Object;

try {
  ListObjectsRequest listObjectsRequest = ListObjectsRequest.builder()
      .bucket(bucketName)
      .build();

  ListObjectsResponse listObjectsResponse = null;
  String lastKey = null;

  do {
    if (listObjectsResponse != null) {
      // Resume listing right after the last key of the previous page
      listObjectsRequest = listObjectsRequest.toBuilder()
          .marker(lastKey)
          .build();
    }

    listObjectsResponse = s3.listObjects(listObjectsRequest);

    // Iterate over results
    for (S3Object object : listObjectsResponse.contents()) {
      String key = object.key();
      System.out.print("\n The name of the key is " + key);
      // Update the value of the last key processed
      lastKey = key;
    }
    // isTruncated() returns a boxed Boolean, so compare null-safely
  } while (Boolean.TRUE.equals(listObjectsResponse.isTruncated()));
} catch (S3Exception e) {
  System.err.println(e.awsErrorDetails().errorMessage());
  System.exit(1);
}

Something very similar can be achieved with version 2 of the list objects API, using the startAfter method of ListObjectsV2Request.
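Both marker and startAfter rely on the same idea: the listing is ordered by key, and the service resumes strictly after the key you pass. The following is only a toy, in-memory sketch of that behaviour using the standard library; MarkerPagingDemo, Page, listPage and listAll are made-up names for illustration and are not part of the AWS SDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy stand-in for a bucket listing; nothing here calls S3 or the AWS SDK.
class MarkerPagingDemo {

    // One page of results: the keys plus a flag saying whether more keys remain.
    static final class Page {
        final List<String> keys;
        final boolean truncated;

        Page(List<String> keys, boolean truncated) {
            this.keys = keys;
            this.truncated = truncated;
        }
    }

    // Returns up to maxKeys keys that sort strictly after marker (null means "from the start").
    static Page listPage(TreeSet<String> bucket, String marker, int maxKeys) {
        Iterable<String> tail = (marker == null) ? bucket : bucket.tailSet(marker, false);
        List<String> keys = new ArrayList<>();
        boolean truncated = false;
        for (String key : tail) {
            if (keys.size() == maxKeys) {
                truncated = true; // at least one more key exists beyond this page
                break;
            }
            keys.add(key);
        }
        return new Page(keys, truncated);
    }

    // Same shape as the do/while loop in the answer: keep asking until a page is not truncated.
    static List<String> listAll(TreeSet<String> bucket, int maxKeys) {
        List<String> all = new ArrayList<>();
        String lastKey = null;
        Page page;
        do {
            page = listPage(bucket, lastKey, maxKeys);
            all.addAll(page.keys);
            if (!page.keys.isEmpty()) {
                lastKey = page.keys.get(page.keys.size() - 1);
            }
        } while (page.truncated);
        return all;
    }

    public static void main(String[] args) {
        // Java orders Strings by UTF-16 code unit, which matches UTF-8 byte order for ASCII keys.
        TreeSet<String> bucket =
                new TreeSet<>(List.of("b.txt", "Z.txt", "a.txt", "c.txt", "a/1.txt"));
        // prints [Z.txt, a.txt, a/1.txt, b.txt, c.txt]
        System.out.println(listAll(bucket, 2));
    }
}
```

Note that "Z.txt" comes before "a.txt" in the output: the order is by character code, not dictionary order, which is why the last key of a page is always a safe place to resume from.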

With ListObjectsV2 you can also use the continuation token returned in ListObjectsV2Response, something like:

import software.amazon.awssdk.services.s3.model.ListObjectsV2Request;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Response;
import software.amazon.awssdk.services.s3.model.S3Exception;
import software.amazon.awssdk.services.s3.model.S3Object;

try {
  ListObjectsV2Request listObjectsRequest = ListObjectsV2Request.builder()
      .bucket(bucketName)
      .build();

  ListObjectsV2Response listObjectsResponse = null;
  String nextContinuationToken = null;

  do {
    if (listObjectsResponse != null) {
      // Ask for the page that follows the previous response
      listObjectsRequest = listObjectsRequest.toBuilder()
          .continuationToken(nextContinuationToken)
          .build();
    }

    listObjectsResponse = s3.listObjectsV2(listObjectsRequest);
    nextContinuationToken = listObjectsResponse.nextContinuationToken();

    // Iterate over results
    for (S3Object object : listObjectsResponse.contents()) {
      String key = object.key();
      System.out.print("\n The name of the key is " + key);
    }
    // isTruncated() returns a boxed Boolean, so compare null-safely
  } while (Boolean.TRUE.equals(listObjectsResponse.isTruncated()));
} catch (S3Exception e) {
  System.err.println(e.awsErrorDetails().errorMessage());
  System.exit(1);
}

Finally, you can use the listObjectsV2Paginator method to iterate over the results, much as listNextBatchOfObjects was used in v1 of the SDK. See for instance this related v1 code and these two related SO questions.
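As far as I know, with the SDK itself this boils down to iterating over s3.listObjectsV2Paginator(request).contents(), which requests further pages as you go. To illustrate what a paginator does conceptually, here is a rough standard-library sketch that hides the continuation token loop behind an Iterator. It is not the SDK's implementation: TokenPaginatorDemo, Page, paginate and fakeBackend are hypothetical names, and the fake backend simply uses a list index as its opaque token.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Toy model of a paginator; it does not use or reproduce any AWS SDK type.
class TokenPaginatorDemo {

    // One page: its keys and the token for the next page (null when there are no more pages).
    static final class Page {
        final List<String> keys;
        final String nextToken;

        Page(List<String> keys, String nextToken) {
            this.keys = keys;
            this.nextToken = nextToken;
        }
    }

    // Wraps a "fetch the page for this token" function in an Iterator that fetches pages lazily.
    static Iterator<String> paginate(Function<String, Page> fetchPage) {
        return new Iterator<String>() {
            private Iterator<String> current = null;
            private String token = null;
            private boolean exhausted = false;

            @Override
            public boolean hasNext() {
                // Fetch the next page only when the current one has been fully consumed.
                while ((current == null || !current.hasNext()) && !exhausted) {
                    Page page = fetchPage.apply(token);
                    current = page.keys.iterator();
                    token = page.nextToken;
                    exhausted = (token == null);
                }
                return current != null && current.hasNext();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    // Fake backend: serves the list in pages of pageSize, using the start index as the token.
    static Function<String, Page> fakeBackend(List<String> all, int pageSize) {
        return token -> {
            int start = (token == null) ? 0 : Integer.parseInt(token);
            int end = Math.min(start + pageSize, all.size());
            String next = (end < all.size()) ? String.valueOf(end) : null;
            return new Page(all.subList(start, end), next);
        };
    }

    public static void main(String[] args) {
        Iterator<String> keys = paginate(fakeBackend(List.of("a", "b", "c", "d", "e"), 2));
        while (keys.hasNext()) {
            System.out.println(keys.next()); // prints a, b, c, d, e on separate lines
        }
    }
}
```

The caller never sees a token or an isTruncated flag; that bookkeeping lives in hasNext(), which is the convenience a paginator gives you over the manual do/while loop.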

All the mappings between operations in the v1 and v2 versions of the SDK are documented here.

jccampanero answered Sep 10 '25 03:09
