S3 bucket size differs while using console and CLI

I am trying to get the size of my S3 bucket. I have tried both the console method and the CLI method, but the CLI reports 19.6 GB while the console shows 21.3 GB.

For the console method, I go to the S3 bucket => Management tab => Metrics and read the value there.

For the CLI method, I ran the following command: aws s3 ls s3://bucket-name --recursive --human-readable --summarize

Which output should I rely on?

Asked Sep 01 '25 by Deependra Dangal


2 Answers

This is interesting!

I have a bucket with three objects. When I use aws s3 ls, it shows the sizes as follows (the listings have been shortened to show only the relevant parts):

$ aws s3 ls s3://my-bucket

foo1 2B
foo2 956.9 KiB
foo3 7.7 KiB


$ aws s3 ls s3://my-bucket --summarize
foo1          2
foo2     979820
foo3       7864

Total Size: 987686


$ aws s3 ls s3://my-bucket --summarize --human-readable

foo1    2 Bytes
foo2  956.9 KiB
foo3    7.7 KiB

Total Objects: 3
   Total Size: 964.5 KiB

Note that 964.5 * 1024 = 987,648, which is effectively the same as 987,686; the small difference comes from rounding the displayed value to one decimal place of KiB.

The listing in the Amazon S3 management console matches the first listing.

In Amazon CloudWatch, the metric is reported as 988k, which appears to be a rounding of the decimal byte count (987,686). That is, it uses 1k = 1000 instead of 1k = 1024. This is fine for a generalized metric (e.g. number of requests), but it does not match the convention for memory and disk sizes, which normally use 1024. (For some history, see: Wikipedia - Kibibyte.)
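The two conventions can be checked with a quick bit of shell arithmetic, using the byte total from the --summarize listing above:

```shell
# Total size in bytes from the --summarize listing
BYTES=987686

# Binary convention (1 KiB = 1024 bytes), as used by disk/memory tools
awk -v b="$BYTES" 'BEGIN { printf "%.1f KiB\n", b / 1024 }'   # 964.5 KiB

# Decimal convention (1 k = 1000 bytes), as CloudWatch appears to use
awk -v b="$BYTES" 'BEGIN { printf "%.0f k\n", b / 1000 }'     # 988 k
```

The same byte count lands on 964.5 or 988 depending solely on the divisor, which accounts for the apparent mismatch between the listing and the CloudWatch figure.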

Another thing to consider is that some metrics may include multiple object versions (normally only the latest version of an object appears in a listing) and incomplete multipart uploads, which are not listed as objects but still occupy space. Either can cause a discrepancy between what a listing shows and what the metric reports. You can create lifecycle rules to clean up incomplete multipart uploads.
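As a sketch of such a lifecycle rule (the bucket name and the 7-day window are placeholders; adjust to your needs), incomplete multipart uploads can be aborted automatically with the AWS CLI:

```shell
# Hypothetical bucket name; DaysAfterInitiation controls how long
# an unfinished multipart upload may linger before it is aborted.
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-bucket \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "abort-incomplete-multipart-uploads",
        "Status": "Enabled",
        "Filter": {"Prefix": ""},
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}
      }
    ]
  }'
```

Once the rule takes effect, the space held by abandoned uploads is reclaimed and the metric should move closer to what the listing shows.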

Answered Sep 03 '25 by John Rotenstein


You can get the same output as your CLI approach from the AWS console: select all objects in the S3 bucket and, under Actions, select Get Total Size.

Although there are other good solutions such as s3cmd du, any approach that calculates the size by listing objects, whether via the CLI or the AWS console, will be slow for large buckets.
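For reference, the s3cmd variant mentioned above looks like this (assuming s3cmd is installed and configured, and using a placeholder bucket name):

```shell
# Summarize the bucket's total size; -H prints human-readable units
s3cmd du -H s3://bucket-name
```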

I would recommend relying on the calculating approach, as it gives a more precise value. However, if you want to save calculation time, prefer the AWS console output via the CloudWatch metric under the Management tab, since you do not have to compute the bucket size yourself. Just be aware that incomplete multipart uploads and versioned objects may be included in the metric value, so it is only dependable if the bucket has neither.
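If you want that CloudWatch figure without opening the console, the same BucketSizeBytes metric can be read from the CLI. A sketch, where the bucket name and the date range are placeholders and StorageType must match the storage class actually in use:

```shell
# Read the daily BucketSizeBytes metric that backs the console chart.
# S3 publishes this metric roughly once per day, hence --period 86400.
aws cloudwatch get-metric-statistics \
  --namespace AWS/S3 \
  --metric-name BucketSizeBytes \
  --dimensions Name=BucketName,Value=bucket-name \
               Name=StorageType,Value=StandardStorage \
  --start-time 2025-09-01T00:00:00Z \
  --end-time 2025-09-02T00:00:00Z \
  --period 86400 \
  --statistics Average
```

Note that this returns the metric value, so it carries the same caveats as the console chart: versions and incomplete multipart uploads are counted, and the figure can lag by up to a day.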

Answered Sep 03 '25 by Abdullah Khawer