How to find the number of objects in an S3 bucket? - Big Data In Real World

How to find the number of objects in an S3 bucket?

The AWS CLI has no dedicated command that returns the number of objects in an S3 bucket, but there is a simple workaround.

Solution

Running aws s3 ls with the --recursive option lists all the objects in the bucket, one per line.

[osboxes@wk1 ~]$ aws s3 ls s3://hirw-kickstarter --recursive
2020-11-18 16:46:03          0 Output2/_SUCCESS2
2020-11-18 16:43:34        111 Output2/_committed_1768500173264775255
2020-11-18 16:43:34          0 Output2/_started_1768500173264775255
2020-11-18 16:43:34     170043 Output2/part-00000-tid-1768500173264775255-99c4578a-e958-4e22-869b-46bfaf1acbfa-611-c000.csv
2018-04-07 16:20:35   46500324 ks-projects-201612.csv

To get the count, pipe the output of the recursive listing to wc -l, which counts the lines.

[osboxes@wk1 ~]$ aws s3 ls s3://hirw-kickstarter --recursive | wc -l
5
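
As an alternative sketch, aws s3 ls also accepts a --summarize option, which prints a "Total Objects" line and a "Total Size" line after the listing. The helper below only extracts the number from that summary line. The name total_objects is hypothetical and introduced here for illustration, and the aws command itself is left as a comment because it needs credentials and access to the bucket.

```shell
# Hypothetical helper: read "aws s3 ls ... --summarize" output on stdin and
# print the number from the "Total Objects: N" summary line.
# Listing lines start with a date, so anchoring the match at the start of the
# line avoids matching an object key that happens to contain the same words.
total_objects() {
  awk -F': *' '/^ *Total Objects:/ { print $2 }'
}

# Intended use (requires AWS credentials and access to the bucket):
#   aws s3 ls s3://hirw-kickstarter --recursive --summarize | total_objects
```

This avoids depending on the line count, at the cost of still listing every object before the summary is printed.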

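Note that the folder “Output2” is not counted as an object, because here it exists only as a prefix shared by the keys beneath it. When a folder is created through the Amazon S3 console, S3 stores a zero-byte object whose key ends in “/”. Such a placeholder does appear in the recursive listing and would be counted. Folders are used only for organizational purposes.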
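
If a bucket does contain folder placeholders created through the console (zero-byte objects whose keys end in "/"), they show up in the recursive listing as ordinary lines. A minimal sketch for leaving them out of the count is shown below. It assumes the usual four-column listing format (date, time, size, key), and count_objects is a hypothetical name used only for this example.

```shell
# Hypothetical helper: count listing lines on stdin, skipping blank lines
# and folder placeholder entries (keys that end in "/").
count_objects() {
  awk 'NF >= 4 && $0 !~ /\/$/ { n++ } END { print n + 0 }'
}

# Intended use (requires AWS credentials and access to the bucket):
#   aws s3 ls s3://hirw-kickstarter --recursive | count_objects
```

For the bucket shown above, which has no placeholder objects, this should give the same result as wc -l.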
