What is the best tool to find the number of rows in each Cassandra partition? I have a big partition and I want to know how many records it contains.
nodetool tablehistograms <keyspace> <table> will give you the distribution of cell counts and partition sizes for the table, but it does not tell you anything about one specific partition. To get the count for a specific partition, you must run a SELECT count(*) query that specifies the partition key in the WHERE clause. On a very large partition that query can fail, though.
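If count(*) fails on a very large partition, one alternative (not the method from the answer above) is to page through the partition and count rows on the client. Below is a minimal sketch, assuming the DataStax Python driver (cassandra-driver), whose result sets fetch further pages automatically as you iterate. The function name and the keyspace, table, and column names are hypothetical.

```python
def count_partition_rows(session, keyspace, table, key_column, key_value):
    """Count the rows of one partition by iterating over paged results.

    `session` is assumed to be a cassandra-driver Session (or any object
    with a compatible execute(query, parameters) method).
    """
    # Identifiers cannot be bound as query parameters, so they are
    # interpolated here; only pass trusted keyspace/table/column names.
    query = f"SELECT {key_column} FROM {keyspace}.{table} WHERE {key_column} = %s"
    rows = session.execute(query, (key_value,))
    # Iterating the result pulls one page at a time instead of asking the
    # coordinator to aggregate the whole partition in a single request.
    return sum(1 for _ in rows)


# Hypothetical usage (requires a running cluster and `pip install cassandra-driver`):
# from cassandra.cluster import Cluster
# session = Cluster(["127.0.0.1"]).connect()
# print(count_partition_rows(session, "my_keyspace", "my_table", "id", 12345))
```

This still reads every row of the partition, so it can be slow; it only avoids doing the whole count in one request.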
Since 4.0, sstablemetadata is based on the describe command in sstable-tools. If you pass -s to scan the sstable, it reports the largest partitions by size, the widest partitions by number of rows, and the partitions with the most tombstones. It can be used against 3.0 and 3.11 sstables. I think 2.1 sstables cannot be processed, though.
...
Partitions: 22515
Rows: 13579337
Tombstones: 0
Cells: 13579337
Widest Partitions: 
  [12345] 999999
  [99049] 62664
  [99007] 60437
  [99017] 59728
  [99010] 59555
Largest Partitions: 
  [12345] 189888705
  [99049] 2965017
  [99007] 2860391
  [99017] 2826094
  [99010] 2818038
...
The example above has an int partition key. Keys of other types are printed the same way, for example a text key:
Widest Partitions: 
  [frodo] 1
Largest Partitions: 
  [frodo] 104
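To compare these lists across many sstables, the output can be parsed with a short script. This is a rough sketch that assumes the output layout matches the sample above ("Widest Partitions:" and "Largest Partitions:" headers followed by "[key] value" lines); the function name is hypothetical, and the real tool's output may differ between versions.

```python
import re


def parse_top_partitions(text):
    """Return {"Widest Partitions": [(key, value), ...], "Largest Partitions": [...]}."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in ("Widest Partitions:", "Largest Partitions:"):
            current = stripped[:-1]  # drop the trailing colon
            sections[current] = []
            continue
        # Entry lines look like "[12345] 999999" or "[frodo] 104".
        match = re.match(r"^\[(.+)\]\s+(\d+)$", stripped)
        if current and match:
            sections[current].append((match.group(1), int(match.group(2))))
        else:
            current = None  # any other line ends the current section
    return sections


sample = """Widest Partitions:
  [12345] 999999
  [99049] 62664
Largest Partitions:
  [12345] 189888705
  [99049] 2965017
"""

for name, entries in parse_top_partitions(sample).items():
    for key, value in entries:
        print(name, key, value)
```

Keys are kept as strings because the tool prints them as text regardless of the column type.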