 

What's the fastest and most memory-efficient BZip2 decompression tool to use in Java?

I'm currently using the Apache Commons Compress package to decompress BZip2 files. It uses about 60% of the overall heap and takes around 6 minutes to decompress about 500 files of 4-5 MB each.

My main problem is that I can't find anything to compare this performance against. I found AT4J, but implementing it as per the documentation leads to an ArrayIndexOutOfBoundsException while reading one of the files into the buffer. For the few files it did manage to process, the performance was much the same, and the fact that AT4J includes the compressor classes from Commons Compress as 'an extra option' implies this is expected.

Does anyone know of any other Java libraries for decompressing BZip2 files and, if so, how they compare to Apache Commons Compress?

Thanks in advance.

ca55idy asked Jan 01 '26 21:01


1 Answer

This benchmark of different compression techniques suggests a rate of about 6 MB/s when decompressing BZip2:

https://tukaani.org/lzma/benchmarks.html

Your 500 files at roughly 4.5 MB each come to about 2.2 GB of data. At 6 MB/s that is around 375 seconds, so it should take about 6 minutes even with a native library.

If you want to speed this up, I suggest using multiple threads, or switching to gzip, which is much faster to decompress.
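As a rough sketch of the multi-threaded approach, the class below decompresses one file per task on a fixed-size thread pool. The names `ParallelDecompress`, `Decoder`, `decompressOne` and `decompressAll` are made up for illustration, and the code assumes Java 9+ (for `InputStream.transferTo`) and that each file can be decompressed independently. The decoder is passed in, so nothing here depends on a particular compression library.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelDecompress {

    /** Hypothetical helper type: wraps a raw stream in a decompressing stream. */
    @FunctionalInterface
    public interface Decoder {
        InputStream wrap(InputStream raw) throws IOException;
    }

    /** Decompresses a single file into outDir, streaming instead of loading it whole. */
    static Path decompressOne(Path input, Path outDir, Decoder decoder) throws IOException {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // Strip the last extension (e.g. "data.bz2" -> "data"); otherwise append ".out".
        Path output = outDir.resolve(dot > 0 ? name.substring(0, dot) : name + ".out");
        try (InputStream raw = new BufferedInputStream(Files.newInputStream(input));
             InputStream in = decoder.wrap(raw);
             OutputStream out = Files.newOutputStream(output)) {
            in.transferTo(out); // copies through a small internal buffer
        }
        return output;
    }

    /** Decompresses all inputs using a fixed pool of worker threads. */
    static List<Path> decompressAll(List<Path> inputs, Path outDir, int threads, Decoder decoder)
            throws IOException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Path>> futures = new ArrayList<>();
            for (Path input : inputs) {
                futures.add(pool.submit(() -> decompressOne(input, outDir, decoder)));
            }
            List<Path> outputs = new ArrayList<>();
            for (Future<Path> future : futures) {
                try {
                    outputs.add(future.get()); // results come back in input order
                } catch (ExecutionException e) {
                    throw new IOException("Decompression failed", e.getCause());
                }
            }
            return outputs;
        } finally {
            pool.shutdown();
        }
    }
}
```

With Commons Compress on the classpath, the decoder would be `BZip2CompressorInputStream::new`; for gzip files it would be `java.util.zip.GZIPInputStream::new`. A thread count near the number of CPU cores is a reasonable starting point. Because each task streams its file, heap use should scale with the number of threads, not the number of files.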

Peter Lawrey answered Jan 03 '26 10:01


