
zlib difference in size for level=0 between Python 3.9 and 3.10

The following code uses zlib to encode some data, but with level=0, so the data is not actually compressed:

import zlib

print('zlib.ZLIB_VERSION', zlib.ZLIB_VERSION)

total = 0
print('Total 1', total)
compress_obj = zlib.compressobj(level=0, memLevel=9, wbits=-zlib.MAX_WBITS)
total += len(compress_obj.compress(b'-' * 1000000))
print('Total 2', total)
total += len(compress_obj.flush())
print('Total 3', total)

Python 3.9.12 outputs

zlib.ZLIB_VERSION 1.2.12
Total 1 0
Total 2 983068
Total 3 1000080

but Python 3.10.6 (and Python 3.11.0) outputs

zlib.ZLIB_VERSION 1.2.13
Total 1 0
Total 2 1000080
Total 3 1000085

so both the final size and the size along the way differ.

Why? And how can I get them to be identical? (I'm writing a library where I would prefer identical behaviour between Python versions.)

Michal Charemza asked Mar 25 '26 15:03


1 Answer

zlib 1.2.12 and 1.2.13 behave identically in this regard. The Python library must be making different deflate() calls with different amounts of data, and possibly introducing a flush in the later version. You can look in the Python source code to find out.

You should be able to force identical output if you feed smaller amounts of data to .compress() each time, e.g. fewer than 64K-1 bytes, and use .flush() after each call. The output will be larger, but should be identical across versions.
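A minimal sketch of that chunk-and-flush approach, reusing the compressobj parameters from the question. The function name deflate_in_small_chunks and the 65000-byte chunk size are illustrative assumptions, not part of the original answer. It also assumes the intermediate flushes should use zlib.Z_SYNC_FLUSH, because calling .flush() with no argument in Python finishes the stream and no further data can be compressed after that. Whether this yields byte-identical output on every Python version has not been verified here.

```python
import zlib


def deflate_in_small_chunks(data, chunk_size=65000):
    """Raw-deflate `data` at level 0, flushing after every small chunk.

    `chunk_size` is an illustrative value chosen to stay below 64K-1 bytes.
    Returns the full output and the number of bytes emitted at each step.
    """
    compress_obj = zlib.compressobj(level=0, memLevel=9, wbits=-zlib.MAX_WBITS)
    pieces = []
    sizes = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # Assumption: Z_SYNC_FLUSH forces out all pending output for this
        # chunk while leaving the stream open for the next one.
        piece = compress_obj.compress(chunk) + compress_obj.flush(zlib.Z_SYNC_FLUSH)
        pieces.append(piece)
        sizes.append(len(piece))
    # The final flush (default mode Z_FINISH) terminates the deflate stream.
    tail = compress_obj.flush()
    pieces.append(tail)
    sizes.append(len(tail))
    return b''.join(pieces), sizes


if __name__ == '__main__':
    output, sizes = deflate_in_small_chunks(b'-' * 1000000)
    print('Total', len(output))
    print('Per-step sizes', sizes)
```

Running this on each Python version and comparing both the total and the per-step sizes would show whether the outputs now match. The extra flush markers are the reason the output is larger than in the single-call version.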

A quick look turned up this commit, which is likely the culprit.

Mark Adler answered Mar 28 '26 04:03


