I have a big gzip file and I would like to read only parts of it using seek.
About the use of seek on gzip files, this page says:
The seek() position is relative to the uncompressed data, so the caller does not even need to know that the data file is compressed.
Does this imply that seek has to read and decompress the data from the beginning of the file to the target position?
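Yes. This is the read-mode branch of seek() in the gzip module: a backward seek rewinds to the start of the file, and every seek then reads and discards decompressed data in 1024-byte chunks until it reaches the target offset: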
    elif self.mode == READ:
        if offset < self.offset:
            # for negative seek, rewind and do positive seek
            self.rewind()
        count = offset - self.offset
        for i in range(count // 1024):
            self.read(1024)
        self.read(count % 1024)
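To see what this means in practice, here is a minimal sketch that seeks in uncompressed coordinates on a small in-memory gzip file. The sample data, the 8-byte record size, and the helper name `read_record` are made up for illustration. The call to `seek()` looks like random access, but everything before the target offset is still decompressed and thrown away.

```python
import gzip
import io

# Hypothetical sample data: 10000 fixed-width records of 8 bytes each.
payload = b"".join(b"%07d\n" % i for i in range(10000))

# Compress it into an in-memory buffer standing in for a file on disk.
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as f:
    f.write(payload)


def read_record(raw, index, size=8):
    """Return record `index`, addressing it by its uncompressed offset."""
    raw.seek(0)
    with gzip.GzipFile(fileobj=raw, mode="rb") as f:
        # The offset is in uncompressed bytes; the data before it is
        # decompressed and discarded to get there.
        f.seek(index * size)
        return f.read(size)


record = read_record(buf, 1234)
print(record)  # b'0001234\n'
```

The cost of each call grows with the target offset, so reading a part near the end of a big file is about as expensive as decompressing the whole file.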
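Alternatives are discussed here. The problem is inherent to the gzip format: the compressed stream carries no index, and any part of it can only be decoded after the data that precedes it, so a reader cannot jump straight to an arbitrary uncompressed offset.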
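One possible workaround, if you control how the file is written, is to compress the data as a series of independent gzip members and keep an index of where each member starts. This is only a sketch of that idea and not necessarily what the linked discussion recommends; `CHUNK`, `write_indexed`, and `read_at` are hypothetical names, and the chunk size is an arbitrary choice.

```python
import gzip
import io

CHUNK = 64 * 1024  # uncompressed bytes per gzip member (arbitrary choice)


def write_indexed(data, out):
    """Write `data` as independent gzip members; return their start offsets."""
    index = []
    for start in range(0, len(data), CHUNK):
        index.append(out.tell())  # compressed offset where this member begins
        out.write(gzip.compress(data[start:start + CHUNK]))
    return index


def read_at(raw, index, offset, size):
    """Read `size` bytes at uncompressed `offset`, decoding only needed members."""
    result = b""
    member = offset // CHUNK
    skip = offset % CHUNK
    while size > 0 and member < len(index):
        raw.seek(index[member])
        if member + 1 < len(index):
            blob = raw.read(index[member + 1] - index[member])
        else:
            blob = raw.read()  # last member runs to the end of the file
        piece = gzip.decompress(blob)[skip:skip + size]
        result += piece
        size -= len(piece)
        skip = 0
        member += 1
    return result


# Usage with made-up data: read 50000 bytes starting at offset 100000.
data = bytes(range(256)) * 1000
out = io.BytesIO()
index = write_indexed(data, out)
part = read_at(out, index, 100000, 50000)
```

The result is still a valid gzip file, because concatenated members decompress to the concatenated data. The trade-off is a somewhat worse compression ratio, since each member is compressed on its own, and the index has to be stored somewhere alongside the file.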