
Kafka compression: How to do compression at single message level

When I send messages to a Kafka topic, I occasionally get a single message that is much larger than the others.

So I need to compress at the single-message level. According to https://cwiki.apache.org/confluence/display/KAFKA/Compression:

A set of messages can be compressed and represented as one compressed message.

Also, the description of the compression.type property in https://github.com/apache/kafka/blob/0.10.1/clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java says:

Compression is of full batches of data, so the efficacy of batching will also impact the compression ratio (more batching means better compression).

Should I set the batch size to one, or disable batching, so that compression happens per message?

Anil Kumar asked Sep 10 '25 18:09


1 Answer

Compression is orthogonal to whether you produce in batches or not. However, as the documentation states:

more batching means better compression
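
To see why batching helps, here is a minimal sketch that uses Python's standard gzip module rather than Kafka itself. The sample messages are made up, and Kafka's on-wire batch format differs, so treat the numbers as illustrative only. It compares compressing 100 similar messages one by one with compressing them together:

```python
import gzip
import json

# Hypothetical sample: 100 small JSON messages with a similar structure.
messages = [
    json.dumps({"user_id": i, "event": "page_view", "path": "/home"}).encode("utf-8")
    for i in range(100)
]

raw_total = sum(len(m) for m in messages)

# Compress each message on its own (roughly what a batch of one amounts to).
per_message_total = sum(len(gzip.compress(m)) for m in messages)

# Compress all messages together as a single batch.
batch_total = len(gzip.compress(b"\n".join(messages)))

print(f"raw bytes:             {raw_total}")
print(f"compressed one by one: {per_message_total}")
print(f"compressed as a batch: {batch_total}")
```

The batch comes out smaller because the compressor can reuse patterns that repeat across messages, and because the fixed gzip header and trailer are paid once instead of once per message.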

Compression can be set at the topic level (https://kafka.apache.org/documentation/#topicconfigs) or as part of the producer config (https://kafka.apache.org/documentation/#producerconfigs). Moreover, different messages in the same topic can be compressed with different compression types, because the compression type is stored in the record batch metadata (https://kafka.apache.org/documentation/#recordbatch). This is transparent to the consumer.

However, if you need to compress only some messages, you cannot do it with a single producer, because the producer configuration is static. Whatever the motivation for that choice, you could create two producer instances (one with compression enabled and one without) and decide, based on the message content, which producer to use for each send.
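
The two-producer idea can be sketched roughly as follows. This is not a real Kafka client: `RecordingProducer`, `pick_producer`, the topic name, and the 10,000-byte threshold are hypothetical stand-ins chosen for illustration. Only `compression.type` and its `none` and `gzip` values are actual Kafka producer settings. In real code you would build two client producers from the two configs and route the same way.

```python
from dataclasses import dataclass, field

# Assumed cutoff; tune it for your own payloads.
SIZE_THRESHOLD_BYTES = 10_000

# Settings for the two producers. "compression.type" is the Kafka producer
# property; "none" and "gzip" are among the values Kafka accepts.
PLAIN_CONFIG = {"compression.type": "none"}
COMPRESSED_CONFIG = {"compression.type": "gzip"}


@dataclass
class RecordingProducer:
    """Stand-in for a real Kafka producer; it only records what was sent."""
    config: dict
    sent: list = field(default_factory=list)

    def send(self, topic: str, value: bytes) -> None:
        self.sent.append((topic, value))


def pick_producer(value, plain, compressed, threshold=SIZE_THRESHOLD_BYTES):
    """Return the compressing producer for large payloads, else the plain one."""
    return compressed if len(value) > threshold else plain


plain = RecordingProducer(PLAIN_CONFIG)
compressed = RecordingProducer(COMPRESSED_CONFIG)

# One small and one large payload, routed by size.
for payload in (b"small", b"x" * 50_000):
    pick_producer(payload, plain, compressed).send("events", payload)

print(len(plain.sent), len(compressed.sent))
```

Routing by size is just one possible rule; any property of the message content could drive the choice between the two producers.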

Lior Chaga answered Sep 13 '25 14:09