I use Kafka 0.10. I have a topic named logs where my IoT devices post their logs. The key of each message is the device-id, so all the logs of the same device are in the same partition.
I have an API endpoint, /devices/{id}/tail-logs, that needs to display the last N logs of one device at the moment the call is made.
Currently I have it implemented in a very inefficient (but working) way: I start from the beginning (i.e., the oldest logs) of the partition containing the device's logs and read until I reach the current timestamp.
A more efficient way would be to get the current latest offset and then consume the messages backward (I would need to filter out some messages to keep only those of the device I'm looking for).
Is it possible to do this with Kafka? If not, how can this problem be solved? (A heavier solution I can see would be a Kafka Connect sink feeding Elasticsearch and then querying Elasticsearch, but adding two more components for this seems a bit overkill.)
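For illustration, the backward read described above can be approximated in chunks: a consumer can seek to an arbitrary offset but then only reads forward, so one would seek to (latest offset minus a chunk size), read forward to the end, filter by device, and repeat with earlier chunks until N matching records are found. The sketch below is only a model of that idea, not Kafka client code: the partition is a plain Python list, and tail_logs, Record, and the chunk parameter are hypothetical names.

```python
from typing import List, Tuple

# Hypothetical record shape: (device_id, log_line).
Record = Tuple[str, str]


def tail_logs(partition: List[Record], device_id: str, n: int,
              chunk: int = 100) -> List[Record]:
    """Return the last n records of device_id, oldest first.

    `partition` stands in for one Kafka partition; list indexes play
    the role of offsets.
    """
    if n <= 0:
        return []
    end = len(partition)        # stands in for the latest offset
    found: List[Record] = []    # collected newest first
    while end > 0 and len(found) < n:
        start = max(0, end - chunk)      # "seek" to an earlier offset
        window = partition[start:end]    # read forward from start to end
        # Walk the chunk newest-first and keep only this device's records.
        for record in reversed(window):
            if record[0] == device_id:
                found.append(record)
                if len(found) == n:
                    break
        end = start                      # next chunk ends where this one began
    found.reverse()                      # return oldest first
    return found
```

The cost depends on how far back the device's last N records are spread among other devices' records in the same partition, so a quiet device on a busy partition can still force a long scan.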
As you are on 0.10.2, I would recommend writing a Kafka Streams application. The application will be stateful, and the state will hold the last N records/logs per device-id: if new data is written to the input topic, the Kafka Streams application will just update its state (without the need to re-read the whole topic).
Furthermore, the application also serves your requests (the /devices/{id}/tail-logs API) using the Interactive Queries feature.
Thus, I would not build a stateless application that has to recompute the answer for each request, but a stateful application that eagerly computes the result (and keeps it up to date automatically) for all possible requests (i.e., for all device-ids) and just returns the already computed result when a request comes in.
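To make the eager, stateful idea concrete, here is a minimal sketch of the state logic only. Kafka Streams is a JVM library, so this Python class is just a model, not Kafka Streams code: TailLogStore, on_record, and tail are hypothetical names. In a real application the per-device buffer would live in a Kafka Streams state store, on_record would correspond to processing one input record, and tail would be answered through Interactive Queries.

```python
from collections import deque
from typing import Deque, Dict, List


class TailLogStore:
    """Keeps the last n log lines per device, updated on every new record."""

    def __init__(self, n: int) -> None:
        self._n = n
        self._state: Dict[str, Deque[str]] = {}

    def on_record(self, device_id: str, log_line: str) -> None:
        """Called once per incoming record; updates only that device's buffer."""
        buf = self._state.get(device_id)
        if buf is None:
            # A bounded deque drops the oldest entry once it holds n items.
            buf = deque(maxlen=self._n)
            self._state[device_id] = buf
        buf.append(log_line)

    def tail(self, device_id: str) -> List[str]:
        """Serves a request from the precomputed state, oldest line first."""
        return list(self._state.get(device_id, ()))
```

A request for a device then costs a single lookup instead of a scan over the partition:

```python
store = TailLogStore(n=2)
store.on_record("d1", "boot")
store.on_record("d1", "connected")
store.on_record("d1", "sensor ok")
print(store.tail("d1"))  # the two most recent lines for d1
```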