
Kafka Streams rebalancing: State transition from REBALANCING to ERROR

I have four topics, each with a single partition, and three instances of the application. I tried to achieve scalability by writing a custom PartitionGrouper that creates three tasks, as below:

1st instance-topic1,partition0,topic4,partition0

2nd instance-topic2,partition0

3rd instance-topic3,partition0

I set NUM_STANDBY_REPLICAS_CONFIG to 1 so that state would be maintained locally (and also to eliminate the InvalidStateStoreException).

The above setup worked fine with two instances. When I increased it to three instances, I started facing rebalancing issues:

StickyTaskAssignor:58 - Unable to assign 1 of 1 standby tasks for task [1009710637_0]. There is not enough available capacity. You should increase the number of threads and/or application instances to maintain the requested number of standby replicas.
    [INFO ] 2017-12-25 20:05:42.221 [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] StreamThread:888 - stream-thread [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED.
    [INFO ] 2017-12-25 20:05:42.221 [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] KafkaStreams:268 - stream-client [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449] State transition from REBALANCING to REBALANCING.
    [INFO ] 2017-12-25 20:05:42.276 [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] StreamThread:195 - stream-thread [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] partition assignment took 55 ms.
    current active tasks: [1009710637_0]
    current standby tasks: [1240464215_0, 1833680710_0]
    previous active tasks: []
    [INFO ] 2017-12-25 20:05:42.631 [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] StreamThread:939 - stream-thread [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] Shutting down
    [INFO ] 2017-12-25 20:05:42.631 [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] StreamThread:888 - stream-thread [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] State transition from PARTITIONS_ASSIGNED to PENDING_SHUTDOWN.
    [INFO ] 2017-12-25 20:05:42.633 [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] KafkaProducer:972 - Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
    [INFO ] 2017-12-25 20:05:42.638 [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] StreamThread:972 - stream-thread [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] Stream thread shutdown complete
    [INFO ] 2017-12-25 20:05:42.638 [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] StreamThread:888 - stream-thread [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] State transition from PENDING_SHUTDOWN to DEAD.
    [WARN ] 2017-12-25 20:05:42.638 [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] KafkaStreams:343 - stream-client [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449] All stream threads have died. The Kafka Streams instance will be in an error state and should be closed.
    [INFO ] 2017-12-25 20:05:42.638 [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449-StreamThread-1] KafkaStreams:268 - stream-client [app-03-cfaf7841-dc19-4ee4-9d05-ae4928c21449] State transition from REBALANCING to ERROR.
Viswapriya asked Nov 16 '25 04:11



1 Answer

I assume that your PartitionGrouper breaks something. It is quite hard to write a correct custom partition grouper, as you need to know a lot about Kafka Streams internals. Thus, it is not recommended in the first place.

The error itself means that a StandbyTask cannot be assigned to a thread because there are not enough threads. In general, a StandbyTask cannot be assigned to a thread that runs the corresponding "active" task or another copy of the same StandbyTask: doing so would not increase fault tolerance and would only waste memory, because if that thread dies, all of its tasks die with it.
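The placement rule described above can be illustrated with a small, self-contained Java sketch. It is not the StickyTaskAssignor implementation; the class name, the method eligibleHosts, and the host and task labels (h1, h2, t0) are all hypothetical, and "host" stands for whatever unit the assignor places tasks on. It only shows why a standby copy can become unassignable when too few hosts are available:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustration only: NOT the Kafka Streams StickyTaskAssignor.
public class StandbyEligibility {

    // Returns the hosts that may take one more standby copy of taskId:
    // those holding neither its active copy nor an existing standby copy.
    static List<String> eligibleHosts(String taskId,
                                      List<String> hosts,
                                      Map<String, Set<String>> activeByHost,
                                      Map<String, Set<String>> standbyByHost) {
        List<String> eligible = new ArrayList<>();
        for (String host : hosts) {
            boolean hasActive = activeByHost.getOrDefault(host, Set.of()).contains(taskId);
            boolean hasStandby = standbyByHost.getOrDefault(host, Set.of()).contains(taskId);
            if (!hasActive && !hasStandby) {
                eligible.add(host);
            }
        }
        return eligible;
    }

    public static void main(String[] args) {
        // Hypothetical layout: two hosts, task t0 active on h1.
        List<String> hosts = List.of("h1", "h2");
        Map<String, Set<String>> active = Map.of("h1", Set.of("t0"));
        Map<String, Set<String>> standby = new HashMap<>();

        // First standby copy: only h2 qualifies. Prints [h2]
        System.out.println(eligibleHosts("t0", hosts, active, standby));

        // After h2 takes that copy, a second copy has nowhere to go. Prints []
        standby.put("h2", Set.of("t0"));
        System.out.println(eligibleHosts("t0", hosts, active, standby));
    }
}
```

An empty result corresponds to the "not enough available capacity" situation in the warning; it does not explain why that happens in this particular three-instance setup.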

Why you get this error in particular is unclear (happy debugging :)).

However, for your use case, you should simply start separate application instances, each subscribing to an individual topic and using a different application.id, to scale out your application.
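A minimal sketch of that suggestion, assuming one application per topic: each one gets its own configuration with a distinct application.id. The "my-app-" prefix, the broker address localhost:9092, and the class and method names are placeholders, not values from the question. The keys "application.id" and "bootstrap.servers" are the string values behind StreamsConfig.APPLICATION_ID_CONFIG and StreamsConfig.BOOTSTRAP_SERVERS_CONFIG; plain strings are used here so the sketch compiles without the Kafka libraries.

```java
import java.util.Properties;

// Sketch only: builds one config per topic; it does not start Kafka Streams.
public class PerTopicAppConfig {

    // Builds the config for the application that reads a single topic.
    // The "my-app-" prefix is a hypothetical naming scheme.
    static Properties configFor(String topic, String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("application.id", "my-app-" + topic);
        props.setProperty("bootstrap.servers", bootstrapServers);
        return props;
    }

    public static void main(String[] args) {
        String[] topics = {"topic1", "topic2", "topic3", "topic4"};
        for (String topic : topics) {
            // localhost:9092 is a placeholder broker address.
            Properties props = configFor(topic, "localhost:9092");
            System.out.println(topic + " -> " + props.getProperty("application.id"));
            // With kafka-streams on the classpath, each config would be passed,
            // together with a topology that reads only this topic, to
            // new KafkaStreams(topology, props).
        }
    }
}
```

Because each application.id forms its own consumer group, these applications rebalance independently of one another, and no custom PartitionGrouper is needed.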

Matthias J. Sax answered Nov 18 '25 20:11
