Intro:
I have a Python application using a Cassandra 1.2.4 cluster with a replication factor of 3; all reads and writes use a consistency level of 2. To access the cluster I use the CQL library. The Cassandra cluster runs on Rackspace's virtual servers.
The problem:
From time to time one of the nodes becomes slower than usual. When that happens I want to detect the situation, stop sending requests to the slow node, and if possible stop using it altogether (this should theoretically be possible, since the RF is 3 and the CL is 2 for every request). So far the solution I came up with involves timing the requests to each node and preventing future connections to the slow one, roughly as sketched below. This still doesn't solve the whole problem, though: even when I connect to another node, a particular query may end up being served by the slow node after the coordinator routes it.
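For reference, this is roughly what my timing-based workaround looks like (execute_on_host here stands in for whatever actually runs the CQL query against a given node; the names and thresholds are just illustrative):

    import time
    from collections import defaultdict

    # Keep an exponentially weighted moving average of request latency per node
    # and skip nodes that fall far behind the fastest one.
    LATENCY_ALPHA = 0.2   # weight given to the newest sample
    SLOW_FACTOR = 3.0     # a node is "slow" if it is 3x slower than the best node
    MIN_SAMPLES = 20      # don't judge a node before it has enough samples

    latency = {}
    samples = defaultdict(int)

    def record(host, elapsed):
        samples[host] += 1
        prev = latency.get(host)
        latency[host] = elapsed if prev is None else \
            (1 - LATENCY_ALPHA) * prev + LATENCY_ALPHA * elapsed

    def healthy_hosts(hosts):
        """Hosts that are not lagging far behind the fastest known node."""
        known = [latency[h] for h in hosts if samples[h] >= MIN_SAMPLES]
        if not known:
            return list(hosts)
        best = min(known)
        return [h for h in hosts
                if samples[h] < MIN_SAMPLES or latency[h] <= best * SLOW_FACTOR]

    def timed_execute(host, query, execute_on_host):
        """execute_on_host(host, query) is whatever issues the CQL query to that node."""
        start = time.time()
        try:
            return execute_on_host(host, query)
        finally:
            record(host, time.time() - start)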
The questions:
What's the best way of detecting the slow node from a Python application? Is there a way to stop using one of the Cassandra nodes from Python in this scenario without human intervention?
Thanks in advance!
Your manual solution of timing the requests is enough, if nodes that are slow to respond are also ones that are slow to process the query.
Internally, Cassandra will avoid slow nodes if it can by using the dynamic snitch. This orders nodes by recent latency statistics and avoids reading from the slowest nodes when the consistency level allows. NB writes go to all available replicas, but you don't have to wait for them all to respond if your consistency level allows.
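On the server side the dynamic snitch is tuned from cassandra.yaml; in the 1.2 series the relevant settings look roughly like this (the values below are the usual defaults, check your own yaml):

    # cassandra.yaml -- dynamic snitch tuning (per node, requires a restart)
    dynamic_snitch_update_interval_in_ms: 100     # how often latency scores are recalculated
    dynamic_snitch_reset_interval_in_ms: 600000   # how often the scores are reset
    # how much worse a node may score before reads are routed away from it;
    # lowering this towards 0 routes away from slow nodes more aggressively
    dynamic_snitch_badness_threshold: 0.1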
There may be some client-side support for what you want in a Python client - Astyanax in Java uses something very like the dynamic snitch in the client to avoid sending requests to slow nodes.
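Nothing stops you from approximating the same idea in Python on top of the per-node timings you already collect; a rough sketch of latency-weighted host selection (recent_latency is assumed to hold the smoothed latency per host from your timing code):

    import random

    def pick_host(hosts, recent_latency):
        """Choose a host with probability inversely proportional to its recent latency,
        so a slow node still sees occasional traffic and can recover its score."""
        weights = [1.0 / max(recent_latency.get(h, 0.001), 0.001) for h in hosts]
        total = sum(weights)
        r = random.uniform(0, total)
        cumulative = 0.0
        for host, weight in zip(hosts, weights):
            cumulative += weight
            if r <= cumulative:
                return host
        return hosts[-1]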