
Error while converting a MongoDB Cluster into a Replica Set

Tags:

mongodb

We followed the instructions to Convert a Cluster with a Single Shard into a Replica Set. As soon as we restarted the first secondary (of 3 secondaries + 1 primary in total) without the --shardsvr option, all database clients received the error below while querying the database. These clients already connect directly to the replSet, without problems, instead of going through the mongos routers.

Query failed with error code 211 and error message 'Cache Reader No keys found for HMAC that is valid for time: { ts: Timestamp(1585205456, 422) } with id: 6802955028354040016' on server our-db-server.domain.com:27017

We therefore reverted the change immediately. This error makes it impossible for us to convert the single-shard cluster into a standalone replSet. How should we proceed? Thanks!

Kay asked Sep 06 '25 19:09


1 Answer

I think the likely case is that the clusterTime logical clock is out of sync between the replicas and/or your client.

This section specifically deals with how clusterTime is used for HMAC signatures.

clusterTime ticks on essentially every write to a PRIMARY, so if you change the cluster's configuration in some way, a tick is recorded. If your client does not correctly update its clusterTime after that tick, it may get caught trying to use an old key for HMAC signing of its requests.
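To make the "client must keep up with the tick" idea concrete, here is a minimal sketch of a client-side tracker that only ever moves its remembered clusterTime forward. `ClusterTime` and `ClusterTimeTracker` are hypothetical names for illustration, not part of any MongoDB driver; real drivers handle this internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True, order=True)
class ClusterTime:
    # Ordered by seconds first, then by the increment within that second,
    # mirroring the Timestamp(seconds, increment) shown in the error message.
    seconds: int
    increment: int


class ClusterTimeTracker:
    """Hypothetical helper: remembers the highest clusterTime seen so far."""

    def __init__(self) -> None:
        self.latest: Optional[ClusterTime] = None

    def observe(self, reply_time: ClusterTime) -> None:
        # Only move forward; a stale reply must never roll the clock back.
        if self.latest is None or reply_time > self.latest:
            self.latest = reply_time


tracker = ClusterTimeTracker()
tracker.observe(ClusterTime(1585205456, 422))
tracker.observe(ClusterTime(1585205456, 100))  # older reply, ignored
tracker.observe(ClusterTime(1585205460, 1))    # newer reply, adopted
```

A client that skips the `observe` step after a reply would keep sending the older time, which is the failure mode described above.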

Likely something is wrong with the cluster as stated above and the client heartbeat isn't correcting the clock.

It could also be that your client library is not updating clusterTime when doing auth handshakes, as per this ticket: https://jira.mongodb.org/browse/GODRIVER-1584.


I'm just going to take some notes here as I am also seeing this error when upgrading from 3.6 to 4.0.21.

I figured out that this error is returned by MongoDB itself by searching the repo using Sourcegraph.

The error comes from KeysCollectionManager::getKeysForValidation, and is rooted in the KeysCollectionCache::getInternalByKeyId method.

This particular KeysCollectionManager does seem to instantiate itself only for replica sets. It is instantiated using kKeyManagerPurposeString, whose value is always "HMAC". Its purpose, I'm not sure of.
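As a rough mental model of what the error message implies, the lookup can be pictured as a search for a key with a given id that is still valid at a given time. The sketch below is a toy model inferred only from the error text, not the server's implementation; `SigningKey`, `find_key`, the `expires_at` field, and the exact comparison are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Timestamp = Tuple[int, int]  # (seconds, increment), compared lexicographically


@dataclass(frozen=True)
class SigningKey:
    key_id: int
    purpose: str
    expires_at: Timestamp  # assumed field: last time the key is considered valid


def find_key(keys: List[SigningKey], key_id: int, at_time: Timestamp,
             purpose: str = "HMAC") -> Optional[SigningKey]:
    """Return the key matching id and purpose that is still valid at at_time."""
    for key in keys:
        if (key.key_id == key_id and key.purpose == purpose
                and key.expires_at >= at_time):
            return key
    return None  # corresponds to "No keys found for HMAC that is valid for time"


# A member whose cache only holds some other key cannot validate the request.
cache = [SigningKey(key_id=1, purpose="HMAC", expires_at=(1600000000, 0))]
missing = find_key(cache, 6802955028354040016, (1585205456, 422))
```

In this model the error appears either when the id is unknown to the member doing the lookup or when the matching key has already expired for the requested time.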

The globalsign/mgo fork of the Go client has some test harnesses indicating that its maintainers have run into this before but don't necessarily know where it comes from. They hypothesize that it only happens when your replica set uses an internal auth keyfile.

I would hope that this situation would be resolved by multiple retries until successful. I wager it's mostly an issue of the cluster not being in a ready state.

My situation is vastly different from the question's, as I am using a standalone replica set without sharding. I would hope retrying requests would help, but it could be a cluster membership issue that needs to be resolved on MongoDB itself.
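If retrying does turn out to help, a wrapper along these lines is one way to do it. This is only a sketch: `QueryError` is a hypothetical stand-in for whatever exception your driver raises, `retry_on_key_not_found` is a made-up helper name, and 211 is simply the error code quoted in the question.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

KEY_NOT_FOUND = 211  # error code quoted in the question


class QueryError(Exception):
    """Hypothetical stand-in for a driver exception carrying a server error code."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(message)
        self.code = code


def retry_on_key_not_found(op: Callable[[], T], attempts: int = 5,
                           base_delay: float = 0.5) -> T:
    """Run op, retrying with exponential backoff only on error code 211."""
    for attempt in range(attempts):
        try:
            return op()
        except QueryError as exc:
            # Re-raise anything that is not code 211, and give up on the last try.
            if exc.code != KEY_NOT_FOUND or attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

With PyMongo, the equivalent check would likely be on the `code` attribute of `pymongo.errors.OperationFailure`. Retrying only papers over the symptom; if the error persists after several attempts, the cluster itself probably needs attention.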

This post in the MongoDB community likely indicates it is a cluster internals issue and looking for suspicious MongoDB logs would be key.

Breedly answered Sep 09 '25 10:09