
Restore Etcd Quorum

I have a Kubernetes cluster deployed on AWS with kops, consisting of 3 master nodes, each in a different AZ. In a kops deployment, etcd runs on each master node as two pods, each of which mounts an EBS volume to persist its state. If you lose the volumes of 2 of the 3 masters, etcd loses quorum, because the single remaining member is no longer a majority.
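To make the quorum arithmetic concrete, here is a minimal sketch. It assumes the usual majority rule (more than half of the voting members must be healthy); the function names `quorum` and `has_quorum` are made up for illustration and are not part of etcd or kops.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def has_quorum(members: int, healthy: int) -> bool:
    """True if the healthy members still form a majority."""
    return healthy >= quorum(members)


total_masters = 3
lost_volumes = 2
surviving = total_masters - lost_volumes

print(quorum(total_masters))                 # 2
print(has_quorum(total_masters, surviving))  # False
```

With 3 members the cluster tolerates the loss of one member, not two, which matches the failure described here.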

Is there a way to take the state held by the only master that still has the cluster data and re-establish quorum among the three masters from it? I reproduced this scenario, but the cluster becomes unavailable and I can no longer access the etcd pods on any of the 3 masters, because those pods fail with an error. Moreover, etcd itself becomes read-only, so it is impossible to add or remove cluster members as a manual intervention.

Any tips? Thanks to all of you.

falberto89 asked Feb 04 '26 14:02



1 Answer

This is documented here. There's also another guide here.

You basically have to back up your cluster and create a brand-new one.

Rico answered Feb 06 '26 04:02
