Is it possible to recover from a network partition in an mnesia cluster without restarting any of the nodes involved? If so, how does one go about it?
I'm interested specifically in knowing:
While I appreciate the pointers to general distributed systems theory, in this question I am interested in erlang/OTP mnesia only.
What Exactly is a Network Partition? A network partition in the context of distributed SQL databases like YugabyteDB happen when the network connectivity is split between the nodes due to a failure. For example, when a switch between two subnets fails.
While we refer to "network" partitions, really a partition is any case in which the different nodes of a cluster can have communication interrupted without any node failing.
A network partition is a division of a computer network into relatively independent subnets, either by design, to optimize them separately, or due to the failure of network devices. Distributed software must be designed to be partition-tolerant, that is, even after the network is partitioned, it still works correctly.
After some experimentation I've discovered the following:
force_load_table after the network is partitioned.So to answer my question, one can perform semi online recovery by executing mnesia:stop(), mnesia:start() on the nodes in the partition whose data you decide to discard (which I'll call the losing partition). Executing the mnesia:start() call will cause the node to contact the nodes on the other side of the partition. If you have more than one node in the losing partition, you may want to set the master nodes for table loading to nodes in the winning partition - otherwise I think there is a chance it will load tables from another node in the losing partition and thus return to the partitioned network state.
Unfortunately mnesia provides no support for merging/reconciling table contents during the startup table load phase, nor does it provide for going back into the table load phase once started.
A merge phase would be suitable for ejabberd in particular as the node would still have user connections and thus know which user records it owns/should be the most up-to-date for (assuming one user conneciton per cluster). If a merge phase existed, the node could filter userdata tables, save all records for connected users, load tables as per usual and then write the saved records back to the mnesia cluster.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With