Raft Leader Election in Consul
A small paper reading group has assembled at work. We give ourselves two to three weeks to read a paper, then meet up after hours, eat pizza, and discuss it. Our last paper focused on the Raft consensus algorithm, and I was chosen to lead the discussion.
To bring Raft closer to home, I put together a small demo of Raft’s leader election process using Consul. The demo spins up a three-node Consul cluster using containers, then interleaves the debug log output from all three nodes, filtered for raft. Reading through parts of the Raft paper, you can see how the logging output of HashiCorp’s implementation lines up.
Section 5.2 of the Raft paper focuses on leader election, and starts off with:
When servers start up, they begin as followers.
Sure enough, the first raft-filtered logs start with:
$ docker-compose up | grep raft
consul1 | [INFO] raft: Node at 172.17.0.45:8300 [Follower] entering Follower state
consul2 | [INFO] raft: Node at 172.17.0.44:8300 [Follower] entering Follower state
consul3 | [INFO] raft: Node at 172.17.0.43:8300 [Follower] entering Follower state
Next is the beginning of an election:
If a follower receives no communication over a period of time called the election timeout, then it assumes there is no viable leader and begins an election to choose a new leader.
That corresponds with:
consul1 | [WARN] raft: Heartbeat timeout reached, starting election
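The timeout check can be sketched roughly as follows. This is a minimal illustration, not HashiCorp's code: both function names are hypothetical, and the 150 to 300 ms range is the example interval from the Raft paper rather than anything read from Consul's configuration.

```python
import random

def random_election_timeout(low_ms=150, high_ms=300, rng=random):
    """Pick a randomized election timeout in milliseconds (hypothetical helper).

    Randomizing the timeout makes it unlikely that several followers
    time out and start competing elections at the same moment.
    """
    return rng.uniform(low_ms, high_ms)

def should_start_election(last_contact_ms, now_ms, timeout_ms):
    """Return True when no leader contact has arrived within the timeout."""
    return (now_ms - last_contact_ms) >= timeout_ms

timeout = random_election_timeout()
# Last leader contact at t=0; check the follower's view at two later times.
print(should_start_election(0, 100, timeout))  # 100 ms is below any timeout in range
print(should_start_election(0, 400, timeout))  # 400 ms is above any timeout in range
```

With three followers starting at the same time, whichever one draws the shortest timeout is the most likely to become a candidate first, which fits consul1 being the node that logs the timeout here.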
Now that the election has started, there needs to be a winner:
A candidate wins an election if it receives votes from a majority of the servers in the full cluster for the same term.
Which goes with:
consul1 | [DEBUG] raft: Votes needed: 2
consul1 | [DEBUG] raft: Vote granted. Tally: 1
consul1 | [DEBUG] raft: Vote granted. Tally: 2
consul1 | [INFO] raft: Election won. Tally: 2
consul1 | [INFO] raft: Node at 172.17.0.45:8300 [Leader] entering Leader state
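The "Votes needed: 2" line is just the majority arithmetic for a three-node cluster. Here is a small sketch of that tally, assuming the candidate's vote for itself counts as the first one; the function names are made up for illustration and are not taken from the Consul or Raft source.

```python
def votes_needed(cluster_size):
    """Smallest number of votes that forms a strict majority."""
    return cluster_size // 2 + 1

def run_election(cluster_size, peer_responses):
    """Tally votes for a candidate (hypothetical helper).

    `peer_responses` lists peer replies in arrival order (True = vote granted).
    The candidate's own vote is counted first. Returns (won, tally).
    """
    needed = votes_needed(cluster_size)
    tally = 0
    for granted in [True] + list(peer_responses):
        if granted:
            tally += 1
        if tally >= needed:
            # Stop as soon as a majority is reached; later replies don't matter.
            return True, tally
    return False, tally

print(votes_needed(3))                  # 2
print(run_election(3, [True, True]))    # (True, 2)
print(run_election(3, [False, False]))  # (False, 1)
```

The election is decided as soon as the tally reaches the majority, which is why the log reports "Election won. Tally: 2" without waiting for a third vote.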
AppendEntries is used to communicate the new leader to all other servers, including any that are still candidates:
While waiting for votes, a candidate may receive an AppendEntries RPC from another server claiming to be leader.
The logs from consul1 show that it is replicating to its two peers:
consul1 | [INFO] raft: pipelining replication to peer 172.17.0.44:8300
consul1 | [INFO] raft: pipelining replication to peer 172.17.0.43:8300
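From the receiving side, the rule in Section 5.2 of the paper is that a candidate steps down if the AppendEntries sender's term is at least as large as its own, and rejects the RPC otherwise. Here is a minimal sketch of that decision; the `State` enum and function name are hypothetical and not drawn from HashiCorp's implementation.

```python
from enum import Enum

class State(Enum):
    FOLLOWER = "Follower"
    CANDIDATE = "Candidate"

def candidate_on_append_entries(current_term, rpc_term):
    """Decide how a candidate reacts to an AppendEntries RPC.

    Returns (new_state, new_term, accepted).
    """
    if rpc_term < current_term:
        # The sender is from an older term: reject it and keep campaigning.
        return State.CANDIDATE, current_term, False
    # The sender's term is current or newer: recognize it as the legitimate
    # leader, adopt its term, and return to follower state.
    return State.FOLLOWER, rpc_term, True

print(candidate_on_append_entries(current_term=2, rpc_term=2))
print(candidate_on_append_entries(current_term=3, rpc_term=2))
```

The first call models a candidate that lost the race in its own term and steps down; the second models a stale leader being ignored.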