Raft Consensus Algorithm

What is Consensus?

In distributed systems, consensus is the process of getting multiple, independent nodes (servers) to agree on a single value or sequence of values, even in the presence of failures (like crashes or network partitions). It's fundamental for building reliable systems where data needs to be consistent across replicas, such as distributed databases, configuration management, and state machine replication.

Why is Consensus Hard?

Raft Explained

Raft is a consensus algorithm designed to be understandable and easier to implement than its predecessor, Paxos. Its primary goal is to manage a replicated log – an ordered sequence of commands applied consistently across multiple servers.
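To make the idea of a replicated log concrete, here is a minimal sketch of several replicas applying the same ordered commands to a tiny key-value state machine. The names `LogEntry`, `Replica`, `KeyValueStateMachine`, and the `("set", key, value)` command format are hypothetical and chosen for illustration only; Raft itself does not prescribe them.

```python
from dataclasses import dataclass, field


@dataclass
class LogEntry:
    term: int       # term in which the entry was created
    command: tuple  # hypothetical command format: ("set", key, value)


@dataclass
class KeyValueStateMachine:
    data: dict = field(default_factory=dict)

    def apply(self, command):
        op, key, value = command
        if op == "set":
            self.data[key] = value


@dataclass
class Replica:
    log: list = field(default_factory=list)
    last_applied: int = 0  # number of log entries applied so far
    state_machine: KeyValueStateMachine = field(default_factory=KeyValueStateMachine)

    def apply_committed(self, commit_count):
        """Apply the first `commit_count` entries, strictly in log order."""
        limit = min(commit_count, len(self.log))
        while self.last_applied < limit:
            self.state_machine.apply(self.log[self.last_applied].command)
            self.last_applied += 1


# Three replicas hold identical logs, so they reach identical states.
shared_log = [
    LogEntry(1, ("set", "x", 1)),
    LogEntry(1, ("set", "y", 2)),
    LogEntry(2, ("set", "x", 3)),
]
replicas = [Replica(log=list(shared_log)) for _ in range(3)]
for replica in replicas:
    replica.apply_committed(commit_count=3)
```

Because every replica applies the same commands in the same order, the state machines end up identical; keeping the logs themselves identical is the job of the consensus algorithm.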

Key Concepts

Leader Election in Raft

  1. Timeout: Followers wait for communication (AppendEntries RPCs) from a leader. If a follower's election timeout elapses before it hears from a leader or grants a vote to a candidate, it assumes the leader has failed.
  2. Become Candidate: The follower increments its current term, transitions to the Candidate state, votes for itself, and resets its election timer.
  3. Request Votes: The candidate sends RequestVote RPCs to all other servers in parallel, including its current term and information about its log.
  4. Voting: Other servers respond based on rules:
    • If the RPC's term is less than the receiver's current term, the vote is rejected.
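    • If the receiver has already voted for a different candidate in the *same* term, or if the receiver's own log is "more up-to-date" than the candidate's (its last entry has a later term, or the same term and a longer log), it rejects the vote.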
    • Otherwise, the receiver grants its vote, updates its `votedFor` record for the term, and resets its *own* election timeout.
  5. Outcome:
    • Wins Election: If the candidate receives votes from a majority of servers, it becomes the new Leader. It immediately starts sending heartbeats (AppendEntries) to establish authority.
    • Another Leader Emerges: If the candidate receives an AppendEntries RPC from another server claiming to be leader *in the current or a higher term*, the candidate recognizes the new leader and reverts to the Follower state.
    • Election Timeout (Split Vote): If a candidate neither wins nor discovers a new leader before its *own* election timer expires (e.g., due to a split vote where no candidate gets a majority), it increments its term and starts a *new* election. Randomized election timeouts make it unlikely that several candidates time out at the same moment, which helps prevent repeated split votes.
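The election steps above can be sketched in a few lines of Python. This is a simplified, in-memory illustration and not a full implementation: there is no networking or real timer, and the names `Server`, `handle_request_vote`, `run_election`, and `random_election_timeout` are hypothetical. The 150 to 300 ms timeout range is an assumed default, not something this page specifies.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    server_id: int
    current_term: int = 0
    voted_for: Optional[int] = None
    last_log_index: int = 0
    last_log_term: int = 0
    state: str = "follower"

    def handle_request_vote(self, term, candidate_id, last_log_index, last_log_term):
        """Return True if this server grants its vote to the candidate."""
        if term < self.current_term:
            return False  # stale candidate
        if term > self.current_term:
            # A newer term clears any earlier vote and demotes this server.
            self.current_term = term
            self.voted_for = None
            self.state = "follower"
        if self.voted_for not in (None, candidate_id):
            return False  # already voted for someone else in this term
        # Up-to-date check: compare last entry terms first, then log length.
        candidate_log = (last_log_term, last_log_index)
        own_log = (self.last_log_term, self.last_log_index)
        if candidate_log < own_log:
            return False
        self.voted_for = candidate_id
        return True  # a real server would also reset its election timer here


def random_election_timeout(low_ms=150.0, high_ms=300.0):
    """Pick a randomized timeout so servers rarely time out together (assumed range)."""
    return random.uniform(low_ms, high_ms)


def run_election(candidate, peers):
    """Steps 2 to 5: become candidate, request votes, count the outcome."""
    candidate.current_term += 1
    candidate.state = "candidate"
    candidate.voted_for = candidate.server_id
    votes = 1  # the candidate's own vote
    for peer in peers:
        if peer.handle_request_vote(
            candidate.current_term,
            candidate.server_id,
            candidate.last_log_index,
            candidate.last_log_term,
        ):
            votes += 1
    cluster_size = len(peers) + 1
    if votes > cluster_size // 2:
        candidate.state = "leader"
    return votes


servers = [Server(i) for i in range(5)]
votes = run_election(servers[0], servers[1:])
```

In this run every peer starts with an empty log and no vote cast, so the first candidate collects all five votes. The sketch calls peers sequentially for simplicity, whereas a real candidate sends the RPCs in parallel and must also handle an AppendEntries message arriving mid-election.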

Log Replication (Simplified)

Paxos

Paxos is another famous consensus algorithm, historically significant and proven correct, but generally considered harder to understand and implement than Raft. It achieves consensus through phases involving "Prepare" and "Accept" messages, proposals, and quorums, without relying on a designated stable leader in the same way Raft does (though leader-based variants exist). Visualizing its message flow accurately is significantly more complex.
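For comparison, the "Prepare" and "Accept" phases can be sketched for single-value (single-decree) Paxos as follows. This is an illustrative in-memory sketch under simplifying assumptions (no networking, no failures, integer proposal numbers); the names `Acceptor`, `on_prepare`, `on_accept`, and `propose` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Acceptor:
    promised: int = -1          # highest proposal number promised so far
    accepted_number: int = -1   # number of the last accepted proposal, if any
    accepted_value: Optional[str] = None

    def on_prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_number, self.accepted_value
        return False, self.accepted_number, self.accepted_value

    def on_accept(self, n, value):
        """Phase 2: accept unless a higher-numbered promise was made."""
        if n >= self.promised:
            self.promised = n
            self.accepted_number = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors, n, value):
    """Run both phases; return the chosen value, or None if no quorum."""
    quorum = len(acceptors) // 2 + 1
    replies = [a.on_prepare(n) for a in acceptors]
    granted = [(num, val) for ok, num, val in replies if ok]
    if len(granted) < quorum:
        return None
    # If any acceptor already accepted a value, the proposer must adopt the
    # value of the highest-numbered accepted proposal instead of its own.
    prior_number, prior_value = max(granted, key=lambda reply: reply[0])
    if prior_number >= 0:
        value = prior_value
    accepts = sum(1 for a in acceptors if a.on_accept(n, value))
    return value if accepts >= quorum else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, n=1, value="A")
second = propose(acceptors, n=2, value="B")
stale = propose(acceptors, n=1, value="C")
```

Here the second proposer asks for "B" but has to adopt "A", because a quorum already accepted it, and the stale proposal numbered 1 is refused outright. Any proposer may start a round at any time, which is one reason the message flow is harder to follow than Raft's single-leader design.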

Visualize Raft

Configure the cluster, fail/recover nodes, and observe leader elections and heartbeats. (Log replication is shown simply by appending entries).
