This post covers an interesting paper released last week by Heidi Howard and Richard Mortier: Paxos vs Raft: Have we reached consensus on distributed consensus?
Why would you want to read this paper
If you want to have an example of how to explain (or understand) Paxos easily, I believe it’s one of the best attempts I have read.
For two reasons.
The use of a Raft-like language removes a lot of abstraction, which makes it easier to understand. And the focus on the differences between Paxos and Raft help the subtle specifics of Paxos.
In the next few paragraphs, I’m going to highlight the key ideas from this paper. Follow the link to the full paper is at the bottom to read all the details.
In the last decades, we used distributed systems more and more. And these distributed systems often require some distributed consensus to provided consistency guarantees. The main two algorithms we see uses in the industry are almost always either Paxos, or Raft.
Paxos is the classic standard solution. Leslie Lamport published the Part Time Parliament paper (aka Paxos paper) in 1998 which then generated a huge amount of research, improvements and refinements.
Paxos is used in many important systems, like Cassandra, Google uses it in Spanner and Chubby, which powers BigTable, Spanner, AWS’s DynamoDB, Ceph storage, …
Diego Ongaro and John Ousterhout published the In Search of an Understandable Consensus Algorithm paper much more recently, in 2014. Raft claims to be as efficient as Paxos, while being more understandable.
Some very popular projects rely on Raft: CoreOS’s etcd, one of the key components of Kubernetes, HashiCorp’s Consul and Vault, RethinkDB or CockroachDB, …
In other words: Raft solves a problem already solved by Paxos 16 years ago, yet it gained a lot of momentum over the last years. Why is it so successful? Is the algorithm much better or not?
Howard and Mortier’s paper tries to answer this by decomposing this into a few questions:
- how can we compare Paxos and Raft?
- what are the key differences between them
- is one of them better than the other to address the distributed consensus problem?
They found that Raft is very popular for 3 main reasons
The Raft paper has a pragmatic, easy to understand presentation
Raft is easy to understand, because it’s brilliantly presented.
The algorithm is not significantly easier to understand, but the presentation of the algorithm makes it easy to understand.
And to prove their point, Howard and Mortier present the (Multi-)Paxos algorithm as if it were presented like Raft, and show that both are very similar and similarly easy to grasp, once presented without needless abstractions.
Simplicity and understandability
Ongaro and Ousterhout tried to minimize the number of messages.
Raft does not allow a log entry to be written before being committed, which Paxos allows.
Paxos provides out-of-order writes, but this makes the algorithm a little bit harder to understand.
Raft’s requirements simplify this: a log entry for an operation, identified with an index in a term, will be unique and won’t change.
These trade-offs make the algorithm simpler, and slightly - but not significantly - more undertsandable.
The paper does a great job in highlighting how the language used to explain the algorithm goes a long way at making it easier to understand:
The RequestVote RPCs are often referred to as phase1a and phase1b messages, prepare request and response or prepare and promise messages.
The AppendEntries RPCs are often referred to as phase2a and phase2b messages, accept request and response or propose and accept messages.
To summarize this quote: Raft have fewer messages, and the name used explain the intent very well. RequestVote and AppendEntries RPCs are abstract name and convey the intent. Actually, these messages are described in just a single column of the Raft paper.
Raft takes a novel approach at leader election
The Raft algorithm is slightly different (and simpler) than Paxos’s leader election:
One of the main differences between them is how Raft simplifies the leader election algorithm, where only a candidate with all the latest operations can be elected.
Overall, compared to Paxos, Raft’s approach to leader election is surprisingly efficient for such a simple approach.
Here I’m going to just quote most of the summary of the paper, because it’s really short and straight to the point:
The Raft algorithm was proposed to address the longstanding issues with understandability of the widely studied Paxos algorithm. In this paper, we have demonstrated that much of the understandability of Raft comes from its pragmatic abstraction and excellent presentation.
By describing a simplified Paxos algorithm using the same approach as Raft, we find that the two algorithms differ only in their approach to leader election.
The Raft paper claims that Raft is significantly more understandable than Paxos, and as efficient. On the contrary, we find that the two algorithms are not significantly different in understandability.
When I read this paper, it immediately reminded me of a talk by Armon Dadgar, HashiCorp’s CTO, comparing Raft and Paxos and why Raft’s simplicity was a key feature to build large scale production systems.
The segment of this talk is only 5 min long, and Armon’s usual playful tone makes this quest to implement a distributed consensus production-ready system very interesting:
Leaving the Ivory Tower: Research in the Real World compares Paxos and Raft and why HashiCorp choose Raft for their products, in a very pragmatic and thoughtful way. He brilliantly shows why the pragmatic and simpler approach and presentation of Raft was indeed key to its success.
The part related to Paxos and Raft spans between 30:24 and 35:08. I highly recommend it! (both this part, and the whole talk)
Here are a few resources if you want to continue on this:
You can have a glimpse into the Paxos vs Raft paper with a quick overview of the paper presented in a video by Heidi Howard herself:
Neat Algorithms - PAXOS is an illustrated guide into how Paxos works.
Obviously, you can read the main paper we talked about: Howard and Mortier, “Paxos vs Raft: Have we reached consensus on distributed consensus?”
Lamport, “The Part-Time Parliament.” (the original Paxos paper) by Leslie Lamport
Raft - An Understandable Consensus Algorithm: Ongaro and Ousterhout, “In Search of an Understandable Consensus Algorithm.”