This is an implementation of the Speculative Paxos protocol, as described in the paper "Designing Distributed Systems Using Approximate Synchrony in Datacenter Networks" from NSDI 2015.
Speculative Paxos is a state machine replication protocol based on the idea of co-designing distributed systems with the datacenter network. It offers excellent performance in environments where replicas receive most requests in the same order. The NSDI paper describes how to implement a "Mostly-Ordered Multicast" (MOM) primitive that provides this property with high probability.
In these environments, Speculative Paxos is able to complete operations without coordination. This allows it to achieve the theoretical minimum latency (one round trip from client to replicas) and high throughput, as replicas do not need to exchange messages in the normal case.
This repository contains implementations of 3 replication protocols:
-
Speculative Paxos, including normal operation, synchronization, and reconciliation protocols.
-
Viewstamped Replication (VR), aka Multi-Paxos, as described in the paper "Viewstamped Replication Revisited", including an optional batching optimization
-
Fast Paxos, although only the normal case is implemented.
-
A simple unreplicated RPC protocol for comparison
...as well as three applications:
-
a "null RPC" benchmark that executes operations as quickly as possible
-
a simple timestamp server, as might be used in a distributed storage system
-
a transactional key-value store ("nistore") that supports multiversioning and concurrency control using either strict two-phase locking or optimistic concurrency control. (This is derived from the codebase used in the TAPIR project.
Speculative Paxos and the applications can be built using make
. It
has been tested on Ubuntu 14.04, 16.04 and Debian 8. Regression tests
can be run with make check
Dependencies include (Debian/Ubuntu packages): protobuf-compiler pkg-config libunwind-dev libssl-dev libprotobuf-dev libevent-dev libgtest-dev
You will need to create a configuration file with the following syntax:
f <number of failures tolerated>
replica <hostname>:<port>
replica <hostname>:<port>
...
You can then start a replica with ./bench/replica -c <path to config file> -i <replica number> -m <mode>
, where mode
is either:
spec
for Speculative Paxosvr
for Viewstamped Replicationfastpaxos
for Fast Paxosunreplicated
for no replication (uses only the first replica)
The vr
mode also accepts a -b
option to specify the maximum batch
size.
To run a single client, use ./bench/replica -c <path to config file> -m <mode>
.
For performance measurements, you will likely want to add -DNASSERT
and -O2
to the CFLAGS
in the Makefile, and run make PARANOID=0
,
which disables complexity-changing assertions.
Speculative Paxos is a product of the UW Systems Lab. Please email Dan Ports at drkp@cs.washington.edu with any questions.