% Title = "The Stellar Consensus Protocol (SCP)" % abbrev = "scp" % category = "exp" % docName = "draft-mazieres-dinrg-scp-00" % ipr= "trust200902" % area = "Internet" % workgroup = "" % keyword = ["consensus"] % % date = 2018-03-05T00:00:00Z % % [[author]] % initials="N." % surname="Barry" % fullname="Nicolas Barry" % #role="editor" % organization = "Stellar Development Foundation" % [author.address] % email = "nicolas@stellar.org" % [author.address.postal] % street = "170 Capp St., Suite A" % city = "San Francisco, CA" % code = "94110" % country = "US" % % [[author]] % initials="D." % surname="Mazieres" % fullname="David Mazieres" % #role="editor" % organization = "Stanford University" % [author.address] % email = "dm@uun.org" % [author.address.postal] % street = "353 Serra Mall, Room 290" % city = "Stanford, CA" % code = "94305" % country = "US" % % [[author]] % initials="J." % surname="McCaleb" % fullname="Jed McCaleb" % #role="editor" % organization = "Stellar Development Foundation" % [author.address] % email = "jed@stellar.org" % [author.address.postal] % street = "170 Capp St., Suite A" % city = "San Francisco, CA" % code = "94110" % country = "US" % % [[author]] % initials="S." % surname="Polu" % fullname="Stanislas Polu" % #role="editor" % organization = "Stripe Inc." % [author.address] % email = "stan@stripe.com" % [author.address.postal] % street = "185 Berry Street, Suite 550" % city = "San Francisco, CA" % code = "94107" % country = "US"
.# Abstract
SCP is a protocol for Internet-level Byzantine agreement.
{mainmatter}
Various aspects of Internet infrastructure depend on irreversible and transparent updates to data sets such as authenticated mappings [cite Li-Man-Watson draft]. Examples include public key certificates mapping domain names to public keys, transparency logs [@?RFC6962], HSTS [@?RFC6797] and HPKP [@?RFC7469] preload lists.
The Stellar Consensus Protocol (SCP) specified in this draft allows various Internet infrastructure stakeholders to collaborate in applying irreversible transactions to public state. SCP is an open Byzantine agreement protocol that resists Sybil attacks by allowing individual parties to specify minimum quorum memberships in terms of specific trusted peers. Hence, participants can choose specific peers they know to be trustworthy and enjoy safety so long as the peers they depend on honestly obey the protocol.
An more detailed abstract description of the SCP protocol and English-language proof of safety is available in [@?SCP]. This document specifies the end-system logic and wire format of the messages.
Each participant or node in the SCP protocol has a digital signature
key and is named by the corresponding public key, which we term a
NodeID
.
Each node also selects one or more groups of nodes (each of which includes itself) called quorum slices. A quorum slice represents a large or important enough set of peers that the node selecting the quorum slice believes the slice collectively speaks for the whole network.
A quorum is a non-empty set of nodes containing at least one quorum
slice of each of its members. For instance, suppose v1
has the
single quorum slice {v1, v2, v3}
, and each of v2
, v3
, and v4
has the single quorum slice {v2, v3, v4}
. In this case, {v2, v3, v4}
is a quorum, because it contains a slice for each member. On the
other hand {v1, v2, v3}
is not a quorum, because it does not contain
a quorum slice for v2
or v3
. The smallest quorum including v1
in this example is the set of all nodes {v1, v2, v3, v4}
.
Unlike traditional Byzantine agreement protocols, nodes in SCP only
care about quorums to which they belong themselves (and hence which
contain at least one of their quorum slices). Intuitively, this is
what protects nodes from Sybil attacks. In the example above, if v3
deviates from the protocol maliciously invents 96 Sybils v5, v6, ..., v100
, the honest nodes quorums will all still include one another.
Every message in the SCP protocol specifies the sender's quorum slices. Hence, by collecting messages a node dynamically learns what constitutes a quorum, and can decide when a particular message has been sent by a quorum to which it belongs. (Again, nodes to not care about quorums to which they do not belong themselves.)
The SCP protocol outputs a series of values each associated with a consecutively numbered slot. 5 seconds after SCP outputs the value for one slot, nodes restart the protocol to select a value for the next slot. The time between slots is used to construct candidate values for the next slot, as well as to amortize the cost of consensus over an arbitrary-sized batch of operations.
There must exist a deterministic combining function that reduces multiple candidate values into a single composite value. Hence, should nodes nominate multiple values, they can still deterministically converge on the same composite value. Note that this does not mean that every node has a say in the composite value for every slot. Whether or not a node can in practice influence a given slot depends on how many other nodes include that node in their quorum slices as well as how the current slot number perturbs a cryptographic hash of nodes' public keys.
The protocol consists of exchanging digitally-signed messages
containing nodes' quorum slices. When multiple nodes send the same
messages, two thresholds have significance throughout the protocol for
each node v
.
-
quorum threshold: When
v
node is a member of a quorum in which every member (includingv
) has issued some signed statement. -
blocking threshold: When every one of
v
's quorum slices has a member issuing some signed statement (which doesn't necessarily includev
).
Message formats are specified using XDR [@!RFC4506].
SCP employs 32- and 64-bit integers, as defined below.
typedef unsigned int uint32;
typedef int int32;
typedef unsigned hyper uint64;
typedef hyper int64;
SCP uses the SHA-256 cryptograhpic hash function [@!RFC6234], and represents hash values as a simple array of 32 bytes.
typedef opaque Hash[32];
SCP employes the Ed25519 digital signature algorithm [@!RFC8032]. For purposes of cryptographic agility, however, public keys are represented as a union type that can be compatibly extended with other key types.
typedef opaque uint256[32];
enum PublicKeyType
{
PUBLIC_KEY_TYPE_ED25519 = 0
};
union PublicKey switch (PublicKeyType type)
{
case PUBLIC_KEY_TYPE_ED25519:
uint256 ed25519;
};
// variable size as the size depends on the signature scheme used
typedef opaque Signature<64>;
Nodes are public keys, while values are simply opaque arrays of byes.
typedef PublicKey NodeID;
typedef opaque Value<>;
Theoretically a quorum slice can be an arbitrary set of sets of nodes. However, it would not be possible to represent an arbitrary predicate on sets concisely. Instead we specify quorum slices as any set of k-of-n members, where each of the n members can either be an individual node ID, or, recursively, another k-of-n predicate.
// supports things like: A,B,C,(D,E,F),(G,H,(I,J,K,L))
// only allows 2 levels of nesting
struct SCPQuorumSet
{
uint32 threshold; // the n in k-of-n
PublicKey validators<>;
SCPQuorumSet innerSets<>;
};
Nomination message are sent in the first phase of each slot in an attempt to select a value on which to try to agree.
struct SCPNomination
{
Hash quorumSetHash; // D
Value votes<>; // X
Value accepted<>; // Y
};
As with all messages, the nomination protocol includes a SHA-256 hash of the sender's XDR-encoded SCPQuorumSet. It also includes two monotonically-growing sets of values.
votes
consists of candidate values nominated by the sender. Each
node progresses through a series of nomination rounds in which it
potentially adds more and more values to votes
by adding votes
for
valid values received from a growing set of peers to its own set of
votes
. Each node computes the set of peers whose nomination votes
it should echo based on its quorum slices as follows for round n
of
slot i
:
-
Let
Gi(m) = SHA-256(i || output[i-1] || m)
, whereoutput[i-1]
is the consensus output of slot i-1 or the zero-byte value for slot 0. (Recall values are encoded as an XDR opaque vector, with a 32-byte length followed by bytes padded to a multiple of 4 bytes.) Treat the output ofGi
as a 256-bit binary number in big-endian format. -
For each peer
v
, defineweight(v)
as the faction of quorum slices containingv
. -
Define the set of nodes
neighbors(n)
as the set of nodes v for whichGi("N" || n || v) < 2^{256}-1 * weight(v)
. -
Define
priority(n, v)
asGi("P" || n || v)
.
For each round n
until nomination starts, a node picks the available
peer v
with the highest value of priority in the given round from
among the nodes in neighbors(n)
, and adds any values in that node's
votes
set to its own votes
set. Round n
lasts for 2+n
seconds, after which if nomination has not finished (see below), a
node proceeds to round n+1
.
If a particular valid value x
reaches quorum threshold in the
messages sent by peers (meaning that every node in a quorum contains
x
either in the votes
or the accepted
field), then a node moves
x
from its votes
field to its accepted
field and broadcasts a
new SCPNomination
message. Similarly, if x
reaches blocking
threshold in a node's peers' accepted
field (meaning every one of a
node's quorum slices contains at least one node with x
in its
accepted
field), then the node adds x
to its accepted
field
(removing it from votes
if applicable).
The nomination process terminates at a node when any value x
reaches
quorum threshold in the accepted
fields, meaning every node in a
quorum that includes the node has sent a signed SCPNomination
message in which the accepted
field contains x
. Once nomination
terminates, nodes stop adding new values to their votes
sets, but
potentially continues adding new values to accepted
as previously
described.
Once the nomination process is complete, meaning at least one
accepted
value has reached quorum threshold, a node moves on to the
balloting phase of the protocol. A ballot is a pair of a counter
(n
) and candidate value (x
):
struct SCPBallot
{
uint32 counter; // n
Value value; // x
};
The counter beings at 1, and is incremented to higher numbers if
ballot 1 fails to reach consensus on an output value. The value x
begins as the output of the deterministic combining function on all
values that have achieved quorum threshold in the accepted
fields of
SCPNomination
messages (meaning all values for which the local node
is part of a quorum in which every node includes the value in the
accepted
field of an SCPNomination
message). Note, however, that
x
may change in subsequent ballots if more values reach this quorum
threshold (since nomination keeps running in the background, even
though a node does not add new values to its votes
field in
SCPNomination
). Moreover, as described below, x
may change if a
particular ballot makes it a certain way through the protocol before
failing.
Once they have selected x
for a particular ballot, nodes attempt to
prepare
the ballot by sending the following message:
struct SCPPrepare
{
Hash quorumSetHash; // D
SCPBallot ballot; // b
SCPBallot* prepared; // p
SCPBallot* preparedPrime; // p'
uint32 nC; // c.n
uint32 nH; // h.n
};
The fields have the following meaning:
-
quorumSetHash
- the SHA-256 hash of the sending node'sSCPQuorumSet
structure. -
ballot
- the current ballot (see below) -
prepared
- the highest ballot<n,x>
for which one of the following two thresholds has been met:- The sending node is part of a quorum in which each node included
<n',x>
forn' >= n
in one of theballot
,prepared
orpreparedPrime
fields of anSCPPrepare
message. - Every one of the sending node's quorum slices contains at least
one node that sent an
SCPPrepare
containing some<n',x>
withn' >= n
in either theprepared
orpreparedPrime
field. If no such ballot exists, thenprepared
isNULL
.
- The sending node is part of a quorum in which each node included
-
preparedPrime
- the highest ballot<n,x>
that satisfies the same criteria asprepared
, but has a different valuex
from that in theprepared
field.NULL
if no such ballot exists. -
nH
- the counter from the highest ballot<n,x>
for which the sender is in a quorum in which each member has sent anSCPPrepare
message with one of the following properties (ornH = 0
if no such ballot exists):- The
prepared
field contains<n',x>
wheren' >= n
, or - The
preparedPrime
field contains<n',x'>
wheren' >= n
andx'
can be any value.
- The
-
nC
- the counter for the lowest ballot the sender is attempting to confirm (see below), otherwise 0.
The value
field x
of each ballot is selected as follows. If nH = 0
, then use the deterministic combination function on all values that
have made it all the way through the nomination protocol to produce a
candidate value. Otherwise, use the value x
associated with the
ballot that produced nH
.
The counter
field n
of each ballot is selected as follows.
-
Initially,
n = 1
. -
If a node is still sending
SCPPrepare
orSCPConfirm
messages (meaning it has not yet output a value for a particular slot), then a node arms a timer whenevern
reaches quorum threshold of ballots (meaning the node is a member of a quorum in which each member is sending explicit or implicitSCPPrepare
messages with ballotcounter
>=n
). -
If the timer fires, a node increments
n
and determines a new value depending on the latest state of the nomination protocol andnH
, as discussed above. -
If a
counter
value greater thann
ever reaches a blocking threshold, then a node immediately disables any pending timer and increasesn
to the blockingcounter
value, recomputingvalue
in the usual way.
The value nC
is maintained based on an internally-maintained "commit
ballot" c
, where initially c = cNULL = <0, NULL>
(for some
arbitrary or invalid value NULL
).
-
If either
prepared
orpreparedPrime
hascounter
greater thanc
's and a differentvalue
, then resetc = cNULL
. -
If
c = cNULL
and the ballot determiningnH
hasn't been aborted byprepared
orpreparedPrime
, then setc
to the lowest ballot containing the value of that (nH
) ballot that is betweenballot
and thenH
ballot. -
If some ballot with the same value reaches quorum threshold betwen
c
and thenH
ballot, move to the confirm phase.
struct SCPConfirm
{
SCPBallot ballot; // b
uint32 nPrepared; // p.n
uint32 nCommit; // c.n
uint32 nH; // h.n
Hash quorumSetHash; // D
};
This message conveys an implicit SCPPrepare
message with the
following fields:
quorumSetHash
- sameballot
-<infinity, x>
wherex
is the value in theSCPConfirm message's
ballot`.prepared
- samepreparedPrime
-NULL
nC
- samenH
-infinity
(Note that infinity
here just represents 2^{32} -- an integer one
greater than the largest representable value on the wire.)
struct SCPExternalize
{
SCPBallot commit; // c
uint32 nH; // h.n
Hash commitQuorumSetHash; // D used before EXTERNALIZE
} externalize;
An SCPExternalize
message conveys two implicit SCPConfirm
messages. The first one has the following fields:
ballot
-<infinity,x>
wherex
is the value from theSCPExternalize
ballot.nPrepared
-infinity
nCommit
- thecounter
field from theSCPExternalize
messagescommit
field.nH
-infinity
quorumSetHash
- thecommitQuorumSetHash
from theSCPExternalize
message.
The second one has the same fields as the first, except for the last two:
nH
- thenH
value from theSCPExternalize
messagequorumSetHash
- a trivialSCPQuorumSet
that contains a single slice whose only member is the sender of theSCPExternalize
message
In order to provide full context for each signed message, all signed
messages are part of an SCPStatement
union type that includes the
slotIndex
naming the slot to which the message applies, as well as
the type
of the message. A signed message and its signature are
packed together in an SCPEnvelope
structure.
enum SCPStatementType
{
SCP_ST_PREPARE = 0,
SCP_ST_CONFIRM = 1,
SCP_ST_EXTERNALIZE = 2,
SCP_ST_NOMINATE = 3
};
struct SCPStatement
{
NodeID nodeID; // v (node signing message)
uint64 slotIndex; // i
union switch (SCPStatementType type)
{
case SCP_ST_PREPARE:
SCPPrepare prepare;
case SCP_ST_CONFIRM:
SCPConfirm confirm;
case SCP_ST_EXTERNALIZE:
SCPExternalize externalize;
case SCP_ST_NOMINATE:
SCPNomination nominate;
}
pledges;
};
struct SCPEnvelope
{
SCPStatement statement;
Signature signature;
};
If nodes do not pick quorum slices well, the protocol will not be safe.
The Stellar development foundation supported development of the protocol and produced the first production deployment of SCP. The IRTF DIN group including Dirk Kutscher, Sydney Li, Colin Man, Melinda Shore, and Jean-Luc Watson helped with the framing and motivation for this specification.
{{reference.scp.xml}}
{backmatter}