- Status
- What is MinBFT
- Why MinBFT
- Concepts
- Quick Start
- Requirements
- Getting Started
- Roadmap
- Contributing
- License
This project is in experimental development stage. It is not suitable for any kind of production use. Interfaces and implementation may change significantly during the life cycle of the project.
MinBFT is a pluggable software component that allows to achieve Byzantine fault-tolerant consensus with fewer consenting nodes and less communication rounds comparing to the conventional BFT protocols. The component is implemented based on Efficient Byzantine Fault-Tolerance paper, a BFT protocol that leverages secure hardware capabilities of the participant nodes.
This project is an implementation of MinBFT consensus protocol. The code is written in Golang, except the TEE part which is written in C, as an Intel® SGX enclave.
Byzantine fault-tolerant (BFT) protocols are able to achieve high transaction throughput in permission-based consensus networks with static configuration of connected nodes. However, existing components such as Practical Byzantine Fault Tolerance (PBFT) consensus plugin still incur high communication cost. Given the pervasive Trusted Execution Environments (TEE) on commodity platforms, we see a strong motivation to deploy more efficient BFT protocols that leverage the TEE applications as trust anchors even on faulty nodes.
TEEs are pervasive nowadays and supported by many commodity platforms.
For instance, Intel® SGX is being deployed on many PCs and servers,
while mobile platforms are mostly supported by Arm Trustzone. TEEs
rely on secure hardware to provide data protection and code isolation
from a compromised hosting system. Here, we propose such a consensus
component that implements MinBFT, a BFT protocol that leverages TEEs
to prevent message equivocation, and thus reducing the required number
of consenting nodes as well as communication rounds given the same
fault-tolerance threshold. More specifically, it requires only 2f+1
consenting nodes in order to tolerate f
faulty nodes (or tolerate up
to half of faulty nodes); meanwhile, committing a message requires
only 2 rounds of communication among the nodes instead of 3 rounds as
in PBFT.
We hope that by evaluating this consensus component under the existing blockchain frameworks, the community will benefit from availability to leverage it in different practical use cases.
The consensus process in MinBFT is very similar to PBFT. Consenting
nodes (i.e., nodes who vote for ordering the transactions) compose a
fully-connected network. There is a leader (often referred to as
primary node) among the nodes who first prepares an incoming request
message to the other nodes by suggesting a sequence number for the
request in broadcasted PREPARE
messages. The other nodes verify the
PREPARE
messages and subsequently broadcast a COMMIT
message in
the network. Finally, the nodes who have received f+1
(where f
is
fault-tolerance, a number of tolerated faulty nodes) consistent
COMMIT
messages execute the request locally and update the
underlying service state. If the leader is perceived as faulty, a view
change procedure follows to change the leader node.
Note that the signatures for each PREPARE
and COMMIT
messages are
generated by USIG (Unique Sequential Identifier Generator) service,
the tamper-proof part of the consenting nodes. The sequence numbers
are also assigned to the messages by USIG with the help of a unique,
monotonic, and sequential counter protected by TEE. The signature,
also known as USIG certificate, certifies the counter assigned to a
particular message. The USIG certificate combined with the counter
value comprises a UI (unique identifier) of the message. Since the
monotonic counter prevents a faulty node from sending conflicting
messages to different nodes and provides correct information during
view changes, MinBFT requires less communication rounds and total
number of consenting nodes than PBFT.
For more detailed description of the protocol, refer to Efficient Byzantine Fault-Tolerance paper.
This quick start shows you how to run this project using Docker containers. This is the easiest way to try out this project with minimal setup. If you want to build and run without Docker, skip this section.
To run the containers, the following software must be installed on your system.
- Docker Engine (tested with version 19.03.8)
- Docker Compose (tested with version 1.25.0)
If you are using Ubuntu 20.04, they can be installed as follows:
sudo apt-get install docker.io docker-compose
Note that SGX-enabled CPU is not required to run the containers; they will run in the simulation mode provided by Intel SGX SDK. We plan to have another container image which runs in the HW mode (i.e. using "real" hardware features) in a future release.
Build an image of the containers as follows:
sudo docker build -f sample/docker/Dockerfile -t minbft .
Note that, by defaut, the docker
command needs to be executed by
root
. Refer to Docker's
document
for details.
If your system has the make
command, the following command also can be
used.
sudo make docker
To start up an example consensus network of replicas, invoke the following commands:
sudo UID=$UID docker-compose -f sample/docker/docker-compose.yml up -d
This will start the replica nodes as 3 separate containers in background.
Requests can be submitted for ordering and execution to the example consensus network as follows:
sudo docker-compose -f sample/docker/docker-compose.yml \
run client request "First request" "Second request" "Another request"
This command should produce the following output showing the result of ordering and execution of the submitted requests:
Reply: {"Height":1,"PrevBlockHash":null,"Payload":"Rmlyc3QgcmVxdWVzdA=="}
Reply: {"Height":2,"PrevBlockHash":"DuAGbE1hVQCvgi+R0E5zWaKSlVYFEo3CjlRj9Eik5h4=","Payload":"U2Vjb25kIHJlcXVlc3Q="}
Reply: {"Height":3,"PrevBlockHash":"963Kn659GbtX35MZYzguEwSH1UvF2cRYo6lNpIyuCUE=","Payload":"QW5vdGhlciByZXF1ZXN0"}
The output shows the submitted requests being ordered and executed by a sample blockchain service. The service executes request by simply appending a new block for each request to the trivial blockchain maintained by the service.
The following command can be used to terminate running containers:
sudo docker-compose -f sample/docker/docker-compose.yml down
The containers create some files while running. These files can be deleted as follows.
rm -f sample/docker/keys.yaml sample/docker/.keys.yaml.lock
The Docker image can be deleted as follow.
sudo docker rmi minbft
If your system has the make
command, the following command also can be
used.
sudo make docker-clean
The project has been tested on Ubuntu 18.04 and 20.04 LTS. Additional required packages can be installed as follows:
sudo apt-get install build-essential pkg-config
Go 1.13 or later is required to build this project (tested against 1.13, 1.15 and 1.16). Official installation instructions can be found here.
The Intel® SGX enclave implementation has been tested with Intel® SGX SDK for Linux version 2.12. For installation instuctions please visit download page. Please note that Intel SGX has two operation modes and required software components depend on operation mode.
- If you run in HW mode, you have to install all three components: SGX driver, PSW, and SGX SDK.
- If you run in simulation mode, only SGX SDK is required.
A conventional directory to install
the SDK is /opt/intel/
. Please do not forget to source
/opt/intel/sgxsdk/environment
file in your shell. Alternatively, one
can add the following line to ~/.profile
:
. /opt/intel/sgxsdk/environment
If you run in simlation mode, you need create/update the link to the additional directory of shared libraries with following commands:
sudo bash -c "echo /opt/intel/sgxsdk/sdk_libs > /etc/ld.so.conf.d/sgx-sdk.conf"
sudo ldconfig
When using a machine with no SGX support, only SGX simulation mode is
supported. In that case, please be sure to export the following
environment variable, e.g. by modifying ~/.profile
file:
export SGX_MODE=SIM
This is a Go module and can be placed anywhere; no need to be in
GOPATH. If this is placed in GOPATH and you are using Go 1.11 or 1.12,
please make sure that the environment variable GO111MODULE=on
has
set to activate module mode.
All following commands are supposed to be run in the root of the module's source tree.
The project can be build by issuing the following command. At the
moment, the binaries are installed in sample/bin/
directory; no root
privileges are needed.
make install
Running the example requires some set up. Please make sure the project
has been successfully built and sample/bin/keytool
and
sample/bin/peer
binaries were produced. Those binaries can be
supplied with options through a configuration file, environment
variables, or command-line arguments. More information about available
options can be queried by invoking those binaries with help
argument. Sample configuration files can be found in
sample/authentication/keytool/
and sample/peer/
directories
respectively.
Before running the example, the environment variable $LD_LIBRARY_PATH
needs to include sample/lib
where libusig_shim.so
is installed by
make install
.
export LD_LIBRARY_PATH="${PWD}/sample/lib:${LD_LIBRARY_PATH}"
The following command are to be run from sample
directory.
cd sample
Sample key set file for testing can be generate by using keytool
command. This command produces a key set file suitable for running the
example on a local machine:
bin/keytool generate -u lib/libusig.signed.so
This invocation will create a sample key set file named keys.yaml
containing 3 key pairs for replicas and 1 key pair for a client by
default.
Consensus options can be set up by means of a configuration file. A sample consensus configuration file can be used as an example:
cp config/consensus.yaml ./
Peer configuration can be supplied in a configuration file. Selected
options can be modified through command line arguments of peer
binary. A sample configuration can be used as an example:
cp peer/peer.yaml ./
To start up an example consensus network of replicas on a local machine, invoke the following commands:
bin/peer run 0 &
bin/peer run 1 &
bin/peer run 2 &
This will start the replica nodes as 3 separate OS processes in background using the configuration files prepared in previous steps.
Requests can be submitted for ordering and execution to the example
consensus network using the same peer
binary and configuration files
for convenience. It is better to issue the following commands in
another terminal so that the output messages do not intermix:
bin/peer request "First request" "Second request" "Another request"
This command should produce the following output showing the result of ordering and execution of the submitted requests:
Reply: {"Height":1,"PrevBlockHash":null,"Payload":"Rmlyc3QgcmVxdWVzdA=="}
Reply: {"Height":2,"PrevBlockHash":"DuAGbE1hVQCvgi+R0E5zWaKSlVYFEo3CjlRj9Eik5h4=","Payload":"U2Vjb25kIHJlcXVlc3Q="}
Reply: {"Height":3,"PrevBlockHash":"963Kn659GbtX35MZYzguEwSH1UvF2cRYo6lNpIyuCUE=","Payload":"QW5vdGhlciByZXF1ZXN0"}
The output shows the submitted requests being ordered and executed by a sample blockchain service. The service executes request by simply appending a new block for each request to the trivial blockchain maintained by the service.
The following command can be used to terminate running replica processes and release the occupied TCP ports:
killall peer
The above example shows a simple normal case of the consensus network. Our next interest is how the system behaves when some replicas are faulty.
The simplest faulty case is crash fault on backup replicas. Note that we don't tolerate any type of fault on primary replica until view change operation is implemented.
Let's restart the network, and note the process IDs of each replica process.
$ bin/peer run 0 &
[1] 16899
$ bin/peer run 1 &
[2] 16916
$ bin/peer run 2 &
[3] 16923
Make sure that all replicas are properly working by sending a request:
$ bin/peer request First request
Reply: {"Height":1,"PrevBlockHash":null,"Payload":"Rmlyc3QgcmVxdWVzdA=="}
Now kill replica 1 and send another request:
$ kill 16916
$ bin/peer request Second request
Reply: {"Height":2,"PrevBlockHash":"DuAGbE1hVQCvgi+R0E5zWaKSlVYFEo3CjlRj9Eik5h4=","Payload":"U2Vjb25kIHJlcXVlc3Q="}
OK, we still get the reply messages with successfully agreed response. Next, kill another backup replica and send another request:
$ kill 16923
$ bin/peer request Another request
(no response)
We fail to reach consensus and get no response because more than
f
replicas are faulty.
The code divided into core consensus protocol implementation and sample implementation of external components required to interact with the core. The following directories contain the code:
api/
- definition of API between core and external componentsclient/
- implementation of client-side part of the protocolcore/
- implementation of core consensus protocolusig/
- implementation of USIG, tamper-proof componentmessages/
- definition of the protocol messagessample/
- sample implementation of external interfacesauthentication/
- generation and verification of authentication tagskeytool/
- tool to generate sample key set file
conn/
- network connectivityconfig/
- consensus configuration providerrequestconsumer/
- service executing ordered requestspeer/
- CLI application to run a replica/client instance
The following features of MinBFT protocol has been implemented:
- Normal case operation: minimal ordering and execution of requests as long as primary replica is not faulty
- SGX USIG: implementation of USIG service as Intel® SGX enclave
The following features are considered to be implemented:
- View change operation: provide liveness in case of faulty primary replica
- Garbage collection and checkpoints: generation and handling of
CHECKPOINT
messages, log pruning, high and low water marks - USIG enclave attestation: support to remotely attest USIG Intel® SGX enclave
- Faulty node recovery: support to retrieve missing log entries and synchronize service state from other replicas
- Request batching: reducing latency and increasing throughput by combining outstanding requests for later processing
- Asynchronous requests: enabling parallel processing of requests
- MAC authentication: using MAC in place of digital signature in USIG to reduce message size
- Read-only requests: optimized processing of read-only requests
- Speculative request execution: reducing processing delay by tentatively executing requests
- Documentation improvement: comprehensive documentation
- Testing improvement: comprehensive unit- and integration tests
- Benchmarks: measuring performance
Everyone is welcome to contribute! There are many ways to make useful contribution. Please look at Contribution Guideline for more details.
Source code files are licensed under the Apache License, Version 2.0.
Documentation files are licensed under the Creative Commons Attribution 4.0 International License.