feat(validator): elections powered by etcd leases (#31)
* feat(validator): elections powered by etcd leases

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>

* docs: etcd leases, state and deployment diagram

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>

* fix(quorum): docker-compose network name typo

It was copy-pasted from the fabric docker-compose and so the network names
were incorrectly referencing the fabric network, not the quorum.

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>

* refactor(validator): change getter to method

The getter that was changed: isCurrentNodeLeader

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>

* refactor(validator): Object.assign -> line by line

TODO(peter.somogyvari): Once leader election
has enough test coverage, get rid of these property assignments
and just use the this.leaderNodeInfo.networkInfo
object directly everywhere.

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>

* fix(test/validator): isCurrentNodeLeader syntax

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>

* fix(examples,tools): use local npm package

The examples were hardcoding GitHub branch references, which are a
moving target and make it error-prone in certain situations to have
the intended code running for local tests/CI.
Solved it with a pre-install npm script that runs npm pack in the
root directory of the project, creating an installable tarball of the
source code that is checked out at that moment.
This ties the examples of any given revision to the exact BIF source
code they are committed with, guaranteeing that whoever runs
npm install inside the examples, wherever they do it, gets the
source code of that revision packaged up as an npm dependency.

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>

* fix(example/quorum): container resource usage

Reverting an attempted fix that was breaking the CI.
The attempted fix was to introduce resource constraints
on the quorum containers (CPU, RAM usage). This worked
in the sense that on a smaller dev laptop launching the
quorum network didn't cause OOM, but on the other hand
containers got stuck in 'unhealthy' status on CI so
overall the fix didn't pan out.

Also: Performance optimization is added by reducing logging
verbosity within the quorum containers and the geth syncmode
is being changed from 'full' to 'fast' (the default) which
does not verify each block one by one upon initialization.

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>

* fix(examples/corda): adds etcd to docker-compose

This was left out of the original commit, which only
updated fabric and quorum.

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>

* docs(tools): add docs for creating a local npm package

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
petermetz authored Dec 2, 2019
1 parent 4720b94 commit 61aab4a
Showing 19 changed files with 2,154 additions and 475 deletions.
1 change: 1 addition & 0 deletions .eslintignore
@@ -1,2 +1,3 @@
node_modules
test
.tmp/
1 change: 1 addition & 0 deletions .gitignore
@@ -32,3 +32,4 @@ typings/
examples/simple-asset-transfer/fabric/**/hfc-key-store/

bin/
.tmp/
8 changes: 8 additions & 0 deletions .npmignore
@@ -0,0 +1,8 @@
.github/
.tmp/
.vscode/
bin/
docs/
examples/
testes/
tools/
46 changes: 46 additions & 0 deletions docs/architecture/leader-election/deployment-diagram.puml
@@ -0,0 +1,46 @@
@startuml leader-election-etcd-leases-deployment-caption
!include <tupadr3/common>
!include <office/Servers/database_server>
!include <office/Servers/application_server>

center header
Hyperledger Blockchain Integration Framework

endheader

title
<u>Leader Election Deployment Diagram</u>

end title

center footer Hyperledger Blockchain Integration Framework, 2019

frame BIF {

frame Etcd_Cluster as ec {
OFF_DATABASE_SERVER(etcd1,"Etcd 1")
OFF_DATABASE_SERVER(etcd2,"Etcd 2")
OFF_DATABASE_SERVER(etcdn,"Etcd N")
}

frame Validator_Cluster as vc {
OFF_APPLICATION_SERVER(bvn1,"Validator 1")
OFF_APPLICATION_SERVER(bvn2,"Validator 2")
OFF_APPLICATION_SERVER(bvnn,"Validator N")
}

}

bvn2 <~~> bvnn
bvn1 <~> bvn2
bvn1 <~> bvnn

etcd1 <~~> etcd2
etcd1 <~> etcdn
etcd2 <~> etcdn

bvn1 <=[#blue]=> ec
bvn2 <=[#blue]=> ec
bvnn <=[#blue]=> ec

@enduml
22 changes: 22 additions & 0 deletions docs/architecture/leader-election/etcd-leases.md
@@ -0,0 +1,22 @@
# Leader Election via Etcd's Distributed Synchronization Primitives

## Summary

Validator nodes elect a leader among themselves by using the Etcd API's `watch` and `lease` features. BIF depends on Etcd API version 3; beyond that, it works with any Etcd cluster or single-node configuration, as long as the validators can reach it.

At startup, each node attempts to make itself the leader, which only succeeds if there is no current leader.

If a validator node fails to become leader, it starts acting as a follower.
The identity of the leader node is stored under a specific key in Etcd on which the leader holds the lease.

Followers watch the key that stores the leader's identity and react to changes as soon as they are detected. There is no polling that would waste network or compute resources.

The stability of the leader election rests on the Raft consensus algorithm that Etcd uses under the hood, which keeps the leader key consistent across the Etcd cluster.
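
The election flow described in this summary can be illustrated with a small simulation. This is only a sketch under stated assumptions: `FakeLeaseStore` is an in-memory stand-in written for this example, not the Etcd client API, and the key name `bif/leader`, the 5 second TTL, and the `ValidatorNode` class are hypothetical. Time is passed in explicitly so the example runs synchronously.

```javascript
// In-memory stand-in for a key-value store with leases. This is NOT the etcd
// client API; it only mimics "claim the key unless a live lease holds it".
const LEADER_KEY = 'bif/leader'; // hypothetical key name
const LEASE_TTL_MS = 5000; // hypothetical lease TTL

class FakeLeaseStore {
  constructor() {
    this.entries = new Map(); // key -> { value, expiresAt }
    this.watchers = new Map(); // key -> array of callbacks
  }

  // Claim `key` unless a holder with an unexpired lease exists.
  putIfAbsent(key, value, ttlMs, now) {
    const current = this.entries.get(key);
    if (current && current.expiresAt > now) {
      return false; // lease denied
    }
    this.entries.set(key, { value, expiresAt: now + ttlMs });
    this.notify(key, value);
    return true; // lease granted
  }

  // Remove the key once its lease has run out and tell the watchers.
  expire(key, now) {
    const current = this.entries.get(key);
    if (current && current.expiresAt <= now) {
      this.entries.delete(key);
      this.notify(key, null);
    }
  }

  watch(key, callback) {
    const callbacks = this.watchers.get(key) || [];
    callbacks.push(callback);
    this.watchers.set(key, callbacks);
  }

  notify(key, value) {
    (this.watchers.get(key) || []).forEach((callback) => callback(value));
  }
}

class ValidatorNode {
  constructor(id, store) {
    this.id = id;
    this.store = store;
    this.role = 'CANDIDATE';
    this.knownLeader = null;
    // Push-based updates: the node never polls the leader key.
    store.watch(LEADER_KEY, (leaderId) => {
      this.knownLeader = leaderId;
      if (leaderId === null) {
        this.role = 'CANDIDATE'; // leader lease expired, a new election is due
      }
    });
  }

  campaign(now) {
    const granted = this.store.putIfAbsent(LEADER_KEY, this.id, LEASE_TTL_MS, now);
    this.role = granted ? 'LEADER' : 'FOLLOWER';
    return this.role;
  }
}

const store = new FakeLeaseStore();
const nodes = ['validator1', 'validator2', 'validator3'].map((id) => new ValidatorNode(id, store));

// Round 1: every node campaigns at startup; only the first claim succeeds.
const firstRound = nodes.map((node) => node.campaign(0));

// validator1 stops renewing its lease (for example, it crashed); the key expires.
store.expire(LEADER_KEY, LEASE_TTL_MS);

// Round 2: the surviving nodes campaign again.
const secondRound = nodes.slice(1).map((node) => node.campaign(LEASE_TTL_MS));

console.log(firstRound, secondRound, nodes[2].knownLeader);
```

In this run, validator1 wins the first round, and after its lease expires validator2 takes over while validator3 learns the new leader through its watcher. A real deployment would get the atomic claim and the expiry from Etcd itself rather than from explicit `expire` calls.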

## Deployment Diagram

![leader-election-etcd-leases-deployment-caption](https://www.plantuml.com/plantuml/png/0/bPDHIyCm58NVyokkyqN1CbUVX6siSnqEn8cJlOYCBBtJOfgKD9qCqT_kJLPhf-Am5Cev-UxDIT8C2ikDBJC94dc29a29mgPQ1MX54f1PO14ac4kzoL3PGF3S3RE3L0bP9WXTM-OyCMTjeRDCgtvZHAzMgS3s3CqQJT5EkELBwhSelF47oVDSfeAxYMgO2PeU3JpvdEnoawEHc3oIDPHQF8iddYgO4FDeV2MC3S_mHPjdnb0bLHspgPN80Bfb_yfR45TBXb6zJ1YbdDfatNRPzzMmBViCiTBQVVuJuWJ2qyuvOojdm70oXbT6CROofirUNCYoS5rv0IXe5EYPZiUBKNGN3QDPl9Z5j_FuziYTJEUavMgWqph-amihBjp3gOgxzjpRLx8vboaTd3RDUEjclEZcvcfo4TrDfjUV7PThHG7hqfsKl-DX4m_tugg9rvdfTQsW-_xU1qSvsI7fLRYZ51shsySjxBUgDhPQUHqsTDMWTt-ub2K-zCWNrOm_FFNTOmFwZUYYVG00 "leader-election-etcd-leases-deployment-caption")

## Validator Node State Diagram

![state-diagram-validator](https://www.plantuml.com/plantuml/png/0/XL9DRy8m3BtdL_WsQLgrZziGGkm3j4re5v1sM7R86WCHgLsbPen_FxUbG4yxrNrvp-_PoRWIbsHRHD12CFF1hP8hiXyNWtV2oHW94X4i3RUZ6JgF2IOHSmbCCAyryDngXjVRaIMpP1RbM7hPbvWY-fN-FKRED_dQ1O9N4bHwev-g37USDbTmTtDxRypdvHTasQXkd0IzENmRxCcHhpDXXmxWdJs-pQ5Cd6DL6NEam00UHB0efG9Xu6zHQqlgIm9g7D5LQ8cNC97SmmRtffrj07DKJOUgs184kQY0Trfu95t7tamvHjxLz0yd-HfRXQLM0Xvr1KKWjOXDMzKVjMSfTQfJfoQJIYdeu1qCvuDtdCLY1lXRXeI7rF-nUexTe2shMOaQddEKj6ZoE-b5wUETTHyrfxevqXirPepazOtz0G00 "state-diagram-validator")
25 changes: 25 additions & 0 deletions docs/architecture/leader-election/state-diagram-validator.puml
@@ -0,0 +1,25 @@
@startuml state-diagram-validator

title \n<u>Leader Election State Diagram</u>\n
footer \nHyperledger Blockchain Integration Framework, 2019

[*] --> Started
Started --> Candidate
Started: NodeJS process

Candidate : Attempts to obtain\ngrant on lease\nof Etcd key
Leader: Sets Etcd key to\n its own identity
Follower: Watches Etcd\nkey to determine\nleader's identity

Candidate -> Follower: lease denied
Follower -> Candidate: lease TTL expire
Leader -> Candidate: lease TTL expire
Candidate -> Leader: lease granted

Candidate --> Terminated
Follower --> Terminated
Leader --> Terminated

Terminated --> [*]

@enduml
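
The state diagram above can also be read as a transition table. The sketch below is one possible encoding, not code from this commit: the event names (`start`, `leaseGranted`, `leaseDenied`, `leaseTtlExpired`, `shutdown`) are labels chosen for this example, since the diagram leaves the `Started` and `Terminated` arrows unlabeled.

```javascript
// Transition table derived from the PlantUML state diagram.
// Event names are hypothetical labels, not identifiers from the BIF code base.
const TRANSITIONS = {
  Started: { start: 'Candidate' },
  Candidate: { leaseGranted: 'Leader', leaseDenied: 'Follower', shutdown: 'Terminated' },
  Leader: { leaseTtlExpired: 'Candidate', shutdown: 'Terminated' },
  Follower: { leaseTtlExpired: 'Candidate', shutdown: 'Terminated' },
  Terminated: {},
};

// Returns the next state, or throws if the diagram has no such arrow.
function nextState(state, event) {
  const outgoing = TRANSITIONS[state];
  if (!outgoing || !Object.prototype.hasOwnProperty.call(outgoing, event)) {
    throw new Error(`No transition from "${state}" on "${event}"`);
  }
  return outgoing[event];
}

// Walk one plausible lifetime of a validator node.
const path = ['start', 'leaseDenied', 'leaseTtlExpired', 'leaseGranted', 'shutdown'];
const visited = path.reduce(
  (states, event) => states.concat(nextState(states[states.length - 1], event)),
  ['Started'],
);
console.log(visited.join(' -> '));
```

Keeping the arrows in a single table makes it easy to spot transitions the diagram does not allow, such as a `Leader` being granted a lease a second time.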
7 changes: 2 additions & 5 deletions examples/simple-asset-transfer/app.js
@@ -1,4 +1,4 @@
-const Validator = require(`@hyperledger-labs/blockchain-integration-framework`).Validator;
+const { Validator } = require(`@hyperledger-labs/blockchain-integration-framework`);
 const { genKeyFile } = require(`@hyperledger-labs/blockchain-integration-framework`).cryptoUtils;
 const ConnectorFabric = require(`./fabric/connector`);
 const ConnectorQuorum = require(`./quorum/connector`);
@@ -7,14 +7,11 @@ const ConnectorCorda = require(`./corda/connector`);
 (async () => {
   const keypair = await genKeyFile(`/federation/keypair`);
   const validatorOptions = {
+    etcdHosts: process.env.ETCD_HOSTS.split(','),
     clientRepAddr: process.env.CLIENT_REP_ADDR,
     pubAddr: process.env.PUB_ADDR,
     repAddr: process.env.REP_ADDR,
-    leaderPubAddr: process.env.LEAD_PUB_ADDR,
-    leaderRepAddr: process.env.LEAD_REP_ADDR,
-    leaderClientRepAddr: process.env.LEAD_CLIENT_REP_ADDR,
     dlType: process.env.DLT_TYPE,
-    type: process.env.TYPE,
     pubKey: keypair.pk,
     privKey: keypair.sk,
   };
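
One thing to note about `process.env.ETCD_HOSTS.split(',')`: it throws a `TypeError` when the variable is unset. A more defensive variant might look like the sketch below. The helper name `parseEtcdHosts` is hypothetical and not part of this commit; it assumes the hosts are given as `http` or `https` URLs, as in the docker-compose files.

```javascript
// Hypothetical helper: turns the ETCD_HOSTS environment value into a
// validated list of URLs, with a readable error instead of a TypeError.
function parseEtcdHosts(raw) {
  if (typeof raw !== 'string' || raw.trim() === '') {
    throw new Error('ETCD_HOSTS must be a non-empty, comma-separated list of URLs');
  }
  return raw
    .split(',')
    .map((host) => host.trim())
    .filter((host) => host.length > 0)
    .map((host) => {
      const url = new URL(host); // throws on malformed input
      if (url.protocol !== 'http:' && url.protocol !== 'https:') {
        throw new Error(`Unsupported protocol in ETCD_HOSTS entry: ${host}`);
      }
      return host;
    });
}

// Example with the value used in the docker-compose files of this commit.
const exampleHosts = parseEtcdHosts('http://etcd1:2379,http://etcd2:2379,http://etcd3:2379');
console.log(exampleHosts);
```

In `app.js` this would be called as `parseEtcdHosts(process.env.ETCD_HOSTS)`, so a missing variable fails at startup with an explicit message.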
83 changes: 66 additions & 17 deletions examples/simple-asset-transfer/federations/docker-compose-corda.yml
@@ -12,14 +12,14 @@ services:
corda_validator1:
image: "federation/validator"
environment:
ETCD_HOSTS: "http://etcd1:2379,http://etcd2:2379,http://etcd3:2379"
CLIENT_REP_ADDR: "tcp://192.21.0.2:7009"
PUB_ADDR: "tcp://192.21.0.2:3009"
REP_ADDR: "tcp://192.21.0.2:5009"
URL: "http://192.21.0.1:10051"
USER_NAME: "test"
PASSWORD: "A665A45920422F9D417E4867EFDC4FB8A04A1F3FFF1FA07E998E86F7F7A27AE3"
DLT_TYPE: "CORDA"
TYPE: "LEADER"
mem_limit: 6g
networks:
corda-network:
@@ -30,21 +30,22 @@ services:
- "7009:7009"
- "3009:3009"
- "5009:5009"

depends_on:
- etcd1
- etcd2
- etcd3

corda_validator2:
image: "federation/validator"
environment:
ETCD_HOSTS: "http://etcd1:2379,http://etcd2:2379,http://etcd3:2379"
CLIENT_REP_ADDR: "tcp://192.21.0.3:7010"
PUB_ADDR: "tcp://192.21.0.3:3010"
REP_ADDR: "tcp://192.21.0.3:5010"
LEAD_PUB_ADDR: "tcp://192.21.0.2:3009"
LEAD_REP_ADDR: "tcp://192.21.0.2:5009"
USER_NAME: "test"
PASSWORD: "A665A45920422F9D417E4867EFDC4FB8A04A1F3FFF1FA07E998E86F7F7A27AE3"
LEAD_CLIENT_REP_ADDR: "tcp://192.21.0.2:7009"
URL: "http://192.21.0.1:10052"
DLT_TYPE: "CORDA"
TYPE: "FOLLOWER"
mem_limit: 6g
expose:
- "10052"
@@ -56,21 +57,20 @@ services:
corda-network:
ipv4_address: 192.21.0.3
depends_on:
- "corda_validator1"
- etcd1
- etcd2
- etcd3

corda_validator3:
image: "federation/validator"
environment:
ETCD_HOSTS: "http://etcd1:2379,http://etcd2:2379,http://etcd3:2379"
CLIENT_REP_ADDR: "tcp://192.21.0.4:7011"
PUB_ADDR: "tcp://192.21.0.4:3011"
REP_ADDR: "tcp://192.21.0.4:5011"
LEAD_PUB_ADDR: "tcp://192.21.0.2:3009"
LEAD_REP_ADDR: "tcp://192.21.0.2:5009"
USER_NAME: "test"
PASSWORD: "A665A45920422F9D417E4867EFDC4FB8A04A1F3FFF1FA07E998E86F7F7A27AE3"
LEAD_CLIENT_REP_ADDR: "tcp://192.21.0.2:7009"
URL: "http://192.21.0.1:10053"
TYPE: "FOLLOWER"
DLT_TYPE: "CORDA"
mem_limit: 6g
expose:
@@ -83,22 +83,21 @@ services:
corda-network:
ipv4_address: 192.21.0.4
depends_on:
- "corda_validator1"
- etcd1
- etcd2
- etcd3

corda_validator4:
image: "federation/validator"
environment:
ETCD_HOSTS: "http://etcd1:2379,http://etcd2:2379,http://etcd3:2379"
CLIENT_REP_ADDR: "tcp://192.21.0.5:7012"
PUB_ADDR: "tcp://192.21.0.5:3012"
REP_ADDR: "tcp://192.21.0.5:5012"
LEAD_PUB_ADDR: "tcp://192.21.0.2:3009"
LEAD_REP_ADDR: "tcp://192.21.0.2:5009"
USER_NAME: "test"
PASSWORD: "A665A45920422F9D417E4867EFDC4FB8A04A1F3FFF1FA07E998E86F7F7A27AE3"
LEAD_CLIENT_REP_ADDR: "tcp://192.21.0.2:7009"
URL: "http://192.21.0.1:10054"
DLT_TYPE: "CORDA"
TYPE: "FOLLOWER"
mem_limit: 6g
expose:
- "10054"
@@ -110,4 +109,54 @@ services:
corda-network:
ipv4_address: 192.21.0.5
depends_on:
- "corda_validator1"
- etcd1
- etcd2
- etcd3

etcd1:
image: bitnami/etcd:3
environment:
- ALLOW_NONE_AUTHENTICATION=yes
- ETCD_NAME=etcd1
- ETCD_INITIAL_ADVERTISE_PEER_URLS=http://etcd1:2380
- ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
- ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
- ETCD_ADVERTISE_CLIENT_URLS=http://etcd1:2379
- ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
- ETCD_INITIAL_CLUSTER=etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380
- ETCD_INITIAL_CLUSTER_STATE=new
networks:
corda-network:
ipv4_address: 192.21.0.50

etcd2:
image: bitnami/etcd:3
environment:
- ALLOW_NONE_AUTHENTICATION=yes
- ETCD_NAME=etcd2
- ETCD_INITIAL_ADVERTISE_PEER_URLS=http://etcd2:2380
- ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
- ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
- ETCD_ADVERTISE_CLIENT_URLS=http://etcd2:2379
- ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
- ETCD_INITIAL_CLUSTER=etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380
- ETCD_INITIAL_CLUSTER_STATE=new
networks:
corda-network:
ipv4_address: 192.21.0.51

etcd3:
image: bitnami/etcd:3
environment:
- ALLOW_NONE_AUTHENTICATION=yes
- ETCD_NAME=etcd3
- ETCD_INITIAL_ADVERTISE_PEER_URLS=http://etcd3:2380
- ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
- ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
- ETCD_ADVERTISE_CLIENT_URLS=http://etcd3:2379
- ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
- ETCD_INITIAL_CLUSTER=etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380
- ETCD_INITIAL_CLUSTER_STATE=new
networks:
corda-network:
ipv4_address: 192.21.0.52
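
The three etcd services above repeat the same `ETCD_INITIAL_CLUSTER` value, and the choice of three members follows from Raft's majority rule. The sketch below shows how that string and the quorum arithmetic could be derived. The function names are hypothetical, and the snippet is an illustration rather than part of the commit.

```javascript
// Builds the ETCD_INITIAL_CLUSTER value from member names, assuming each
// member is reachable under its own name on the peer port (2380 here).
function buildInitialCluster(memberNames, peerPort = 2380) {
  return memberNames.map((name) => `${name}=http://${name}:${peerPort}`).join(',');
}

// Raft needs a strict majority of members to make progress.
function quorumSize(memberCount) {
  return Math.floor(memberCount / 2) + 1;
}

// Number of members that can fail while the cluster stays available.
function faultTolerance(memberCount) {
  return memberCount - quorumSize(memberCount);
}

const members = ['etcd1', 'etcd2', 'etcd3'];
console.log(buildInitialCluster(members));
console.log(`quorum: ${quorumSize(members.length)}, tolerated failures: ${faultTolerance(members.length)}`);
```

With three members the quorum is two, so the cluster keeps serving the leader key if one etcd container goes down. A fourth member would raise the quorum to three without tolerating any additional failure.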