Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KISS Peer discovery, KISS Bootstrapping, Nodes auto-stake/unstake via K8S cluster-manager, Node Finite state machine - Issues #416, #490, #429, #498, #499 #491

Closed
wants to merge 233 commits into from

Conversation

deblasis
Copy link
Contributor

@deblasis deblasis commented Feb 6, 2023

Description

This PR aims at implementing a KISS implementation of Peer discovery #416

TL;DR: I believe that:
This + State sync = DevNet 🚀

I still need to add the endpoints for accessing the addressbooks but this PR in the meantime shows the auto-Staking/Unstaking functionality.

It works by dogfooding our own CLI which also works as a way for testing the functionality and the "feel" of the tooling we are building.

Areas of improvement

  • Since we can scale up to 999 validators, I had to consider their accounts in genesis and load them up with POKT, that means that our genesis files are quite big, not an issue per se but I don't particularly love it.
  • It would be nice to have ways to interact with the cluster-manager from the debug CLI, for example to turn it on/off dynamically and also to scale up and down without the need for updating the localnet-config.yaml

Issue

Fixes #416
Fixes #490
Fixes #429
Fixes #498
Fixes #499

Type of change

Please mark the relevant option(s):

  • New feature, functionality or library
  • Bug fix
  • Code health or cleanup
  • Major breaking change
  • Documentation
  • Other

List of changes

CLI

  • Added not_interactive flag to allow for non-interactive Stake and Unstake transactions (dogfooding in cluster-manager)
  • Updated CLI to use to source the address book and the current height from the RPC server leveraging the rpcAddressBookProvider and rpcCurrentHeightProvider respectively and the bus for dependency injection

Node binary

  • Updated to handle bootstrap-nodes flag that can be used to override the default bootstrap nodes

Infra

  • Updated genesis to include accounts for all the validators that we can use in LocalNet
  • Updated docker-compose to name the deployment as pocket-v1 instead of deployments (default is the containing folder name)
  • Introduced the cluster-manager that is a standalone microservice in the K8S LocalNet that takes care of (for now) staking/unstaking automatically nodes that are added/removed from the deployment
  • Updated manifests and K8S resources to reflect the new cluster-manager addition
  • In K8S LocalNet, the cli-client now waits for the v1-validator001 since its required for address book sourcing

Consensus

  • Modules embed base_modules.IntegratableModule and base_modules.InterruptableModule for DRYness
  • Updated modules Create to accept generic options
  • resetToGenesis clears the utility mempool as well
  • Publishing ConsensusNewHeightEvent on new height

Logger

  • Module embeds base_modules.IntegratableModule and base_modules.InterruptableModule for DRYness

P2P

  • Modules embed base_modules.IntegratableModule and base_modules.InterruptableModule for DRYness
  • Deprecated debugAddressBookProvider
  • Added rpcAddressBookProvider to source the address book from the RPC server
  • Leveraging bus for dependency injection of the addressBookProvider and currentHeightProvider
  • Deprecated debugCurrentHeightProvider
  • Added rpcCurrentHeightProvider to source the current height from the RPC server
  • Fixed raintree to use the currentHeightProvider instead of consensus (that was what we wanted to avoid in the first place)
  • Added getAddrBookDelta to calculate changes to the address book between heights and update the internal state and componentry accordingly
  • Added basic bootstrap nodes support
  • Reacting to ConsensusNewHeightEventType adn StateMachineTransitionEventType to update the address book and current height and determine if a bootstrap is needed
  • Updated tests

Persistence

  • Module now embeds base_modules.IntegratableModule for DRYness

RPC

  • Updated RPC to expose the node's address book via GET /v1/p2p/address_book
  • Updated modules to embed base_modules.IntegratableModule and base_modules.InterruptableModule for DRYness

Runtime

  • Added bootstrap_nodes_csv in P2PConfig to allow for a comma separated list of bootstrap nodes
  • Introduced modules.ModulesRegistry for better separation of concerns
  • Added StateMachineModule accessors
  • Manager embeds base_modules.IntegratableModule for DRYness

Shared

  • Added UnmarshalText to Ed25519PrivateKey
  • Added events ConsensusNewHeightEvent and StateMachineTransitionEvent
  • Introduced BaseInterruptableModule and BaseIntegratableModule to reduce repetition and boilerpate code (DRYness)
  • Added ModulesRegistry and StateMachineModule accessors and interfaces
  • Introduced generic ModuleOption pattern to fine tune modules behaviour
  • Added StateMachine to the node initialization

State Machine 🆕

  • Introduced this CHANGELOG.md and README.md
  • Added StateMachineModule implementation with a POC of the finite state machine that will be used to manage the node lifecycle
  • Added StateMachine diagram generator (linked in README.md)
  • Integrated the StateMachine with the bus to propagate StateMachineTransitionEvent events whenever they occur

Telemetry

  • Modules embed base_modules.IntegratableModule and base_modules.InterruptableModule for DRYness

Utility

  • Module embeds base_modules.IntegratableModule and base_modules.InterruptableModule for DRYness
  • Logging error if ApplyTransaction fails (it was completely ignored before and it was really hard to understand what was going on)

Testing

  • make develop_test
  • LocalNet w/ all of the steps outlined in the README

Required Checklist

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have tested my changes using the available tooling
  • I have updated the corresponding CHANGELOG

If Applicable Checklist

  • I have updated the corresponding README(s); local and/or global
  • I have added tests that prove my fix is effective or that my feature works
  • I have added, or updated, mermaid.js diagrams in the corresponding README(s)
  • I have added, or updated, documentation and mermaid.js diagrams in shared/docs/* if I updated shared/*README(s)

…-discovery-k8s-base354

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
Copy link
Member

@Olshansk Olshansk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took another stab and ~75% done + replied to most of the responses. I think if we just split this into 2 PRs: FSM and k8s client, we'll be good to go and the reviews should be much faster given how many back & forth we've had here.

I have to say, for a PR that is trying to KISS, these screenshot speaks for itself ;)

Screenshot 2023-02-15 at 9 44 09 PM

Screenshot 2023-02-15 at 9 41 16 PM

app/client/cli/debug.go Show resolved Hide resolved
build/deployments/docker-compose.yaml Show resolved Hide resolved
app/client/keybase/debug/keystore.go Outdated Show resolved Hide resolved
build/localnet/cluster-manager/main.go Show resolved Hide resolved
build/localnet/cluster-manager/main.go Show resolved Hide resolved
shared/modules/state_machine_module.go Outdated Show resolved Hide resolved
shared/modules/module.go Show resolved Hide resolved
state_machine/fsm.go Outdated Show resolved Hide resolved
state_machine/docs/state-machine.diagram.md Show resolved Hide resolved
shared/crypto/ed25519.go Show resolved Hide resolved
…-discovery-k8s-base354

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
@deblasis
Copy link
Contributor Author

Took another stab and ~75% done + replied to most of the responses. I think if we just split this into 2 PRs: FSM and k8s client, we'll be good to go and the reviews should be much faster given how many back & forth we've had here.

I have to say, for a PR that is trying to KISS, these screenshot speaks for itself ;)

I am sorry this is frustrating. I should have taken action and split it when it was code complete even if you said it was OK.

I am going to explain in protocol hour my reasoning for this.

I have now merged in the bugfixes I worked on yesterday and I am finally going to proceed with the splitting.
I will address your comments in the relative PR and backlink to help future readers.

deblasis added a commit that referenced this pull request Feb 17, 2023
…#520)

## Description

This PR has been extracted from #491 and is, hopefully, more digestible
from a code-review and scope point of view.

In a nutshell: 
- It introduces the a Finite State Machine that is meant to govern the
internal state, transitions and events
- It includes a refactoring of our module initialization pattern using
functional options
(https://dave.cheney.net/2014/10/17/functional-options-for-friendly-apis)
- It improves the module registration with a first class... you might
have guessed it: `ModulesRegistry`
- It reduces boilerplate code making our modules more DRY with the
introduction of `base_modules` that provide basic functionality (that
can still be customized/extended when needed)

## Issue

Fixes #499 

## Type of change

Please mark the relevant option(s):

- [x] New feature, functionality or library
- [ ] Bug fix
- [x] Code health or cleanup
- [x] Major breaking change
- [ ] Documentation
- [ ] Other <!-- add details here if it a different type of change -->

## List of changes

In a nutshell: 
- It introduces the a Finite State Machine that is meant to govern the
internal state, transitions and events
- It includes a refactoring of our module initialization pattern using
functional options
(https://dave.cheney.net/2014/10/17/functional-options-for-friendly-apis)
- It improves the module registration with a first class... you might
have guessed it: `ModulesRegistry`
- It reduces boilerplate code making our modules more DRY with the
introduction of `base_modules` that provide basic functionality (that
can still be customized/extended when needed)

## Testing

- [x] `make develop_test`
- [x]
[LocalNet](https://github.com/pokt-network/pocket/blob/main/docs/development/README.md)
w/ all of the steps outlined in the `README`


## Required Checklist

- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have tested my changes using the available tooling
- [x] I have updated the corresponding CHANGELOG

### If Applicable Checklist

- [ ] I have updated the corresponding README(s); local and/or global
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added, or updated,
[mermaid.js](https://mermaid-js.github.io) diagrams in the corresponding
README(s)
- [ ] I have added, or updated, documentation and
[mermaid.js](https://mermaid-js.github.io) diagrams in `shared/docs/*`
if I updated `shared/*`README(s)

---------

Co-authored-by: Daniel Olshansky <olshansky@pokt.network>
@deblasis deblasis closed this Feb 17, 2023
deblasis added a commit that referenced this pull request Feb 17, 2023
…429) (#521)

## Description

This PR has been extracted from
#491 and is, hopefully, more
digestible from a code-review and scope point of view.

The main goal is to remove hardcoded nodes and move towards a more
dynamic environment.

It's also highlighting the potential entry points for subsequent P2P
work

The code leverages the abstractions added recently
(`currentHeightProvider` and `addressBookProvider`) to fetch the data
from an RPC endpoint.

## Issue

Fixes #416
Fixes #429

## Type of change

Please mark the relevant option(s):

- [x] New feature, functionality or library
- [ ] Bug fix
- [ ] Code health or cleanup
- [x] Major breaking change
- [ ] Documentation
- [ ] Other <!-- add details here if it a different type of change -->

## List of changes

### CLI

- Updated CLI to use to source the address book and the current height
from the RPC server leveraging the `rpcAddressBookProvider` and
`rpcCurrentHeightProvider` respectively and the `bus` for dependency
injection

### P2P

- Modules embed `base_modules.IntegratableModule` and
`base_modules.InterruptableModule` for DRYness
- Deprecated `debugAddressBookProvider`
- Added `rpcAddressBookProvider` to source the address book from the RPC
server
- Leveraging `bus` for dependency injection of the `addressBookProvider`
and `currentHeightProvider`
- Deprecated `debugCurrentHeightProvider`
- Added `rpcCurrentHeightProvider` to source the current height from the
RPC server
- Fixed raintree to use the `currentHeightProvider` instead of consensus
(that was what we wanted to avoid in the first place)
- Added `getAddrBookDelta` to calculate changes to the address book
between heights and update the internal state and componentry
accordingly
- Reacting to `ConsensusNewHeightEventType` to update the address book 
- Updated tests

### RPC

- Updated RPC to expose the node's address book via GET
/v1/p2p/staked_actors_address_book


## Testing

- [x] `make develop_test`
- [x]
[LocalNet](https://github.com/pokt-network/pocket/blob/main/docs/development/README.md)
w/ all of the steps outlined in the `README`

<!-- REMOVE this comment block after following the instructions
 If you added additional tests or infrastructure, describe it here.
 Bonus points for images and videos or gifs.
-->

## Required Checklist

- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have tested my changes using the available tooling
- [x] I have updated the corresponding CHANGELOG

### If Applicable Checklist

- [ ] I have updated the corresponding README(s); local and/or global
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added, or updated,
[mermaid.js](https://mermaid-js.github.io) diagrams in the corresponding
README(s)
- [ ] I have added, or updated, documentation and
[mermaid.js](https://mermaid-js.github.io) diagrams in `shared/docs/*`
if I updated `shared/*`README(s)

---------

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
Co-authored-by: Dmitry K <okdas@pm.me>
Co-authored-by: Dmitry Knyazev <okdas@users.noreply.github.com>
Co-authored-by: Daniel Olshansky <olshansky@pokt.network>
Co-authored-by: Daniel Olshansky <olshansky.daniel@gmail.com>
deblasis added a commit that referenced this pull request Feb 17, 2023
… (#522)

## Description

This PR has been extracted from #491 and is, hopefully, more digestible
from a code-review and scope point of view.

## Issue

Fixes #490 

## Type of change

Please mark the relevant option(s):

- [x] New feature, functionality or library
- [ ] Bug fix
- [ ] Code health or cleanup
- [ ] Major breaking change
- [ ] Documentation
- [ ] Other <!-- add details here if it a different type of change -->

## List of changes

- When nodes are added/removed from the Kubernetes Localnet, we
stake/unstake them automatically
- We achieve the above by dogfooding our own CLI inside Kubernetes

## Testing

- [x] `make develop_test`
- [x]
[LocalNet](https://github.com/pokt-network/pocket/blob/main/docs/development/README.md)
w/ all of the steps outlined in the `README`


## Required Checklist

- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have tested my changes using the available tooling
- [x] I have updated the corresponding CHANGELOG

### If Applicable Checklist

- [ ] I have updated the corresponding README(s); local and/or global
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added, or updated,
[mermaid.js](https://mermaid-js.github.io) diagrams in the corresponding
README(s)
- [ ] I have added, or updated, documentation and
[mermaid.js](https://mermaid-js.github.io) diagrams in `shared/docs/*`
if I updated `shared/*`README(s)

---------

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
Co-authored-by: Dmitry Knyazev <okdas@users.noreply.github.com>
Co-authored-by: Daniel Olshansky <olshansky@pokt.network>
Co-authored-by: Dmitry K <okdas@pm.me>
Co-authored-by: Daniel Olshansky <olshansky.daniel@gmail.com>
deblasis added a commit that referenced this pull request Feb 20, 2023
…ue: #498) (#523)

## Description

This PR has been extracted from #491 and is, hopefully, more digestible
from a code-review and scope point of view.

This simply adds entrypoints and basic logic for node bootstrapping
leveraging the finite state machine

## Issue

Fixes #498 

## Type of change

Please mark the relevant option(s):

- [x] New feature, functionality or library
- [ ] Bug fix
- [ ] Code health or cleanup
- [ ] Major breaking change
- [ ] Documentation
- [ ] Other <!-- add details here if it a different type of change -->

## List of changes

- Ability to fetch an addressbook from a bootstrap node(s) (by
convention the first validator in LocalNet)
- It's possible to override bootstrap nodes via CLI flag

## Testing

- [x] `make develop_test`
- [x]
[LocalNet](https://github.com/pokt-network/pocket/blob/main/docs/development/README.md)
w/ all of the steps outlined in the `README`

## Required Checklist

- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have tested my changes using the available tooling
- [x] I have updated the corresponding CHANGELOG

### If Applicable Checklist

- [ ] I have updated the corresponding README(s); local and/or global
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added, or updated,
[mermaid.js](https://mermaid-js.github.io) diagrams in the corresponding
README(s)
- [ ] I have added, or updated, documentation and
[mermaid.js](https://mermaid-js.github.io) diagrams in `shared/docs/*`
if I updated `shared/*`README(s)

---------

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
Co-authored-by: Dmitry K <okdas@pm.me>
Co-authored-by: Dmitry Knyazev <okdas@users.noreply.github.com>
Co-authored-by: Daniel Olshansky <olshansky@pokt.network>
Co-authored-by: Daniel Olshansky <olshansky.daniel@gmail.com>
@Olshansk Olshansk deleted the issue/416-kiss-peer-discovery-k8s-base354 branch June 2, 2023 21:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code health Nice to have code improvement infra Core infrastructure - not protocol related p2p P2P specific changes
Projects
Status: Done
4 participants