[Infra] K8S Cluster manager #490

Closed
14 tasks
deblasis opened this issue Feb 6, 2023 · 1 comment · Fixed by #522
Labels: infra (Core infrastructure - not protocol related), p2p (P2P specific changes)

Comments

@deblasis
Contributor

deblasis commented Feb 6, 2023

Objective

Now that we have the ability to scale up and down our clusters of nodes in Localnet, via #354, we need to be able to stake/unstake them dynamically and with the minimum amount of friction possible for the developer.

Ideally, everything should be manageable using the available tooling.

This is a foundational piece of work that will also unlock M2 related tasks.

Origin Document

While developing #416 I had to figure out a way to spin up and down new nodes using the previous/current docker-compose based infra and also stake/unstake them so that they could join consensus.
It was starting to become quite convoluted, with lots of commands and manual work even for simple operations like "adding a new node".
I had a couple of false starts but it was all needed to map out the required changes to have a functional DevNet.

#354 allows us to leverage Kubernetes and its scheduling capabilities to achieve this.
The integration is possible thanks to the k8s.io/client-go library.

Goals

  • Develop a cluster-manager/orchestrator/operator (naming TBD, it's not my forte... I went with cluster-manager for now in my WIP PR) capable of reacting to K8S events related to validators being added to or removed from the deployment
  • Stake / Unstake the nodes automatically as they come online/go offline by dogfooding the existing CLI/RPC
  • [bonus] Expose the required RPC endpoints so that the Debug CLI can control the number of validators without having to edit files manually (could be a separate issue)

Deliverable

  • cluster-manager implementation that reacts to the specific K8S events via k8s.io/client-go
  • Ability to read the private keys of the validators so that it's possible to construct the CLI commands for staking / unstaking
  • Integration with our own CLI/RPC to send Stake and Unstake transactions
  • Updated Tiltfile and K8S manifests for handling the new binary(es)
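
The event-to-transaction mapping described in the deliverables can be sketched as follows. This is only an illustration using the standard library: `EventType`, `ValidatorEvent`, `Staker`, `HandleEvent` and `recordingStaker` are hypothetical names, not types from k8s.io/client-go or from the Pocket codebase. In a real cluster-manager the events would presumably come from a k8s.io/client-go watch, and the `Staker` would wrap the existing CLI/RPC.

```go
package main

import (
	"errors"
	"fmt"
)

// EventType mirrors the added/deleted kinds a Kubernetes watch reports.
// It is a local stand-in, not the client-go type.
type EventType string

const (
	Added   EventType = "ADDED"
	Deleted EventType = "DELETED"
)

// ValidatorEvent is a hypothetical, simplified view of a watch event.
type ValidatorEvent struct {
	Type EventType
	Name string // identifier of the validator that came online or went offline
}

// Staker abstracts whatever submits the transactions (the issue proposes
// dogfooding the existing CLI/RPC). Hypothetical interface.
type Staker interface {
	Stake(validator string) error
	Unstake(validator string) error
}

// HandleEvent maps one event to at most one staking action.
func HandleEvent(s Staker, ev ValidatorEvent) error {
	if ev.Name == "" {
		return errors.New("event has no validator name")
	}
	switch ev.Type {
	case Added:
		return s.Stake(ev.Name)
	case Deleted:
		return s.Unstake(ev.Name)
	default:
		// Other event kinds are ignored in this sketch.
		return nil
	}
}

// recordingStaker is an in-memory fake that records the calls it receives.
type recordingStaker struct{ calls []string }

func (r *recordingStaker) Stake(v string) error {
	r.calls = append(r.calls, "stake "+v)
	return nil
}

func (r *recordingStaker) Unstake(v string) error {
	r.calls = append(r.calls, "unstake "+v)
	return nil
}

func main() {
	s := &recordingStaker{}
	events := []ValidatorEvent{
		{Type: Added, Name: "validator-a"},
		{Type: Deleted, Name: "validator-a"},
	}
	for _, ev := range events {
		if err := HandleEvent(s, ev); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println(s.calls)
}
```

Keeping the transaction submission behind an interface means the event handling can be exercised with a fake, without a running cluster or chain.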

Non-goals / Non-deliverables

  • ...

General issue deliverables

  • Update the appropriate CHANGELOG(s)
  • Update any relevant local/global README(s)
  • Update relevant source code tree explanations
  • Add or update any relevant or supporting mermaid diagrams

Testing Methodology

  • Scale up
    • Add a validator
    • Trigger next consensus round
    • New validator should have joined consensus
  • Scale down
    • Remove a validator
    • Trigger next consensus round
    • Removed validator should have left consensus
  • All tests: make test_all
  • LocalNet: verify a LocalNet is still functioning correctly by following the instructions at docs/development/README.md
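
The expected outcome of the scale-up and scale-down scenarios above can be stated as a comparison between two sets: the validators that are running and the validators that are staked. The sketch below is a hypothetical helper (`Diff` and the validator names are assumptions, not part of the Pocket codebase) that a test could use to check which nodes should join or leave; it is not a description of how the cluster-manager itself is implemented.

```go
package main

import (
	"fmt"
	"sort"
)

// Diff compares the validators that are running with those that are staked.
// It returns the validators that still need staking and those that still
// need unstaking, each sorted for deterministic output.
func Diff(running, staked []string) (toStake, toUnstake []string) {
	runningSet := make(map[string]bool, len(running))
	for _, v := range running {
		runningSet[v] = true
	}
	stakedSet := make(map[string]bool, len(staked))
	for _, v := range staked {
		stakedSet[v] = true
	}
	for v := range runningSet {
		if !stakedSet[v] {
			toStake = append(toStake, v)
		}
	}
	for v := range stakedSet {
		if !runningSet[v] {
			toUnstake = append(toUnstake, v)
		}
	}
	sort.Strings(toStake)
	sort.Strings(toUnstake)
	return toStake, toUnstake
}

func main() {
	// Scale up: a third validator is running but not yet staked.
	toStake, toUnstake := Diff(
		[]string{"validator-a", "validator-b", "validator-c"},
		[]string{"validator-a", "validator-b"},
	)
	fmt.Println("scale up:", toStake, toUnstake)

	// Scale down: validator-b was removed but is still staked.
	toStake, toUnstake = Diff(
		[]string{"validator-a"},
		[]string{"validator-a", "validator-b"},
	)
	fmt.Println("scale down:", toStake, toUnstake)
}
```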

Creator: @deblasis
Co-Owners: @Olshansk , @okdas

deblasis added the p2p and infra labels Feb 6, 2023
deblasis self-assigned this Feb 6, 2023
jessicadaugherty moved this to In Research in V1 Dashboard Feb 6, 2023
jessicadaugherty moved this from In Research to In Progress in V1 Dashboard Feb 6, 2023
@Olshansk
Member

Olshansk commented Feb 8, 2023

@deblasis Ticket looks 👌 to me. I didn't edit anything since you're the one implementing it anyhow.

I'll think of a good name. cluster-manager makes sense, and network-manager is too broad, so my mind is going toward pocket-manager, actor-manager or pocket-puppeteer 🎎, but that feels like it's pushing it.

deblasis moved this from In Progress to In Review in V1 Dashboard Feb 9, 2023
@deblasis deblasis linked a pull request Feb 17, 2023 that will close this issue
16 tasks
deblasis added a commit that referenced this issue Feb 17, 2023
… (#522)

## Description

This PR has been extracted from #491 and is, hopefully, more digestible
from a code-review and scope point of view.

## Issue

Fixes #490 

## Type of change

Please mark the relevant option(s):

- [x] New feature, functionality or library
- [ ] Bug fix
- [ ] Code health or cleanup
- [ ] Major breaking change
- [ ] Documentation
- [ ] Other <!-- add details here if it a different type of change -->

## List of changes

- When nodes are added/removed from the Kubernetes Localnet, we
stake/unstake them automatically
- We achieve the above by dogfooding our own CLI inside Kubernetes

## Testing

- [x] `make develop_test`
- [x]
[LocalNet](https://github.com/pokt-network/pocket/blob/main/docs/development/README.md)
w/ all of the steps outlined in the `README`


## Required Checklist

- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have tested my changes using the available tooling
- [x] I have updated the corresponding CHANGELOG

### If Applicable Checklist

- [ ] I have updated the corresponding README(s); local and/or global
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added, or updated,
[mermaid.js](https://mermaid-js.github.io) diagrams in the corresponding
README(s)
- [ ] I have added, or updated, documentation and
[mermaid.js](https://mermaid-js.github.io) diagrams in `shared/docs/*`
if I updated `shared/*` README(s)

---------

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
Co-authored-by: Dmitry Knyazev <okdas@users.noreply.github.com>
Co-authored-by: Daniel Olshansky <olshansky@pokt.network>
Co-authored-by: Dmitry K <okdas@pm.me>
Co-authored-by: Daniel Olshansky <olshansky.daniel@gmail.com>
github-project-automation bot moved this from In Review to Done in V1 Dashboard Feb 17, 2023
bryanchriswhite added a commit that referenced this issue Feb 20, 2023
* pokt/main:
  [Infra] KISS 3 - Cluster Manager [Merge me after #521] - (Issues: #490) (#522)
  Refactor/fix state sync logs (#515)
  [P2P] KISS 2 - Peer discovery [Merge me after #520] - (Issues: #416, #429) (#521)
  [Core] KISS 1 - Finite State Machine [Merge me first] - (Issue: #499) (#520)
  [CLI] Stake command bugfix (#518)
  [CLI] Cannot run make localnet_client_debug: Cannot initialise the keybase with the validator keys: Unable to find YAML file (#517)
  Fix the link shown by `make go_doc`
  Fixed duplicate GITHUB_WIKI tag
  [Documentation] Update Devlog Formatting (#512)
  [Docs & Bugs] Minor fixes post keybase changes (#513)
  [Utility] Foundational bugs, tests, code cleanup and improvements (1 / 2) (#503)
  [Tooling] Integrate Keybase w/ CLI (Issue #484 ) (#501)
  update devlog2.md
  update devlog2.md
  Update devlog1.md