need RMB to be redundant #1458

Closed
Tracked by #1470
despiegk opened this issue Aug 25, 2023 · 9 comments
@despiegk
Contributor

  • minimal redundancy needs to be achieved at the RMB level (implement it in the simplest way)
  • RMB servers need to be in trusted locations
@despiegk despiegk transferred this issue from threefoldtech/test_feedback Aug 27, 2023
@xmonader
Contributor

@muhamadazmy
Member

What can be done for this at the operations level (requires no code changes):

  • Make sure redis is a redundant cluster, not a single instance
  • The RMB service is itself stateless; it relies on the backend redis for message buffers. Hence we can have multiple instances of the rmb relay running against a single redis cluster, which increases capacity and provides high availability.
  • For example, relay.grid.tf can be load balanced across multiple rmb processes that all use the same redis cluster
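
The operations-level setup above can be illustrated with a minimal sketch. It is not RMB code: `SharedBuffer` is an in-memory stand-in for the redis cluster, and `Relay`, `RoundRobinBalancer`, and the instance names `rmb-a`/`rmb-b` are hypothetical. It only shows why a stateless relay can be put behind a load balancer: a message accepted by one instance can be picked up through another.

```python
from collections import defaultdict, deque
from itertools import cycle


class SharedBuffer:
    """In-memory stand-in for the shared redis cluster (assumption)."""

    def __init__(self):
        self._queues = defaultdict(deque)  # one message queue per peer

    def push(self, peer, msg):
        self._queues[peer].append(msg)

    def pop(self, peer):
        queue = self._queues[peer]
        return queue.popleft() if queue else None


class Relay:
    """Hypothetical stateless relay: keeps no messages of its own."""

    def __init__(self, name, buffer):
        self.name = name
        self.buffer = buffer

    def send(self, peer, msg):
        self.buffer.push(peer, msg)

    def receive(self, peer):
        return self.buffer.pop(peer)


class RoundRobinBalancer:
    """Hands out relay instances in turn, like a simple load balancer."""

    def __init__(self, relays):
        self._relays = cycle(relays)

    def pick(self):
        return next(self._relays)


buffer = SharedBuffer()
balancer = RoundRobinBalancer([Relay("rmb-a", buffer), Relay("rmb-b", buffer)])

sender_relay = balancer.pick()      # first request lands on rmb-a
sender_relay.send("peer-42", "hello")

receiver_relay = balancer.pick()    # next request lands on rmb-b
delivered = receiver_relay.receive("peer-42")
```

Because all state lives in the shared buffer, losing one relay process does not lose queued messages; the buffer itself remains the single point that has to be made redundant.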

Redundancy that requires code changes (work in progress):

  • Allow peers to maintain multiple connections to different relays (say r0.grid.tf, r1.grid.tf), where each relay is a completely separate instance (with its own redis backend). If r0 is gone completely, all peers can still communicate over r1. Federation is still possible between r0 and r1, so messages intended for peers on r0 should still be auto-routed.
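
As a rough sketch of the peer-side behaviour described above, the snippet below tries each configured relay in order and falls back to the next one when a relay is unreachable. `FakeRelay`, `Peer`, and `RelayDown` are hypothetical names used for illustration only; the actual RMB peer may behave differently (for example, by keeping all connections open at once), and federation between relays is not modelled here.

```python
class RelayDown(Exception):
    """Raised when a relay cannot be reached (hypothetical error type)."""


class FakeRelay:
    """Stand-in for an independent relay with its own backend (assumption)."""

    def __init__(self, url, up=True):
        self.url = url
        self.up = up
        self.delivered = []  # messages this relay accepted

    def send(self, destination, payload):
        if not self.up:
            raise RelayDown(self.url)
        self.delivered.append((destination, payload))


class Peer:
    """Hypothetical peer configured with several independent relays."""

    def __init__(self, relays):
        self.relays = list(relays)

    def send(self, destination, payload):
        failed = []
        for relay in self.relays:
            try:
                relay.send(destination, payload)
                return relay.url  # report which relay carried the message
            except RelayDown:
                failed.append(relay.url)
        raise RelayDown("all relays unreachable: " + ", ".join(failed))


r0 = FakeRelay("r0.grid.tf", up=False)  # simulate r0 being gone completely
r1 = FakeRelay("r1.grid.tf")
peer = Peer([r0, r1])

used = peer.send("peer-7", b"ping")
```

With r0 marked as down, the message goes out over r1, which is the failure case the bullet above describes.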

@coesensbert
Contributor

Regarding Redis, we already have a 3-node Redis test cluster running for testing Lee's Cetus DNS server (which serves ava.tf). It works well and is documented here.
For this, would we need Redis to be exposed publicly over TLS so that any RMB relay can connect to it?

@xmonader
Contributor

xmonader commented Sep 4, 2023

@coesensbert can't they use wireguard instead of exposing things publicly?

@coesensbert
Contributor

Yes, for sure. That test setup is currently configured with a Wireguard mesh, so we could do the same for these RMB relays. But this adds quite some overhead when scaling, and it won't fit, for example, a validator running the whole backend stack (which will have its own redis/rmb client, as I read).
This was just to inform you that if we want to expand the RMB relays now, we could. But it seems best to make it fit into the broader picture of decentralizing the grid backend, so that we don't do extra work (setting up redis clusters + wireguard mesh) that we won't use later on.

So for ops, we could quickly set up a redis cluster with wireguard if needed. If this won't be used in the future, then I suggest we work on a solution that better fits the future plans to decentralize the grid backend (as we started here).

@xmonader
Contributor

xmonader commented Sep 4, 2023

You're 100% correct

@muhamadazmy muhamadazmy moved this from Accepted to In Progress in 3.12.x Sep 4, 2023
@muhamadazmy muhamadazmy self-assigned this Sep 4, 2023
@despiegk
Contributor Author

despiegk commented Sep 7, 2023

there can be no clusters behind, otherwise we break decentralization

deadline: mid sept

@ramezsaeed ramezsaeed mentioned this issue Sep 25, 2023
27 tasks
@muhamadazmy muhamadazmy moved this from In Progress to In Verification in 3.12.x Oct 3, 2023
@muhamadazmy
Member

The redis solution was not intended to provide redundancy across locations; it only applies when we run multiple relay servers (in the same location) against a single redis cluster.

RMB now supports redundancy across multiple locations by having multiple independent relays running anywhere (each with its own redis backend, which can be a single instance or a cluster).

https://github.com/threefoldtech/tf_operations/issues/1934

We already have a separate redis instance for devnet, which we can already use for testing.

@github-project-automation github-project-automation bot moved this from In Verification to Done in 3.12.x Nov 22, 2023
@xmonader xmonader moved this from Done to In Verification in 3.12.x Nov 22, 2023
@xmonader
Contributor

@ramezsaeed please link us to how this is being tracked/verified

@ramezsaeed ramezsaeed moved this from In Verification to Done in 3.12.x Nov 28, 2023