
Test resharing when rpc is congested #667

Merged · 1 commit into develop on Jul 9, 2024
Conversation

@ailisp (Member) commented Jul 3, 2024

Address near/transfer#33.

Make each node's connection to the RPC go through its own proxy with a different, configurable network problem. (Previously every node went through the same toxiproxy to the lake RPC.)

When latency is very high (10s) for a single node while the other nodes have no latency to the RPC, resharing fails. Will investigate this case further.
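For context, here is a minimal sketch of how each node's RPC proxy could be given its own latency toxic through toxiproxy's HTTP API; the actual test harness may do this differently, and the proxy names, toxiproxy address, and latency value below are illustrative assumptions.

```rust
// Sketch: add a per-proxy latency toxic via toxiproxy's HTTP API.
// Requires reqwest (with the "blocking" and "json" features) and serde_json.
use serde_json::json;

fn add_latency_toxic(proxy: &str, latency_ms: u64) -> reqwest::Result<()> {
    let client = reqwest::blocking::Client::new();
    client
        // Toxiproxy's control API listens on :8474 by default (assumed here).
        .post(format!("http://localhost:8474/proxies/{proxy}/toxics"))
        .json(&json!({
            "name": format!("{proxy}-latency"),
            "type": "latency",
            "stream": "downstream",
            "toxicity": 1.0,
            "attributes": { "latency": latency_ms, "jitter": 0 }
        }))
        .send()?
        .error_for_status()?;
    Ok(())
}

fn main() -> reqwest::Result<()> {
    // Make only node 0 slow (10s to RPC); the other nodes' proxies get no
    // toxic, reproducing the failing single-slow-node case described above.
    add_latency_toxic("node-0-rpc", 10_000)
}
```

Because each node now has its own proxy, the toxics can differ per node, which is what lets the test reproduce the single-slow-node failure.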

@volovyks (Collaborator) left a comment


LGTM
The final goal here is to make sure resharing always succeeds by waiting for all required conditions, namely T or more active nodes with up-to-date data (see the sketch after the list below).

So we need to simulate both cases:

  • A node is offline, only T-1 nodes remain when the resharing process starts, and all nodes wait for that node to come back online and finish the resharing process.
  • One or more nodes have trouble getting up-to-date data from the contract (the number of active participants is below T). Once T nodes can communicate with the contract without delays, the resharing process should finish successfully.
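For illustration only, a minimal sketch of that wait condition, assuming a hypothetical `Participant` type and a threshold `T`; the real node tracks participant state differently.

```rust
use std::{thread, time::Duration};

// Hypothetical stand-in for the node's view of a participant.
struct Participant {
    active: bool,     // node is online and responsive
    up_to_date: bool, // node has current data from the contract
}

const T: usize = 2; // threshold: minimum active, up-to-date participants

// Block until at least T participants are active with up-to-date data;
// only then should resharing proceed (or be allowed to finish).
fn wait_for_resharing_ready(participants: &[Participant]) {
    loop {
        let ready = participants
            .iter()
            .filter(|p| p.active && p.up_to_date)
            .count();
        if ready >= T {
            return;
        }
        thread::sleep(Duration::from_secs(1));
    }
}

fn main() {
    let participants = vec![
        Participant { active: true, up_to_date: true },
        Participant { active: false, up_to_date: false }, // the offline node
        Participant { active: true, up_to_date: true },
    ];
    wait_for_resharing_ready(&participants); // returns once T nodes are ready
}
```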

@ppca (Contributor) left a comment


Another thing I've observed while doing contract init (the nodes doing initial key generation): if some node joins the network a bit later, e.g. the node started a bit late, then all nodes get stuck in the generating status. Not sure if this is similar to the 10s issue mentioned in the description.

@ailisp (Member, Author) commented Jul 9, 2024

> Another thing I've observed while doing contract init (the nodes doing initial key generation): if some node joins the network a bit later, e.g. the node started a bit late, then all nodes get stuck in the generating status. Not sure if this is similar to the 10s issue mentioned in the description.

Good observation, will try to test and fix this case as well

@ailisp (Member, Author) commented Jul 9, 2024

@volovyks resharing is more important than contract initialization since it's not a one-time event. I'll make sure the two cases you mentioned work, or fix them.

@ailisp merged commit 065fb02 into develop on Jul 9, 2024
2 of 3 checks passed
@ailisp deleted the latency-resharing branch on July 9, 2024 at 02:27

github-actions bot commented Jul 9, 2024

Terraform Feature Environment Destroy (dev-667)

Terraform Initialization ⚙️ success

Terraform Destroy success

Destroy Plan:


No changes. No objects need to be destroyed.

Either you have not created any objects yet or the existing objects were
already deleted outside of Terraform.

Destroy complete! Resources: 0 destroyed.

Pusher: @ailisp, Action: pull_request, Working Directory: ``, Workflow: Terraform Feature Env (Destroy)
