Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adder-collator: add velocity measurement and make elastic scaling test more robust #4016

Merged
merged 4 commits into from
Apr 8, 2024

Conversation

sandreim
Copy link
Contributor

@sandreim sandreim commented Apr 7, 2024

Improves adder-collator to also compute the parachain velocity. The velocity is defined as number of parachain blocks progressing per relay chain block.

In this test we're asserting that the elastic parachain always progresses by 3 blocks per RCB, while the non-elastic parachain progresses normally - 1 block per RCB.

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
@sandreim sandreim added R0-silent Changes should not be mentioned in any release notes T10-tests This PR/Issue is related to tests. labels Apr 7, 2024
@sandreim sandreim marked this pull request as draft April 7, 2024 19:11
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
@sandreim sandreim marked this pull request as ready for review April 8, 2024 10:33
@sandreim
Copy link
Contributor Author

sandreim commented Apr 8, 2024

Passed on my machine, hope it passes in CI

@sandreim sandreim changed the title Zombienet: make elastic scaling test more robust adder-collator: add velocity measurement and make elastic scaling test more robust Apr 8, 2024
Copy link
Member

@ordian ordian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test passed on the first try, so looks good.

@ordian ordian enabled auto-merge April 8, 2024 12:05
@ordian ordian added this pull request to the merge queue Apr 8, 2024
Merged via the queue into master with commit 039d183 Apr 8, 2024
129 of 134 checks passed
@ordian ordian deleted the sandreim/fix_flaky_test branch April 8, 2024 13:11
@ordian
Copy link
Member

ordian commented Apr 8, 2024

@bkchr
Copy link
Member

bkchr commented Apr 8, 2024

The test passed on the first try, so looks good.

Flaky tests have the property of doing so :P

@ordian
Copy link
Member

ordian commented Apr 8, 2024

I'd argue that in this test what we actually want to measure depends on the hardware not being flaky/potato, which it is not in CI :P

@sandreim
Copy link
Contributor Author

sandreim commented Apr 8, 2024

damn, but I wonder how much we overcommit on these CI instances, maybe we should run this on with benchmark tag.

@sandreim
Copy link
Contributor Author

sandreim commented Apr 8, 2024

@pepoviola do we have any way of reducing overcommit on the cluster instances for some tests ?

@sandreim
Copy link
Contributor Author

sandreim commented Apr 8, 2024

Looking at the test resources we have for relay chain:

  limits = { memory = "4G", cpu = "2" }
  requests = { memory = "2G", cpu = "1" }

Does this work for collators ? And do we actually reserve these resources vs other work ?

@pepoviola
Copy link
Contributor

Looking at the test resources we have for relay chain:

  limits = { memory = "4G", cpu = "2" }
  requests = { memory = "2G", cpu = "1" }

Does this work for collators ? And do we actually reserve these resources vs other work ?

Yes, this will work for collators too. As reference we use vms with 8 vcpus/16G of ram. So, you can increase the request to place 2 collators per vm (or even one but let me know to notice in the budget). Ping me if I can help you with something else.

Thanks!!

@bkchr
Copy link
Member

bkchr commented Apr 8, 2024

maybe we should run this on with benchmark tag

I had the same idea. Ser @sandreim, can we not just run zombienet with the local provider on the benchmarking machine? :D

Ank4n pushed a commit that referenced this pull request Apr 9, 2024
…t more robust (#4016)

Improves `adder-collator` to also compute the parachain velocity. The
velocity is defined as number of parachain blocks progressing per relay
chain block.

In this test we're asserting that the elastic parachain always
progresses by 3 blocks per RCB, while the non-elastic parachain
progresses normally - 1 block per RCB.

---------

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
Co-authored-by: ordian <write@reusable.software>
@pepoviola
Copy link
Contributor

maybe we should run this on with benchmark tag

I had the same idea. Ser @sandreim, can we not just run zombienet with the local provider on the benchmarking machine? :D

We can add the run-native version but we need to wire all the monitoring stack (that is already there with kubernetes). Other option is to add a label in the config for the nodes that needs to run in an isolated vm to get better performance and that way we can have the granularity to set one vm per node if needed.

Thx!

@sandreim
Copy link
Contributor Author

sandreim commented Apr 9, 2024

maybe we should run this on with benchmark tag

I had the same idea. Ser @sandreim, can we not just run zombienet with the local provider on the benchmarking machine? :D

We could do that, but as @pepoviola is saying we would have trouble getting the logs when a test fails. I have hopes in a new label for pods or just better k8s scheduling of pods such that the claimed resources are always guaranteed by it. Javier is looking into this right now.

dharjeezy pushed a commit to dharjeezy/polkadot-sdk that referenced this pull request Apr 9, 2024
…t more robust (paritytech#4016)

Improves `adder-collator` to also compute the parachain velocity. The
velocity is defined as number of parachain blocks progressing per relay
chain block.

In this test we're asserting that the elastic parachain always
progresses by 3 blocks per RCB, while the non-elastic parachain
progresses normally - 1 block per RCB.

---------

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
Co-authored-by: ordian <write@reusable.software>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
R0-silent Changes should not be mentioned in any release notes T10-tests This PR/Issue is related to tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants