Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rptest: actual e2e throttling test #21379

Merged
merged 2 commits into from
Jul 17, 2024

Conversation

nvartolomei
Copy link
Contributor

@nvartolomei nvartolomei commented Jul 13, 2024

rptest: actual e2e throttling test

Previous version of the test was setting an relatively low throughput
limit, then tried to consume data at full speed and monitor a metric to
see if tiered storage throttler applied any throttling.

This fails often because we actually throttle in the tiered storage
layer and we also apply disk throttling using seastar scheduling groups.

We can't measure if the later was applied and sometimes it provides
enough throttling that we never have to apply the tiered storage
throttling on the download path.


The new test is instead an e2e test. We consume a few times and measure
the average throughput. Then we throttle at half of that and measure a
few times again. The test succeeds if the second measurement was at half
of the first run.

Revert "cloud_storage: Increse disk tput limit a bit"

This reverts commit 0ae2ad2.

The original commit tried to relax the seastar i/o throttling in the
hope that the higher level throttler in tiered storage would be hit
instead which we can monitor (it does expose a metric). In practice this
wasn't enough.

The previous commit does rewrite the test so this hack is not required
anymore.

Closes #14139

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

  • none

@vbotbuildovich
Copy link
Collaborator

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51489#0190b09b-f50e-471c-9ed3-b0f6fbbf12e8:
pandatriage cache was not found

return {
.disk_node_throughput_limit = tp + (tp / tput_overshoot_frac),
.disk_node_throughput_limit = tp,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert "cloud_storage: Increse disk tput limit a bit"
This reverts commit 0ae2ad2.

Needs explanation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that was previous attempt to fix this test by increasing the disk limit a bit to make sure that the net limit is used to throttle downloads

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nvartolomei please make sure that you put context into the commit messages. not having to visit github to see that context is a powerful thing.

Copy link
Contributor Author

@nvartolomei nvartolomei Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦

I did fix the commits as you asked:

But then I did a force push from a different directory to rebase on dev 🤦🤦🤦


I have copy & pasted the commit messages in the PR cover letter. Not sure I can do more than that now...

Lazin
Lazin previously approved these changes Jul 16, 2024
Copy link
Contributor

@Lazin Lazin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, few nits

@nvartolomei nvartolomei merged commit 662e23c into redpanda-data:dev Jul 17, 2024
20 checks passed
@nvartolomei nvartolomei deleted the nv/test_throttling branch July 17, 2024 08:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants