-
Notifications
You must be signed in to change notification settings - Fork 600
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cloud_storage: Fix download throttling ducktape test #17455
Conversation
c55ef87
to
7c939d8
Compare
src/v/utils/token_bucket.h
Outdated
@@ -62,7 +73,21 @@ class token_bucket { | |||
_refresh_timer.arm(refresh_interval - elapsed); | |||
} | |||
|
|||
return _sem.wait(as, size); | |||
auto deficiency = size - _sem.available_units(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you mean available_units
here vs one of the other semaphore interfaces for querying units? i ask because this one can become negative depending on how the semaphore is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this context it should go negative but still, I think that the deficiency value could be incorrect in case of concurrency, I'll update this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please submit other changes in a different pr. we need the fix for the ci failure, and there are other failures in the ci run for this pr that are probably unrelated to the other changes in the pr--they look fine to me--but we can do that separately.
separate fix #17474 |
Implement 'maybe_throttle' method. The method works the same way as 'throttle'. It returns an optional value which is set to nullopt if there was no throttling. If the throttling was applied it returns the deficiency value (number of units we had to wait for) and the sleep duration.
Previously, the throttling metric was computed before calling 'throttle'. If number of available units was lower than the number of bytes to read we caclulated deficiency value as a difference between the two. In case of concurrency this could lead to incorrect value of the metric. If two fibers will do this check and find out the available untis are too low.
7c939d8
to
285c8e0
Compare
Updated the token_bucket implementation. Now it returns only the amount of time that the bucket was waiting for units. |
_parent._read_path_probe.download_throttled( | ||
buffer_size - available); | ||
auto throttled = co_await _parent._throughput_limit.maybe_throttle( | ||
buffer_size, _as); | ||
if (throttled) { | ||
// This path is taken only when the download is throttled. In case | ||
// of concurrency every `maybe_throttle` call will return the | ||
// precise value for deficiency and sleep time. | ||
auto duration | ||
= std::chrono::duration_cast<std::chrono::milliseconds>( | ||
throttled.value()); | ||
_parent._read_path_probe.download_throttled(duration.count()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the units for the parameter passed to download_throttled? It looks like this switches from bytes to milliseconds. That's probably worth explaining in the commit message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the description of the metric. It's not possible to reliably capture the metric in bytes (and such metric doesn't make much sense tbh).
Fix the throttling ducktape test. The test was failing because additional restart reset the metric.
Additionally, the PR improves the way the metric is collected and updates the token bucket implementation to improve observability.
Fixes #14139
Backports Required
Release Notes