Improve average performance of DefaultLongTaskTimer for out-of-order stopping #5591

Merged

fogninid
Contributor

The current implementation of DefaultLongTaskTimer optimizes for O(1) task starting, but performs poorly when stopping tasks that are not at the head of its internal queue (that is, not the oldest ones).
In the worst case, when stop is called immediately after start, the stop call requires O(N) operations, where N is the number of active tasks.
Depending on the distribution of task lifetimes, the average case is O(1) only for applications that stop tasks in exactly the order they were started; applications that complete tasks out of order, with no bias in task lifetimes, see O(N) on average.

Stopping a task should not be treated any differently from starting one: both actions are expected to run on application threads, and in a well-functioning application (one that is not leaking or piling up tasks) every call to start is matched by exactly one call to stop.
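
To illustrate the trade-off described above, here is a minimal sketch using only JDK collections. It assumes active tasks are held either in a `ConcurrentLinkedDeque` (constant-time append, linear-time removal of an arbitrary element) or in a `ConcurrentSkipListSet` ordered by start time (expected logarithmic insert and removal). `ActiveTaskSketch` and `Task` are hypothetical names for illustration, not Micrometer's internal classes.

```java
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicLong;

public class ActiveTaskSketch {

    /** Hypothetical stand-in for an active sample: ordered by start time, ties broken by a unique id. */
    static final class Task implements Comparable<Task> {
        private static final AtomicLong IDS = new AtomicLong();

        final long startNanos;
        final long id = IDS.incrementAndGet();

        Task(long startNanos) {
            this.startNanos = startNanos;
        }

        @Override
        public int compareTo(Task other) {
            int byStart = Long.compare(startNanos, other.startNanos);
            // The unique id keeps two tasks with the same start time distinct in a sorted set.
            return byStart != 0 ? byStart : Long.compare(id, other.id);
        }
    }

    public static void main(String[] args) {
        int n = 10_000;
        ConcurrentLinkedDeque<Task> deque = new ConcurrentLinkedDeque<>();
        ConcurrentSkipListSet<Task> sorted = new ConcurrentSkipListSet<>();

        Task newest = null;
        for (int i = 0; i < n; i++) {
            newest = new Task(i);
            deque.add(newest);  // O(1): append at the tail
            sorted.add(newest); // expected O(log N): insert in sorted position
        }

        // Stop the most recently started task, the worst case for the deque.
        deque.remove(newest);  // O(N): scans from the head until it finds the element
        sorted.remove(newest); // expected O(log N): located through the comparator

        // The oldest active task is still cheap to find in the sorted set.
        System.out.println("oldest start = " + sorted.first().startNanos);
        System.out.println("remaining    = " + sorted.size());
    }
}
```

With a sorted set, start becomes somewhat more expensive than a plain append, while stop no longer depends linearly on how many tasks are active.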

@pivotal-cla

@fogninid Please sign the Contributor License Agreement!

@pivotal-cla

@fogninid Thank you for signing the Contributor License Agreement!

@fogninid fogninid force-pushed the long_task_timer_performance branch 2 times, most recently from aba71f6 to 295d518 Compare October 14, 2024 14:55
@shakuzen shakuzen added enhancement A general enhancement performance Issues related to general performance module: micrometer-core An issue that is related to our core module labels Oct 15, 2024
@shakuzen shakuzen added this to the 1.15.0-M1 milestone Oct 16, 2024
Member

@shakuzen shakuzen left a comment

Thanks for the pull request. What you wrote makes sense. Still, I wanted to verify with some JMH benchmarks so we can put some numbers behind it and have them in place for checking any future changes that would affect performance around this. I made #5595 to add JMH benchmarks.

@shakuzen
Member

Sharing results from my MacBook Pro M1, using the benchmarks in the linked PR with 10,000 active samples and stopping a random sample. As expected, start is slower, but the average overall time to start and stop (with a random sample) is better.

Before

Benchmark                                   Mode  Cnt   Score   Error  Units
DefaultLongTaskTimerBenchmark.start           ss  200   0.495 ± 0.064  us/op
DefaultLongTaskTimerBenchmark.startAndStop    ss  200  15.351 ± 2.508  us/op
DefaultLongTaskTimerBenchmark.stopRandom      ss  200  14.784 ± 2.584  us/op

After

Benchmark                                   Mode  Cnt  Score   Error  Units
DefaultLongTaskTimerBenchmark.start           ss  200  1.002 ± 0.116  us/op
DefaultLongTaskTimerBenchmark.startAndStop    ss  200  6.338 ± 0.577  us/op
DefaultLongTaskTimerBenchmark.stopRandom      ss  200  5.154 ± 0.608  us/op
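
For a quick read of the two tables, the following small sketch computes the before/after ratio for each benchmark. The scores are copied from the tables above and the error margins are ignored; `BenchmarkRatios` and `speedup` are hypothetical names used only for this illustration.

```java
public class BenchmarkRatios {

    /** Ratio of the "before" score to the "after" score; above 1 means faster after the change. */
    static double speedup(double beforeMicros, double afterMicros) {
        return beforeMicros / afterMicros;
    }

    public static void main(String[] args) {
        // Scores in us/op, taken from the Before and After tables.
        System.out.printf("start:        %.2fx%n", speedup(0.495, 1.002));
        System.out.printf("startAndStop: %.2fx%n", speedup(15.351, 6.338));
        System.out.printf("stopRandom:   %.2fx%n", speedup(14.784, 5.154));
    }
}
```

On these numbers, start takes roughly twice as long after the change, while startAndStop is about 2.4 times faster and stopRandom about 2.9 times faster.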

@fogninid fogninid force-pushed the long_task_timer_performance branch from 295d518 to 44f147a Compare October 16, 2024 12:05
Member

@jonatan-ivanov jonatan-ivanov left a comment

Thank you for the PR, I really like this.
Not necessarily in this PR, but I'm also wondering if we should add JCStress tests on top of #5595.

@jonatan-ivanov jonatan-ivanov changed the title improve average performance of long task timer for out-of-order stopping Improve average performance of LongTaskTimer for out-of-order stopping Jan 30, 2025
@jonatan-ivanov jonatan-ivanov force-pushed the long_task_timer_performance branch from 9396e4d to 7333777 Compare January 30, 2025 23:55
@jonatan-ivanov
Member

jonatan-ivanov commented Jan 31, 2025

FYI: I polished the PR a bit, resolved the comments above, and rebased it onto main (otherwise the build failed because of the Prometheus integration tests).

@jonatan-ivanov jonatan-ivanov merged commit 86bb750 into micrometer-metrics:main Jan 31, 2025
7 checks passed
@jonatan-ivanov
Member

@fogninid Thank you for the contribution!

@shakuzen shakuzen changed the title Improve average performance of LongTaskTimer for out-of-order stopping Improve average performance of DefaultLongTaskTimer for out-of-order stopping Jan 31, 2025