Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pool ActiveQuerys in the query stack #629

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Veykril
Copy link
Member

@Veykril Veykril commented Dec 13, 2024

ActiveQuery is 192 bytes and has a couple of collections within it (2 of which are not extract) that we can potentially re-use the allocations for. So pooling those instead of creating, pushing and popping the query from the stack might be beneficial to perf.

Seems like we have small perf improvements.

Based on top of #626 (required for this)

Copy link

netlify bot commented Dec 13, 2024

Deploy Preview for salsa-rs canceled.

Name Link
🔨 Latest commit f3f6cc4
🔍 Latest deploy log https://app.netlify.com/sites/salsa-rs/deploys/67793ca987707e00089d5dfe

Copy link

codspeed-hq bot commented Dec 13, 2024

CodSpeed Performance Report

Merging #629 will not alter performance

Comparing Veykril:veykril/push-vznumyusmzww (f3f6cc4) with master (6c0dd82)

Summary

✅ 9 untouched benchmarks

@Veykril Veykril changed the title Do not pass ownership of the QueryStack in `Runtime::block_on_or_un… Pool ActiveQuerys in the query stack Dec 13, 2024
@Veykril Veykril marked this pull request as ready for review December 14, 2024 10:50
@MichaReiser
Copy link
Contributor

Do you have any perf numbers from ra that confirm the improvement?

@Veykril
Copy link
Member Author

Veykril commented Dec 23, 2024

Not yet hence why I am fine with waiting on these PRs until salsa can be used for r-a as is so its easier for me to test

@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from acbb79e to e9f723b Compare January 4, 2025 10:09
@Veykril
Copy link
Member Author

Veykril commented Jan 4, 2025

Huh that's odd, all I did was rebase on latest master and now we have a perf regression?

@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch 2 times, most recently from 74e58b4 to 8a90d94 Compare January 4, 2025 10:36
@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from 8a90d94 to b3da8e1 Compare January 4, 2025 11:03
@Veykril
Copy link
Member Author

Veykril commented Jan 4, 2025

okay locally I see (compared to #626:

  • +2% on Mutating Inputs/new/InternedInput
  • -10% on Mutating Inputs/amortized/InternedInput
  • +19% on Mutating Inputs/new/Input
  • -10% on Mutating Inputs/amortized/Input
  • +9% on many_tracked_structs

@Veykril
Copy link
Member Author

Veykril commented Jan 4, 2025

Looking at the profiling graph in codspeed this regressions to come from different allocation behavior (within unrelated code), so I don't think this is a direct regression?

@MichaReiser
Copy link
Contributor

Codspeed can be flaky at times but 22% suggest that there's something wrong. The crossbeam_queue call now takes significantly longer... I think we should look into why

@Veykril
Copy link
Member Author

Veykril commented Jan 4, 2025

It can be flaky but this report has been consistent across my 3 or so rebases earlier. My PR does change allocation patterns, so we might just be getting unlucky with the allocator here now. The SegQueue itself doesn't look special, it merely allocates in blocks of 31 items + pointer.

…wind`

This commit changes `Edge` such that it no longer takes direct ownership of the query stack and instead keeps a lifetime erased mutable slice, an exclusive borrow and as such the ownership model does not change.
The caller now does have to uphold the safety invariant that the query stack borrow is life for the entire computation which is trivially achievable as the caller will block until the computation as done.
@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from b3da8e1 to 05f5469 Compare January 4, 2025 13:37
@Veykril
Copy link
Member Author

Veykril commented Jan 4, 2025

Okay after fixing the benchmarks, this regression seems a lot more workable (and reasonable regarding my changes)

@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from 05f5469 to f4d973b Compare January 4, 2025 13:49
@Veykril
Copy link
Member Author

Veykril commented Jan 4, 2025

Okay my latest drain changes are the culprit for that one. I do want to say that this PR will generally regress perf over our benches as none of them do multiple queries in succession (which is where this PR shines), they tend to do one or two which means this PR strictly does more allocation work than before.

@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from f4d973b to f3f6cc4 Compare January 4, 2025 13:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants