perf(codegen): further reduce memory allocations in generate_line_offset_tables
#13056
CodSpeed Instrumentation Performance Report: Merging #13056 will not alter performance.
…fset_tables` (#13056)

#13054 added a nice optimization to `SourcemapBuilder`. During generation of line/offset tables, it reuses a single `Vec` for column indexes for each line, rather than creating a new `Vec` on each turn of the inner loop. This reduces the number of times that `Vec` may have to grow as column indexes get added to it.

Take this optimization a step further by re-using the same `Vec` across *all* lines. The `columns` `Vec` is not consumed on each line. Instead, its contents are copied into a boxed slice each time - except when reaching EOF, where we can consume `columns`, as its work is done.

This memory-copying was likely happening anyway, as the `Vec<u32>` -> `Box<[u32]>` conversion has to drop the spare capacity of the `Vec`, which will likely cause a reallocation.

Also, avoid using iterators to create the boxed slices. `Vec::clone` followed by `Vec::into_boxed_slice` is a bit more explicit, and so may help the compiler to see that it only needs to allocate exactly `columns.len()` slots for the `Box<[u32]>`.

Note: I also tried `columns.drain(..).collect()` instead of `columns.clone().into_boxed_slice()` + `columns.clear()`. But it looks like the `Drain` abstraction doesn't get completely removed by the compiler: https://godbolt.org/z/Trv47j4hP. So I *think* `into_boxed_slice` is probably preferable.
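To illustrate the pattern described above, here is a minimal sketch. It is not the actual `generate_line_offset_tables` code: the function `non_ascii_columns_per_line` and its simplified job (recording the byte column of each non-ASCII character per line) are hypothetical, chosen only to show one `columns` buffer shared across all lines, copied out with `clone().into_boxed_slice()` + `clear()`, and consumed on the final line.

```rust
/// Hypothetical helper (NOT the real oxc function): for each line of `source`,
/// collect the byte column of every non-ASCII character into a boxed slice.
fn non_ascii_columns_per_line(source: &str) -> Vec<Box<[u32]>> {
    let mut tables: Vec<Box<[u32]>> = Vec::new();
    // One buffer shared by all lines. `clear()` keeps its capacity,
    // so later lines rarely need to grow it.
    let mut columns: Vec<u32> = Vec::new();

    let mut lines = source.split('\n').peekable();
    while let Some(line) = lines.next() {
        for (col, ch) in line.char_indices() {
            if !ch.is_ascii() {
                columns.push(col as u32);
            }
        }

        if lines.peek().is_none() {
            // Last line (EOF): the buffer's work is done, so take ownership
            // of it and convert it directly instead of copying.
            tables.push(std::mem::take(&mut columns).into_boxed_slice());
        } else {
            // Copy the contents out. The clone is typically allocated with
            // capacity == len, so `into_boxed_slice` has nothing to shrink.
            tables.push(columns.clone().into_boxed_slice());
            columns.clear();
        }
    }
    tables
}
```

Calling `non_ascii_columns_per_line("aé\nb\nüñ")` yields three boxed slices, one per line, with the middle one empty because that line is pure ASCII.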
12f143d to 1385c71
