rustc_metadata: replace Entry table with one table for each of its fields (AoS -> SoA). #59953
Conversation
r? @zackmdavis (rust_highfive has picked a reviewer for you, use r? to override)
r? @michaelwoerister cc @Zoxc @nnethercote @bors try
rustc_metadata: replace Entry table with one table for each of its fields (AoS -> SoA). In #59789 (comment) I noticed that for many cross-crate queries (e.g. `predicates_of(def_id)`), we were deserializing the `rustc_metadata::schema::Entry` for `def_id` *only* to read one field (i.e. `predicates`). But there are several such queries, and `Entry` is not particularly small (in terms of number of fields; the encoding itself is quite compact), so there is a large (and unnecessary) constant factor. This PR replaces the (random-access) array¹ of `Entry` structures ("AoS") with many separate arrays¹, one for each field that used to be in `Entry` ("SoA"), resulting in the ability to read individual fields separately, with negligible time overhead (in theory) and some size overhead (as these arrays are not sparse). For stage1 `libcore`'s metadata blob, the size overhead is `8.44%`, and I have another commit (not initially included in this PR because I want to do perf runs with both) that brings it down to `5.88%`. ¹(in the source, these arrays are called "tables", but perhaps they could use a better name)
☀️ Try build successful - checks-travis
@rust-timer build 3919374
Success: Queued 3919374 with parent 0085672, comparison URL.
Finished benchmarking try commit 3919374
Performance looks good except for
Can you please be more specific with your performance comments? Instruction counts for everything look great, including
@Zoxc Oh, wow, that's a huge gap between instructions and cycles. Also, I've pushed the extra commit I mentioned in the PR description, let's see what that does! @bors try
rustc_metadata: replace Entry table with one table for each of its fields (AoS -> SoA). *Based on top of #59887* In #59789 (comment) I noticed that for many cross-crate queries (e.g. `predicates_of(def_id)`), we were deserializing the `rustc_metadata::schema::Entry` for `def_id` *only* to read one field (i.e. `predicates`). But there are several such queries, and `Entry` is not particularly small (in terms of number of fields; the encoding itself is quite compact), so there is a large (and unnecessary) constant factor. This PR replaces the (random-access) array¹ of `Entry` structures ("AoS") with many separate arrays¹, one for each field that used to be in `Entry` ("SoA"), resulting in the ability to read individual fields separately, with negligible time overhead (in theory) and some size overhead (as these arrays are not sparse). In a way, the new approach is closer to incremental on-disk caches, which store each query's cached results separately, but it would take significantly more work to unify the two. For stage1 `libcore`'s metadata blob, the size overhead is `8.44%`, and I have another commit (not initially included in this PR because I want to do perf runs with both) that brings it down to `5.88%`. ¹(in the source, these arrays are called "tables", but perhaps they could use a better name)
☀️ Try build successful - checks-travis
@rust-timer build 0403760
Success: Queued 0403760 with parent 60076bb, comparison URL.
Finished benchmarking try commit 0403760
Thanks a lot for the PR, @eddyb! Looks like a nice improvement.
Keep in mind most commits are refactors that don't change the encoded metadata (or do so without changing its size).
@Zoxc looks like the latest numbers are better?
It's possible,
Ugh, these numbers are poisoned by the huge delta in 0085672...60076bb. I'll have to open another PR for the version without the last commit, and start the try builds at the same time...
⌛ Trying commit d89dddc with merge b2a5ec95e0c4044dafb3cc99fcf71c2db186bb42...
☀️ Try build successful - checks-azure
@rust-timer build b2a5ec95e0c4044dafb3cc99fcf71c2db186bb42
Queued b2a5ec95e0c4044dafb3cc99fcf71c2db186bb42 with parent 437ca55, future comparison URL.
Finished benchmarking try commit b2a5ec95e0c4044dafb3cc99fcf71c2db186bb42, comparison URL.
(Note that the total is compressed with XZ, but also includes
📌 Commit d89dddc has been approved by
@bors rollup=never
rustc_metadata: replace Entry table with one table for each of its fields (AoS -> SoA). In #59789 (comment) I noticed that for many cross-crate queries (e.g. `predicates_of(def_id)`), we were deserializing the `rustc_metadata::schema::Entry` for `def_id` *only* to read one field (i.e. `predicates`). But there are several such queries, and `Entry` is not particularly small (in terms of number of fields; the encoding itself is quite compact), so there is a large (and unnecessary) constant factor. This PR replaces the (random-access) array¹ of `Entry` structures ("AoS") with many separate arrays¹, one for each field that used to be in `Entry` ("SoA"), resulting in the ability to read individual fields separately, with negligible time overhead (in theory) and some size overhead (as these arrays are not sparse). In a way, the new approach is closer to incremental on-disk caches, which store each query's cached results separately, but it would take significantly more work to unify the two. For stage1 `libcore`'s metadata blob, the size overhead is `8.44%`, and I have another commit (~~not initially included because I want to do perf runs with both~~ **EDIT**: added it now) that brings it down to `5.88%`. ¹(in the source, these arrays are called "tables", but perhaps they could use a better name)
☀️ Test successful - checks-azure |
…ables, r=michaelwoerister rustc_metadata: use a table for super_predicates, fn_sig, impl_trait_ref. This is an attempt at a part of rust-lang#65407, i.e. moving parts of cross-crate "metadata" into tables that match queries more closely. Three new tables should be enough to see some perf/metadata size changes. (need to do something similar to rust-lang#59953 (comment)) There are other bits of data that could be made into tables, but they can be more compact so the impact would likely be not as bad, and they're also more work to set up.
…imulacrum rustc_metadata: simplify the interactions between Lazy and Table. These are small post-rust-lang#59953 cleanups (including undoing some contrivances from that PR). r? @michaelwoerister