Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm: Store rlib metadata in wasm object files #120588

Merged
merged 1 commit into from
Feb 21, 2024

Conversation

alexcrichton
Copy link
Member

The goal of this commit is to remove warnings using LLVM tip-of-tree wasm-ld. In llvm/llvm-project#78658 the wasm-ld LLD driver no longer looks at archive indices and instead looks at all the objects in archives. Previously lib.rmeta files were simply raw rustc metadata bytes, not wasm objects, meaning that wasm-ld would emit a warning indicating so.

WebAssembly targets previously passed --fatal-warnings to wasm-ld by default which meant that if Rust were to update to LLVM 18 then all wasm targets would not work. This immediate blocker was resolved in #120278 which removed --fatal-warnings which enabled a theoretical update to LLVM 18 for wasm targets. This current state is ok-enough for now because rustc squashes all linker output by default if it doesn't fail. This means, for example, that rustc squashes all the linker warnings coming out of wasm-ld about lib.rmeta files with LLVM 18. This again isn't a pressing issue because the information is all hidden, but it runs the risk of being annoying if another linker error were to happen and then the output would have all these unrelated warnings that couldn't be fixed.

Thus, this PR comes into the picture. The goal of this PR is to resolve these warnings by using the WebAssembly object file format on wasm targets instead of using raw rustc metadata. When I first implemented the rlib-in-objects scheme in #84449 I remember either concluding that wasm-ld would either include the metadata in the output or I thought we didn't have to do anything there at all. I think I was wrong on both counts as wasm-ld does not include the metadata in the final output unless the object is referenced and we do actually need to do something to resolve these warnings.

This PR updates the object file format containing rustc metadata on WebAssembly targets to be an actual WebAssembly file. To avoid bringing in any new dependencies I've opted to hand-code this encoding at this time. If the object gets more complicated though it'd probably be best to pull in wasmparser and wasm-encoder. For now though there's two adjacent functions reading/writing wasm.

The only caveat I know of with this is that if wasm-ld does indeed look at the object file then the metadata will be included in the final output. I believe the only thing that could cause that at this time is --whole-archive which I don't think is passed for rlibs. I would clarify that I'm not 100% certain about this, however.

@rustbot
Copy link
Collaborator

rustbot commented Feb 2, 2024

r? @wesleywiser

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 2, 2024
@nikic
Copy link
Contributor

nikic commented Feb 2, 2024

Is it possible to integrate this in create_object_file using the object crate? I believe it supports wasm as well.

@bjorn3
Copy link
Member

bjorn3 commented Feb 2, 2024

This again isn't a pressing issue because the information is all hidden

#119286 will show it.

@bjorn3
Copy link
Member

bjorn3 commented Feb 2, 2024

Is it possible to integrate this in create_object_file using the object crate? I believe it supports wasm as well.

The object crate only supports parsing wasm, not emitting it. It also doesn't support relocatable wasm object files, only linked wasm modules. For this particular case it may still work as nothing is encoded in the linking section anyway.

@alexcrichton
Copy link
Member Author

Is it possible to integrate this in create_object_file using the object crate? I believe it supports wasm as well.

Oh good point, I did not know!

The object crate only supports parsing wasm, not emitting it.

Ah yes this appears to be the case.

It also looks like the support in object uses the wasmparser crate when enabled, and the wasm support in object isn't enabled by default. I've gone ahead and enabled the wasm feature which removes the need for special logic when reading the metadata section.

I'll also add some notes in the documentation about this for the writing half.

@rustbot
Copy link
Collaborator

rustbot commented Feb 9, 2024

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

@rust-log-analyzer

This comment has been minimized.

@alexcrichton
Copy link
Member Author

I've pushed an update to tidy checks to allow wasmparser, but if that's not something I should be doing here please let me know and I'd be ok restorign the older version which doesn't use the wasm feature of object which pulls in the wasmparser crate

@rustbot rustbot added A-testsuite Area: The testsuite used to check the correctness of rustc T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) labels Feb 9, 2024
@rustbot
Copy link
Collaborator

rustbot commented Feb 9, 2024

The list of allowed third-party dependencies may have been modified! You must ensure that any new dependencies have compatible licenses before merging.

cc @davidtwco, @wesleywiser

@@ -15,6 +15,7 @@ const LICENSES: &[&str] = &[
"Apache-2.0 / MIT",
"Apache-2.0 OR MIT",
"Apache-2.0 WITH LLVM-exception OR Apache-2.0 OR MIT", // wasi license
"Apache-2.0 WITH LLVM-exception", // wasmparser license
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding the license here, could you add an exception for this crate?

const EXCEPTIONS: ExceptionList = &[

Sorry to be nitpicky but I think this would help prevent us from depending on more Apache 2 only code without realizing it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no that definitely makes sense. I've moved it down there.

One point though is that I had to add an extra exception to the runtime checks loop. The object dependency is shared between rustc/libstd and the code here "isn't smart enough" to determine that libstd doesn't actually depend on wasmparser where rustc does, so I added some comments to that effect.

I'm also going to talk to some folks to try to get wasmparser relicensed with Apache-2.0 WITH LLVM-exception OR Apache-2.0 OR MIT which should resolve these concerns

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and the code here "isn't smart enough" to determine that libstd doesn't actually depend on wasmparser where rustc does

Yet another thing that would be fixed by moving library/ to a separate cargo workspace...

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 9, 2024
src/tools/tidy/src/deps.rs Outdated Show resolved Hide resolved
@wesleywiser
Copy link
Member

Thanks Alex!

@bors r+

@bors
Copy link
Contributor

bors commented Feb 9, 2024

📌 Commit 4f3348d has been approved by wesleywiser

It is now in the queue for this repository.

@bors bors removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Feb 9, 2024
alexcrichton added a commit to alexcrichton/wasm-tools that referenced this pull request Feb 10, 2024
Instead substitute this with `AtomicUsize` which panics on overflow to
catch any possible issues on 32-bit platforms. This is motivated by
rust-lang/rust#120588 which is pulling
`wasmparser` into the Rust compiler and `AtomicU64` is not available on
all the platforms the compiler itself is built for.

I plan on backporting this to the `0.118.x` release track as well which
is what's used by the `object` crate currently to avoid the need for a
whole bunch of cascading updates.
alexcrichton added a commit to bytecodealliance/wasm-tools that referenced this pull request Feb 12, 2024
Instead substitute this with `AtomicUsize` which panics on overflow to
catch any possible issues on 32-bit platforms. This is motivated by
rust-lang/rust#120588 which is pulling
`wasmparser` into the Rust compiler and `AtomicU64` is not available on
all the platforms the compiler itself is built for.

I plan on backporting this to the `0.118.x` release track as well which
is what's used by the `object` crate currently to avoid the need for a
whole bunch of cascading updates.
alexcrichton added a commit to alexcrichton/wasm-tools that referenced this pull request Feb 12, 2024
Instead substitute this with `AtomicUsize` which panics on overflow to
catch any possible issues on 32-bit platforms. This is motivated by
rust-lang/rust#120588 which is pulling
`wasmparser` into the Rust compiler and `AtomicU64` is not available on
all the platforms the compiler itself is built for.

I plan on backporting this to the `0.118.x` release track as well which
is what's used by the `object` crate currently to avoid the need for a
whole bunch of cascading updates.
alexcrichton added a commit to bytecodealliance/wasm-tools that referenced this pull request Feb 12, 2024
Instead substitute this with `AtomicUsize` which panics on overflow to
catch any possible issues on 32-bit platforms. This is motivated by
rust-lang/rust#120588 which is pulling
`wasmparser` into the Rust compiler and `AtomicU64` is not available on
all the platforms the compiler itself is built for.

I plan on backporting this to the `0.118.x` release track as well which
is what's used by the `object` crate currently to avoid the need for a
whole bunch of cascading updates.
@alexcrichton alexcrichton requested a review from bjorn3 February 12, 2024 14:54
@alexcrichton
Copy link
Member Author

For that failure bytecodealliance/wasm-tools#1412 was applied to the main branch of the wasmparser crate which I've backported to the 0.118.x release which is what the object crate is using, so this should be good for another try after updating rustc to using that version of the crate.

@alexcrichton
Copy link
Member Author

Could I perhaps ping this to request a re-queue with bors?

The goal of this commit is to remove warnings using LLVM tip-of-tree
`wasm-ld`. In llvm/llvm-project#78658 the `wasm-ld` LLD driver no longer
looks at archive indices and instead looks at all the objects in
archives. Previously `lib.rmeta` files were simply raw rustc metadata
bytes, not wasm objects, meaning that `wasm-ld` would emit a warning
indicating so.

WebAssembly targets previously passed `--fatal-warnings` to `wasm-ld` by
default which meant that if Rust were to update to LLVM 18 then all wasm
targets would not work. This immediate blocker was resolved in
rust-lang#120278 which removed `--fatal-warnings` which enabled a
theoretical update to LLVM 18 for wasm targets. This current state is
ok-enough for now because rustc squashes all linker output by default if
it doesn't fail. This means, for example, that rustc squashes all the
linker warnings coming out of `wasm-ld` about `lib.rmeta` files with
LLVM 18. This again isn't a pressing issue because the information is
all hidden, but it runs the risk of being annoying if another linker
error were to happen and then the output would have all these unrelated
warnings that couldn't be fixed.

Thus, this PR comes into the picture. The goal of this PR is to resolve
these warnings by using the WebAssembly object file format on wasm
targets instead of using raw rustc metadata. When I first implemented
the rlib-in-objects scheme in rust-lang#84449 I remember either concluding that
`wasm-ld` would either include the metadata in the output or I thought
we didn't have to do anything there at all. I think I was wrong on both
counts as `wasm-ld` does not include the metadata in the final output
unless the object is referenced and we do actually need to do something
to resolve these warnings.

This PR updates the object file format containing rustc metadata on
WebAssembly targets to be an actual WebAssembly file. This enables the
`wasm` feature of the `object` crate to be able to read the custom
section in the same manner as other platforms, but currently `object`
doesn't support writing wasm object files so a handwritten encoder is
used instead.

The only caveat I know of with this is that if `wasm-ld` does indeed
look at the object file then the metadata will be included in the final
output. I believe the only thing that could cause that at this time is
`--whole-archive` which I don't think is passed for rlibs. I would
clarify that I'm not 100% certain about this, however.
@bjorn3
Copy link
Member

bjorn3 commented Feb 20, 2024

@bors r=wesleywiser,bjorn3

@bors
Copy link
Contributor

bors commented Feb 20, 2024

📌 Commit 6181f3a has been approved by wesleywiser,bjorn3

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 20, 2024
@bors
Copy link
Contributor

bors commented Feb 21, 2024

⌛ Testing commit 6181f3a with merge 7168c13...

@bors
Copy link
Contributor

bors commented Feb 21, 2024

☀️ Test successful - checks-actions
Approved by: wesleywiser,bjorn3
Pushing 7168c13 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 21, 2024
@bors bors merged commit 7168c13 into rust-lang:master Feb 21, 2024
12 checks passed
@rustbot rustbot added this to the 1.78.0 milestone Feb 21, 2024
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (7168c13): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
3.1% [3.0%, 3.3%] 4
Regressions ❌
(secondary)
1.5% [0.2%, 2.9%] 33
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 3.1% [3.0%, 3.3%] 4

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
3.5% [2.3%, 4.9%] 3
Improvements ✅
(primary)
-2.4% [-2.4%, -2.4%] 1
Improvements ✅
(secondary)
-4.8% [-6.4%, -3.2%] 2
All ❌✅ (primary) -2.4% [-2.4%, -2.4%] 1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 640.751s -> 651.363s (1.66%)
Artifact size: 308.59 MiB -> 310.99 MiB (0.78%)

@rustbot rustbot added the perf-regression Performance regression. label Feb 21, 2024
@alexcrichton alexcrichton deleted the wasm-rmeta-object branch February 21, 2024 15:24
@alexcrichton
Copy link
Member Author

My best guess as to the source of the regression is that the wasm feature is now enabled in the object crate which adds an extra variant to much of its arms-of-things. The size of object::File doesn't change when toggling the wasm feature nor does std::mem::needs_drop, so even though this is my best guess I'm not certain why it would contribute towards a regression...

@rylev
Copy link
Member

rylev commented Feb 27, 2024

@alexcrichton I'm doing perf triage. Unfortunately, the regression here is significant enough to warrant some thought. The main concern are the regressions in the hello-world primary benchmark which has admittedly been a bit noisy lately. However, the regressions here do appear to be legitimate when compared to historical data.

The self profile comparison's on the results page don't really show anything conclusive either. 2 possible steps forward:

  • Open a PR reverting the changes and run a perf test on that to make sure that we see the predicated opposite perf improvement. If not, noise becomes much more likely of a explanation.
  • Run self-profiling locally and see if anything interesting pops out (you can find the directions at the top of the page here).

@alexcrichton
Copy link
Member Author

Sounds good! I'll give the local testing a spin soon and failing that open a PR for a revert and measure that.

@alexcrichton
Copy link
Member Author

Ok I've done local profiling and the results have yielded:

--------------------------------------------------------------------------------
-- File:function summary
--------------------------------------------------------------------------------
  Ir_______  file:function

< 1,249,595  ./elf/./elf/dl-lookup.c:
  1,026,935    _dl_lookup_symbol_x
    186,449    do_lookup_x
     36,211    check_match

<   126,032  ./elf/./elf/dl-addr.c:_dl_addr

<    66,269  ./elf/../sysdeps/x86_64/dl-machine.h:_dl_relocate_object

<    24,144  ./elf/./elf/do-rel.h:_dl_relocate_object

<    11,734  ./string/../sysdeps/x86_64/strcmp.S:strcmp

<     6,651  ./elf/../sysdeps/generic/dl-protected.h:do_lookup_x

<     5,912  ./elf/../sysdeps/generic/ldsodefs.h:
      3,695    do_lookup_x
      2,217    _dl_relocate_object

This gives rise to another hypothesis. I do not know the intricacies of the dynamic loader on Linux, but I can say for certain that on the previous commit before this PR librustc_driver had 14293 symbols (as reported by nm -g) and afterwards had 15206 symbols. This is because wasmparser, the crate pulled in here, has a lot of functions (e.g. a function-per-wasm-instruction and there's a lot of those). Almost none of those symbols are needed by rustc, but if the build process hasn't changed for librustc_driver since last I checked it pessimistically exported all crates in case any users of librustc_driver.so ended up needing them.

So my hypothesis is that this is because wasmparser has a bunch of symbols that rustc does not use. I don't know precisely if that would affect do_lookup_x but if I had to hazard a guess I would assume that _dl_lookup_symbol_x is probably doing a sort of linear search and may be needing to skip more symbols than before. That's just a guess though as I don't know how the dynamic loader works at a low-level.

I think it's probably safe to say that this isn't noise. As for what to do:

  • I'm happy to remove the wasmparser dependency here from this PR, it'll just involve adding a wad of parsing code and a bunch of special cases for wasm.
  • I don't know if there's a way to make these symbols private, if there is I'd be happy to make a PR to do that and see if the perf improves!

Or something else, happy to do what y'all think is best here! I'm also happy to take this discussion to Zulip as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-testsuite Area: The testsuite used to check the correctness of rustc merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants