Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve ASAN-related symbols during LTO. #114946

Merged
merged 1 commit into from
Sep 6, 2023

Conversation

anforowicz
Copy link
Contributor

Fixes #113404

@rustbot
Copy link
Collaborator

rustbot commented Aug 17, 2023

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @compiler-errors (or someone else) soon.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 17, 2023
@anforowicz
Copy link
Contributor Author

/cc @tmiasko

@rust-log-analyzer

This comment has been minimized.

@anforowicz anforowicz force-pushed the generic-fix-for-asan-lto branch 2 times, most recently from 72e4552 to 2c75640 Compare August 17, 2023 21:13
@compiler-errors
Copy link
Member

sorry, I don't really have enough context to review this.

r? compiler

@anforowicz
Copy link
Contributor Author

Maybe this PR can be reviewed by @tmiasko (who has previously reviewed the abandoned fix attempt/direction at #114642 and has been involved in the discussion in #113404)?

@anforowicz anforowicz force-pushed the generic-fix-for-asan-lto branch from 2c75640 to f287205 Compare August 21, 2023 16:24
@nikic
Copy link
Contributor

nikic commented Aug 21, 2023

Can you please also test -C lto=thin in this and all other relevant tests? I feel like it should break because you're only changing full LTO internalization, while thin LTO also does this in a different way.

@anforowicz
Copy link
Contributor Author

Can you please also test -C lto=thin in this and all other relevant tests?

Good point. Done, sort-of - I've added -C lto=thin flavour to the new address-sanitizer-globals-tracking.rs test and to the existing memory-track-origins.rs test.

I don't know how to add this coverage to sanitizer-recover.rs. When I try doing this, then FileCheck complains that it Could not open input file '/usr/local/google/home/lukasza/src/github/rust/build/x86_64-unknown-linux-gnu/test/codegen/sanitizer/sanitizer-recover.MSAN-RECOVER-THIN-LTO/sanitizer-recover.ll': No such file or directory. I don't see any rustc errors (would I need to pass extra ./x.py or rustc flags to see them?). I see that this test doesn't specify #![crate_type="lib"] nor #![crate_type="staticlib"] - not sure if this would matter (as a consequence this test also doesn't specify -C prefer-dynamic=false).

I also felt unable to tweak 4053e25#diff-4c2e2f96156d5aa76e678ae72c4006d62ec03e05a546330d8ea98275ac9b9f67 to also apply to thin LTO...

Hopefully the coverage above is good enough?

I feel like it should break because you're only changing full LTO internalization, while thin LTO also does this in a different way.

I think that thin LTO should be safe, because AFAIU the problem here was internalization (inlining/hiding global symbols into local symbols). And it seems to me that thin LTO can't really do such thing (because it needs to preserve global symbols in case they are used by other CGUs).

@rust-log-analyzer

This comment has been minimized.

@anforowicz
Copy link
Contributor Author

#114946 (comment): I don't know how to add this coverage to sanitizer-recover.rs. When I try doing this, then FileCheck complains

I've tried running ./x -v ... and it seems that the problem was: ignoring emit path because multiple .ll files were produced.

I don't understand why there are multiple .ll files:

$ ls .../github/rust/build/x86_64-unknown-linux-gnu/test/codegen/sanitizer/sanitizer-recover.MSAN-RECOVER-THIN-LTO
sanitizer-recover.2f39cus7tyzltjwu.rcgu.ll
sanitizer-recover.57laxi1n5rnsv40z.rcgu.ll
sanitizer-recover.addr2line-63498f7ec81b51f2.addr2line.74ba4853a214e382-cgu.0.rcgu.o.rcgu.ll
sanitizer-recover.adler-2e72a29cbb2fad9b.adler.a59ddefc5a607a39-cgu.0.rcgu.o.rcgu.ll
sanitizer-recover.alloc-42fb4edbc8124fd6.alloc.65c1c8b47eaec200-cgu.0.rcgu.o.rcgu.ll
sanitizer-recover.alloc-42fb4edbc8124fd6.alloc.65c1c8b47eaec200-cgu.1.rcgu.o.rcgu.ll
sanitizer-recover.alloc-42fb4edbc8124fd6.alloc.65c1c8b47eaec200-cgu.2.rcgu.o.rcgu.ll
sanitizer-recover.alloc-42fb4edbc8124fd6.alloc.65c1c8b47eaec200-cgu.3.rcgu.o.rcgu.ll
sanitizer-recover.alloc-42fb4edbc8124fd6.alloc.65c1c8b47eaec200-cgu.4.rcgu.o.rcgu.ll
sanitizer-recover.cfg_if-29d93783939fbc45.cfg_if.46e0b2051d9f73ec-cgu.0.rcgu.o.rcgu.ll
...

#114946 (comment): Could not open input file '.../test/codegen/sanitizer/address-sanitizer-globals-tracking.ASAN-THIN-LTO/address-sanitizer-globals-tracking.ll': No such file or directory

Not sure why this doesn't repro locally. Maybe I should try to debug with llvm-15, starting by reading https://rustc-dev-guide.rust-lang.org/tests/docker.html?highlight=docker#testing-with-docker

FWIW I feel a bit lost here. Any advice how to proceed here?

@anforowicz
Copy link
Contributor Author

FWIW it seems that the only already-existing ThinLTO test under tests/codegen/ is issue-81408-dllimport-thinlto-windows.rs (which is restricted to only-windows).

@anforowicz
Copy link
Contributor Author

Let me also try discussing this in t-compiler/help on Zulip: lto=thin + test/codegen = multiple-.ll-files problem

@anforowicz anforowicz force-pushed the generic-fix-for-asan-lto branch 2 times, most recently from 61880a4 to 3f7804e Compare August 24, 2023 21:24
@rust-log-analyzer

This comment has been minimized.

@anforowicz anforowicz force-pushed the generic-fix-for-asan-lto branch from 3f7804e to 9465277 Compare August 24, 2023 23:02
@rust-log-analyzer

This comment has been minimized.

@anforowicz
Copy link
Contributor Author

Status update: I think this issue cannot be reprod using tests/ui. Let me try to explain below.


I believe that to repro the issue we need staticlib output from rustc:

  • If I use #![crate_type = "rlib"] in issue-113404-lib1.rs and extract (using ar) the object file, then build/x86_64-unknown-linux-gnu/llvm/bin/llvm-nm -gU issue-113404-lib1.issue_113404_lib1.209bbc80fa7d2e2c-cgu.0.rcgu.o includes C ___asan_globals_registered
  • If I use #![crate_type = "staticlib"] then llvm-nm output does not include ___asan_globals_registered

But staticlib seems incompatible with tests/ui - the following tests/ui/lto/issue-113404.rs fails to build when issue-113404-lib1.rs is a staticlib:

```
// aux-build: issue-113404-lib1.rs
extern crate issue_113404_lib1 as lib1;
```

So, I guess the next step is to try to author the new test under tests/run-make.

@anforowicz anforowicz force-pushed the generic-fix-for-asan-lto branch from 9465277 to f582c2c Compare August 25, 2023 17:33
@anforowicz
Copy link
Contributor Author

RE: @nikic: #114946 (comment): ThinLTO test coverage

I am unable to provide such test coverage. In #113404 (comment) I am:

  • Asking for help with finding a minimal repro (that may potentially be used in tests/run-make)
  • Asking for discussion about whether ThinLTO test coverage is a hard requirement for landing this PR

@bjorn3
Copy link
Member

bjorn3 commented Aug 25, 2023

With this change won't we use a linker script that hides all those symbols? The linker script is generated from exported_symbols from which you remove all those symbols:

fn exported_symbols_for_non_proc_macro(tcx: TyCtxt<'_>, crate_type: CrateType) -> Vec<String> {
For staticlib this currently doesn't apply as rustc currently is unable to hide any exports for staticlibs, but this needs to be fixed to allow linking multiple rust staticlibs into the same dylib/executable.

@anforowicz
Copy link
Contributor Author

RE: @bjorn3: #114946 (comment):

With this change won't we use a linker script that hides all those symbols? The linker script is generated from exported_symbols from which you remove all those symbols:

fn exported_symbols_for_non_proc_macro(tcx: TyCtxt<'_>, crate_type: CrateType) -> Vec<String> {

For staticlib this currently doesn't apply as rustc currently is unable to hide any exports for staticlibs, but this needs to be fixed to allow linking multiple rust staticlibs into the same dylib/executable.

I agree that the PR here impacts what gets returned from fn exported_symbols (i.e. after this PR things like __msan_track_origins will no longer be included). Sadly, I don't understand what you mean by "linker script" above (this is probably just my incompetence and unfamiliarity with linkers - I am just a simple Chromium developer... :-( ). I am aware of version scripts (e.g. Chromium passes -Wl,--version-script=..../chrome.map here) - maybe linker script just another name for a version script?

At any rate, the paragraph I wrote above may not matter much. I think that ultimately you are pointing out that this PR may introduce a regression into rustc, and if so, then it seems desirable/important to prevent the regression (now and in the future) by adding a new test (if it is possible and not too difficult).

  • I assume that the existing tests would catch the regression? (FWIW ./x test passes on my local development machine).
  • Could you please help with authoring such tests/ui or tests/run-make test? If you don't have time to try authoring such tests yourself (which I would appreciate very much :-), then maybe you can outline the setup that you think would give a repro (e.g. what crates / linking / etc would the test need to use)? IIUC you are saying that the problem would happen with fat LTO + that dynamic linking might be required for a repro + that the repro might not require ASAN (e.g. maybe a simpler repro can be created with MSAN)?

My apologies if I mixed up anything (as I said, I am not a linker expert) - I would appreciate your help with this.

@anforowicz
Copy link
Contributor Author

And I think that you are saying that before this PR symbols like __msan_track_origins would be exported out of .dlls / .sos but they would no longer be exported after this PR. Did I get this right?

@bjorn3
Copy link
Member

bjorn3 commented Aug 26, 2023

Yes, when rustc does the linking with this PR the version script would hide those symbols.

@anforowicz
Copy link
Contributor Author

@rustbot author (switching to the approach in #114642 based on feedback)

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 28, 2023
@anforowicz
Copy link
Contributor Author

RE: @bjorn3: #114946 (comment):

Yes, when rustc does the linking with this PR the version script would hide those symbols.

Ack. Let me try to rephrase in my own words to double-check my understanding.

AFAIU the trouble is when fn linker_with_args in compiler/rustc_codegen_ssa/src/back/link.rs calls Linker::export_symbols:

// If we're building something like a dynamic library then some platforms
// need to make sure that all symbols are exported correctly from the
// dynamic library.
// Must be passed before any libraries to prevent the symbols to export from being thrown away,
// at least on some platforms (e.g. windows-gnu).
cmd.export_symbols(
tmpdir,
crate_type,
&codegen_results.crate_info.exported_symbols[&crate_type],
);

I think you are saying that (always? on some platforms only?) the PR will affect the linker command line generated by rustc (in one of impls of Linker::export_symbols) and so the PR will prevent __asan_globals_registered (and/or __msan_track_origins, __llvm_profile_filename, etc.) from being exported out of dynamically-linked libraries (e.g. .dlls / .sos).


But then __asan_globals_registered is common hidden global and LLVM Reference Manual seems to say that hidden objects are not meant to be exported out of dynamically-linked libraries:

Two declarations of an object with hidden visibility refer to the same object if they are in the same shared object. Usually, hidden visibility indicates that the symbol will not be placed into the dynamic symbol table, so no other module (executable or shared library) can reference it directly.

So maybe your concern doesn't necessarily apply to __asan_globals_registered, because it doesn't need to be exported? Can you please comment on that?


OTOH, I am not sure if the previous section (if correct at all) helps with other symbols. Maybe listing their LLVM-IR attributes will help us make some progress here:

  • __asan_globals_registered is common hidden global i64 (when running the new tests/codegen/sanitizer/address-sanitizer-globals-tracking.rs on Linux host)
  • __msan_track_origins is weak_odr constant i32 1 (when running tests/codegen/sanitizer/memory-track-origins.rs on a Linux host)
  • __msan_keep_going is weak_odr constant i32 1 (when running tests/codegen/sanitizer/sanitizer-recover.rs on Linux host)
  • __llvm_profile_filename = hidden local_unnamed_addr constant [19 x i8] c"default_%m.profraw\00", comdat (when running tests/codegen/pgo-instrumentation.rs on Linux host)
  • __llvm_profile_raw_version = comdat any

If the concern still applies to the other symbols, then maybe this PR can be tweaked so that it only touches __asan_globals_registered?

@bjorn3
Copy link
Member

bjorn3 commented Aug 29, 2023

I think you are saying that (always? on some platforms only?) the PR will affect the linker command line generated by rustc (in one of impls of Linker::export_symbols) and so the PR will prevent __asan_globals_registered (and/or __msan_track_origins, __llvm_profile_filename, etc.) from being exported out of dynamically-linked libraries (e.g. .dlls / .sos).

Correct.

But then __asan_globals_registered is common hidden global and LLVM Reference Manual seems to say that hidden objects are not meant to be exported out of dynamically-linked libraries:

I see. In that case adapting just the LTO code would work for now I think.

@anforowicz anforowicz force-pushed the generic-fix-for-asan-lto branch from f582c2c to e6dddbd Compare August 29, 2023 19:03
@anforowicz
Copy link
Contributor Author

anforowicz commented Aug 29, 2023

@bjorn3 / @nikic / @tmiasko - could you PTAL again?

The latest shape of this PR:

  • Doesn't touch compiler/rustc_codegen_ssa/src/back/symbol_export.rs and therefore doesn't affect MSAN-related or PGO-related symbols.
  • Only special-cases ___asan_globals_registered in PreserveFunctions in compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp.

Additional notes:

  • Why the fix touches PassWrapper.cpp instead of symbol_export.rs
  • Why the PR only adds a regression test in tests/codegen/sanitizer/address-sanitizer-globals-tracking.rs
    • This level of testing has been okay for other similar fixes (e.g. tests/codegen/sanitizer/sanitizer-recover.rs and tests/codegen/sanitizer/memory-track-origins.rs)
    • We concluded that ThinLTO testing is not possible via tests/codegen tests (see a Zulip discussion)
    • The PR author was unable to create an end-to-end repro of the problem using tests/run-make or tests/ui. This may be possible, but currently we don't have such test.
    • The PR only affects ___asan_globals_registered (which is already broken as tracked in __asan_globals_registered is not comdat when building a staticlib with LTO #113404)

@rustbot review

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 29, 2023
@anforowicz
Copy link
Contributor Author

r? @tmiasko (since you've reviewed #114642 in the past; feel free to reassign it though)

@rustbot rustbot assigned tmiasko and unassigned wesleywiser Aug 31, 2023
@tmiasko
Copy link
Contributor

tmiasko commented Sep 6, 2023

@bors r+ rollup=never

@bors
Copy link
Contributor

bors commented Sep 6, 2023

📌 Commit e6dddbd has been approved by tmiasko

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 6, 2023
@bors
Copy link
Contributor

bors commented Sep 6, 2023

⌛ Testing commit e6dddbd with merge e3abbd4...

@bors
Copy link
Contributor

bors commented Sep 6, 2023

☀️ Test successful - checks-actions
Approved by: tmiasko
Pushing e3abbd4 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 6, 2023
@bors bors merged commit e3abbd4 into rust-lang:master Sep 6, 2023
@rustbot rustbot added this to the 1.74.0 milestone Sep 6, 2023
@anforowicz anforowicz deleted the generic-fix-for-asan-lto branch September 6, 2023 23:20
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (e3abbd4): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.6% [0.4%, 2.3%] 3
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.2% [-2.2%, -2.2%] 1
All ❌✅ (primary) 1.6% [0.4%, 2.3%] 3

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-1.0% [-1.0%, -1.0%] 1
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 628.332s -> 630.181s (0.29%)
Artifact size: 317.91 MiB -> 317.88 MiB (-0.01%)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

__asan_globals_registered is not comdat when building a staticlib with LTO
10 participants