Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nondeterminism encountered when writing metadata for ProvenanceMap #124306

Closed
tmandry opened this issue Apr 23, 2024 · 10 comments
Closed

Nondeterminism encountered when writing metadata for ProvenanceMap #124306

tmandry opened this issue Apr 23, 2024 · 10 comments
Labels
A-metadata Area: Crate metadata A-reproducibility Area: Reproducible / deterministic builds T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@tmandry
Copy link
Member

tmandry commented Apr 23, 2024

Fuchsia has a test that checks whether rustc output is deterministic, and we find that it fairly reliably shifts bytes around in the metadata output for the ProvenanceMap for one particular crate in our build.

I determined this by picking out some changes of d5bd169 to add RUSTC_FILE_ENCODER_PANIC_AT_OFFSET and examining the backtrace at a number of offsets that differ in one failure in the hexdump below.

  • All of them had the same backtrace.
  • Only the metadata differs, no object files differ.
  • The input files are the same between invocations in a single failure.
  • We haven't ruled out other sources of nondeterminism, like proc macros.
  • Across multiple failures, the difference has been showing up around the same content, but the diff is sometimes slightly different (another example). This could also be because of changes in the source between failures.

cc @RalfJung @oli-obk, are there any obvious places to look for sources of nondeterminism?

  16:     0x7f0271b0b45a - <rustc_serialize[a8b506de04f7fd31]::opaque::FileEncoder>::write_with::<10usize, <rustc_serialize[a8b506de04f7fd31]::opaque::FileEncoder as rustc_serialize[a8b506de04f7fd31]::serialize::Encoder>::emit_usize::{closure#0}>
                               at /code/rust/compiler/rustc_serialize/src/opaque.rs:182:13
  17:     0x7f0271b0b45a - <rustc_serialize[a8b506de04f7fd31]::opaque::FileEncoder as rustc_serialize[a8b506de04f7fd31]::serialize::Encoder>::emit_usize
                               at /code/rust/compiler/rustc_serialize/src/opaque.rs:261:5
  18:     0x7f0271b0b45a - <rustc_metadata[8455509b42e0011]::rmeta::encoder::EncodeContext as rustc_serialize[a8b506de04f7fd31]::serialize::Encoder>::emit_usize
                               at /code/rust/compiler/rustc_metadata/src/rmeta/encoder.rs:79:25
  19:     0x7f0271b0b45a - <[(rustc_abi[b2d2de2f8c08d260]::Size, rustc_middle[446d53f26f0bfa95]::mir::interpret::pointer::CtfeProvenance)] as rustc_serialize[a8b506de04f7fd31]::serialize::Encodable<rustc_metadata[8455509b42e0011]::rmeta::encoder::EncodeContext>>::encode
                               at /code/rust/compiler/rustc_serialize/src/serialize.rs:289:9
  20:     0x7f0271b30463 - <alloc[724ed728a9dac397]::vec::Vec<(rustc_abi[b2d2de2f8c08d260]::Size, rustc_middle[446d53f26f0bfa95]::mir::interpret::pointer::CtfeProvenance)> as rustc_serialize[a8b506de04f7fd31]::serialize::Encodable<rustc_metadata[8455509b42e0011]::rmeta::encoder::EncodeContext>>::encode
                               at /code/rust/compiler/rustc_serialize/src/serialize.rs:298:9
  21:     0x7f0271b30463 - <rustc_data_structures[c994a8422b3c017]::sorted_map::SortedMap<rustc_abi[b2d2de2f8c08d260]::Size, rustc_middle[446d53f26f0bfa95]::mir::interpret::pointer::CtfeProvenance> as rustc_serialize[a8b506de04f7fd31]::serialize::Encodable<rustc_metadata[8455509b42e0011]::rmeta::encoder::EncodeContext>>::encode
                               at /code/rust/compiler/rustc_data_structures/src/sorted_map.rs:19:55
  22:     0x7f0271b30463 - <rustc_middle[446d53f26f0bfa95]::mir::interpret::allocation::provenance_map::ProvenanceMap as rustc_serialize[a8b506de04f7fd31]::serialize::Encodable<rustc_metadata[8455509b42e0011]::rmeta::encoder::EncodeContext>>::encode
                               at /code/rust/compiler/rustc_middle/src/mir/interpret/allocation/provenance_map.rs:38:14
  23:     0x7f0271b30463 - <rustc_middle[446d53f26f0bfa95]::mir::interpret::allocation::Allocation as rustc_serialize[a8b506de04f7fd31]::serialize::Encodable<rustc_metadata[8455509b42e0011]::rmeta::encoder::EncodeContext>>::encode
                               at /code/rust/compiler/rustc_middle/src/mir/interpret/allocation.rs:75:32
  24:     0x7f02719cc0e3 - <rustc_metadata[8455509b42e0011]::rmeta::encoder::EncodeContext>::encode_crate_root::{closure#17}
                               at /code/rust/compiler/rustc_metadata/src/rmeta/encoder.rs:657:21
  25:     0x7f02719cc0e3 - <rustc_metadata[8455509b42e0011]::rmeta::encoder::EncodeContext>::encode_crate_root
                               at /code/rust/compiler/rustc_metadata/src/rmeta/encoder.rs:641:37
  26:     0x7f02719d7309 - rustc_metadata[8455509b42e0011]::rmeta::encoder::encode_metadata
                               at /code/rust/compiler/rustc_metadata/src/rmeta/encoder.rs:2266:16
  27:     0x7f0271b34f68 - rustc_metadata[8455509b42e0011]::fs::encode_and_write_metadata
                               at /code/rust/compiler/rustc_metadata/src/fs.rs:65:13
  28:     0x7f026d2f2687 - rustc_interface[adc9e1219fb02f17]::passes::start_codegen
                               at /code/rust/compiler/rustc_interface/src/passes.rs:929:44
  29:     0x7f026d333f5c - <rustc_interface[adc9e1219fb02f17]::queries::Queries>::codegen_and_build_linker::{closure#0}
                               at /code/rust/compiler/rustc_interface/src/queries.rs:229:35

Lines are from compiler commit 7f2fc33. Source code is available here.
Full backtrace

--- obj/src/starnix/kernel/libstarnix_core.rlib.llvm-objdump.local	2024-04-22 16:14:34.523061172 +0000
+++ obj/src/starnix/kernel/libstarnix_core.rlib.llvm-objdump.remote	2024-04-22 16:14:51.439162227 +0000
@@ -355421,55 +355421,55 @@
  56c560 52d526d5 1ddc7edc 2ba5a001 e0a001a3  R.&...~.+.......
  56c570 a001e315 cb268525 a646930f d336a775  .....&.%.F...6.u
  56c580 c4558a13 b908ee24 c562c72a c4601502  .U.....$.b.*.`..
- 56c590 00000002 c4070100 01b7b904 01000012  ................
- 56c5a0 4d656d6f 72794d61 6e616765 723a3a6d  MemoryManager::m
- 56c5b0 6170c101 00020002 0002c001 0101000f  ap..............
- 56c5c0 010002ab c802000a 45727228 6572726e  ........Err(errn
- 56c5d0 6f29c100 00018daf 04000002 c4070200  o)..............
- 56c5e0 01b7b904 0200019f 8bdb0200 00020001  ................
+ 56c590 00000002 c4070100 01b7b904 0100000a  ................
+ 56c5a0 45727228 6572726e 6f29c101 00020002  Err(errno)......
+ 56c5b0 0002c001 0101000f 010002ab c802018d  ................
+ 56c5c0 af040000 02c40702 0000124d 656d6f72  ...........Memor
+ 56c5d0 794d616e 61676572 3a3a6d61 70c10000  yManager::map...
+ 56c5e0 01b7b904 0200019f 8bdb0202 00020001  ................
  56c5f0 0002c001 020002c4 07000001 8daf0401  ................
- 56c600 01000b00 8531018d af040200 01c78bdb  .....1..........
- 56c610 02010001 c78bdb02 020001b7 b9040001  ................
- 56c620 000100e1 31019f8b db020200 02c00100  ....1...........
- 56c630 01000402 aac80215 00277a78 3a3a7379  .........'zx::sy
- 56c640 733a3a7a 785f7468 72656164 5f737461  s::zx_thread_sta
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Apr 23, 2024
@saethlin
Copy link
Member

saethlin commented Apr 23, 2024

Is it possible to dig out which AllocIds are in the unstable ProvenanceMap? That could help point to the const that's producing them which might help identify if this is a build script or proc macro.

@saethlin saethlin added A-metadata Area: Crate metadata T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-reproducibility Area: Reproducible / deterministic builds labels Apr 23, 2024
@RalfJung
Copy link
Member

RalfJung commented Apr 23, 2024 via email

@tmandry
Copy link
Member Author

tmandry commented Apr 23, 2024

I'm growing suspicious that this isn't the ProvenanceMap at all, because the nondeterministic bytes contain strings, and there doesn't seem to be anything in the encoding that would imply that.

Moreover, I grepped the source for any strings I saw, and there was a common thread. Every one of them was in a doc link inside of a doc comment, like: [`std::ops::Deref`]. MemoryManager::map is not even a real function; it only appears inside a doc link.

The strings I searched for:

  • MemoryManager::map
  • zx::sys::zx_thread_state_general_regs_t
  • std::ops::Deref
  • Err(errno)

@saethlin
Copy link
Member

I'm growing suspicious that this isn't the ProvenanceMap at all,

That's probably the imprecision that I was worried about 😭

@oli-obk
Copy link
Contributor

oli-obk commented Apr 24, 2024

(I have no idea how allocid are stored in metadata so not sure if this is evem relevant.)

they aren't stored. We store their GlobalAlloc and regenerate new ids on loading from metadata.

I'm growing suspicious that this isn't the ProvenanceMap at all, because the nondeterministic bytes contain strings, and there doesn't seem to be anything in the encoding that would imply that.

Since we encode the actual allocation instead of an id, you will see strings in there if the const allocations constain strings (e.g. due to being part of string literals or something from formatting/panicking infrastructure).

@oli-obk
Copy link
Contributor

oli-obk commented Apr 24, 2024

I don't see how ProvenanceMap can be at fault though, as it's just a wrapper around SortedMap<u64, ...> effectively (https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/interpret/allocation/provenance_map/struct.ProvenanceMap.html).

But, if the layout of whatever Rust type (from your code, not from the compiler) is stored here is changing, then the allocation would change, too.

@RalfJung
Copy link
Member

RalfJung commented Apr 24, 2024 via email

@oli-obk
Copy link
Contributor

oli-obk commented Apr 24, 2024

edit (as the scheme changed over the years)

We create a table of allocations during encoding and make all alloc ids be indices into that table.

@RalfJung
Copy link
Member

Okay so they are still some sort of ID, but separate from the AllocId we see in the in-memory data structure.

@tmandry
Copy link
Member Author

tmandry commented Apr 25, 2024

After some fixes to the RUSTC_FILE_ENCODER_PANIC_AT_OFFSET implementation I'm closing this in favor of #124357, which has the correct backtrace. Sorry for the noise.

@tmandry tmandry closed this as completed Apr 25, 2024
@tmandry tmandry closed this as not planned Won't fix, can't repro, duplicate, stale Apr 25, 2024
@saethlin saethlin removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Apr 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-metadata Area: Crate metadata A-reproducibility Area: Reproducible / deterministic builds T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants