debug_loc info is (slightly) non-deterministic #45397

Closed
glandium opened this issue Oct 19, 2017 · 16 comments
Labels
A-debuginfo Area: Debugging information in compiled programs (DWARF, PDB, etc.)
A-reproducibility Area: Reproducible / deterministic builds
C-bug Category: This is a bug.
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@glandium
Contributor

I've been comparing Firefox builds on Mozilla CI with and without sccache, and after eliminating all the expected differences, one remained in the resulting binary that turned out to be caused by Rust code. That difference was in the build-id of libxul.so, as well as in the checksum in the gnu_debuglink section. Both are influenced by the contents of the debug sections. I repeated the comparison with two builds without sccache and got the same discrepancy.

Further analysis revealed that the root difference lies in the debug_loc data in gkrust-b23623c450cfcda2.0.o, and seems to be related to the _ZN5style10properties10LonghandId11parse_value17heed0466ee2fc256eE symbol (style::properties::LonghandId::parse_value). That function is generated by a Python script, but I verified that the generated source that produced the different object files was identical.

I can provide the two .o files I've been comparing, but each is 200MB (about 27MB when compressed with zstd), so I don't know where to put them.

The "slightly" in the bug summary is because, compared to the size of those files, the differences are rather small. I'll additionally note that dwarfdump doesn't like those .o files and fails with:

dwarfdump ERROR:  reference form with no valid local ref?!, offset=<0x00050377>:  DW_DLE_ATTR_FORM_OFFSET_BAD (119)

I'll do another comparison run with LTO disabled, which hopefully would produce smaller .o files.

Cc: @michaelwoerister @luser @froydnj
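For anyone trying to locate this kind of difference in their own builds, a first step might be to list the byte ranges where the two object files differ, then map those offsets to sections with a tool such as readelf. The following is a minimal sketch, not the method used in this report; `diff_ranges` and the file names in the usage note are hypothetical, and the byte-by-byte loop will be slow on files of this size.

```python
def diff_ranges(a: bytes, b: bytes):
    """Return half-open (start, end) byte ranges where a and b differ.

    Hypothetical helper for illustration. If the inputs have different
    lengths, the extra tail of the longer one is reported as a final range.
    """
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a differing run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        ranges.append((common, max(len(a), len(b))))
    return ranges
```

It could be used as `diff_ranges(open("run1.o", "rb").read(), open("run2.o", "rb").read())`, where `run1.o` and `run2.o` stand in for the two object files being compared.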

@est31
Member

est31 commented Oct 20, 2017

cc @infinity0
cc #34902

@glandium
Contributor Author

On a non-LTO build, I get differences in style-3a3501e81c38f395.0.o, which is 103MB. FWIW, I also get differences in multiple rust.metadata.bin files within some rlibs (libfutures, libgeckoservo, libgkrust_shared, libnserror and libstyle; all other rlibs are identical), but they don't matter for the Firefox build differences.

@glandium
Contributor Author

So interestingly, dwarfdump doesn't barf on the non-lto object file. The diff in dwarfdump output looks like:

11518c11518
<                             DW_AT_location              <loclist at offset 0x00407f1c with 1 entries follows>
---
>                             DW_AT_location              <loclist at offset 0x00407f1b with 1 entries follows>
11528c11528
<                               DW_AT_location              <loclist at offset 0x00407f3f with 1 entries follows>
---
>                               DW_AT_location              <loclist at offset 0x00407f3e with 1 entries follows>
11546c11546
<                             DW_AT_location              <loclist at offset 0x00407f62 with 1 entries follows>
---
>                             DW_AT_location              <loclist at offset 0x00407f61 with 1 entries follows>
11556c11556
<                               DW_AT_location              <loclist at offset 0x00407f87 with 1 entries follows>
---
>                               DW_AT_location              <loclist at offset 0x00407f86 with 1 entries follows>
11573c11573
<                             DW_AT_location              <loclist at offset 0x00407fac with 1 entries follows>
---
>                             DW_AT_location              <loclist at offset 0x00407fab with 1 entries follows>
11583c11583
<                               DW_AT_location              <loclist at offset 0x00407fd1 with 1 entries follows>
---
>                               DW_AT_location              <loclist at offset 0x00407fd0 with 1 entries follows>
11595c11595
<                             DW_AT_location              <loclist at offset 0x00407ff6 with 2 entries follows>
---
>                             DW_AT_location              <loclist at offset 0x00407ff5 with 2 entries follows>
11600c11600
<                             DW_AT_location              <loclist at offset 0x00408036 with 1 entries follows>
---
>                             DW_AT_location              <loclist at offset 0x00408035 with 1 entries follows>
11607c11607
<                               DW_AT_location              <loclist at offset 0x0040805e with 1 entries follows>
---
>                               DW_AT_location              <loclist at offset 0x0040805d with 1 entries follows>
11611c11611
<                               DW_AT_location              <loclist at offset 0x00408086 with 1 entries follows>
---
>                               DW_AT_location              <loclist at offset 0x00408085 with 1 entries follows>

There are a lot more of those, but if we ignore all the DW_AT_location differences, there is this:

@@ -3771924,27 +3771924,27 @@
                             DW_AT_type                  <0x00bb46c2>
 < 4><0x00942006>          DW_TAG_formal_parameter
                             DW_AT_name                  sink
                             DW_AT_decl_file             0x00000087 /builds/worker/workspace/build/src/servo/components/selectors/parser.rs
                             DW_AT_decl_line             0x00000001
                             DW_AT_type                  <0x00bb4578>
 < 4><0x00942011>          DW_TAG_lexical_block
 < 5><0x00942012>            DW_TAG_variable
-                              DW_AT_name                  input
+                              DW_AT_name                  parser
                               DW_AT_alignment             0x00000001
                               DW_AT_decl_file             0x00000087 /builds/worker/workspace/build/src/servo/components/selectors/parser.rs
                               DW_AT_decl_line             0x00000472
-                              DW_AT_type                  <0x00bb46c2>
+                              DW_AT_type                  <0x00bc01a2>
 < 5><0x0094201f>            DW_TAG_variable
-                              DW_AT_name                  parser
+                              DW_AT_name                  input
                               DW_AT_alignment             0x00000001
                               DW_AT_decl_file             0x00000087 /builds/worker/workspace/build/src/servo/components/selectors/parser.rs
                               DW_AT_decl_line             0x00000472
-                              DW_AT_type                  <0x00bc01a2>
+                              DW_AT_type                  <0x00bb46c2>
 < 5><0x0094202c>            DW_TAG_variable
                               DW_AT_name                  sink
                               DW_AT_alignment             0x00000001
                               DW_AT_decl_file             0x00000087 /builds/worker/workspace/build/src/servo/components/selectors/parser.rs
                               DW_AT_decl_line             0x00000472
                               DW_AT_type                  <0x00bb4578>
 < 5><0x00942039>            DW_TAG_lexical_block
 < 6><0x0094203a>              DW_TAG_variable

Then further down:

@@ -3777651,22 +3777651,22 @@
                        [ 1] range entry    0x0000077c 0x00000a33
                        [ 2] range entry    0x00000b36 0x00000b50
                        [ 3] range entry    0x0000139c 0x00001450
                        [ 4] range entry    0x000014df 0x0000155f
                        [ 5] range entry    0x00001683 0x000017d3
                        [ 6] range entry    0x0000227b 0x00002293
                        [ 7] range end      0x00000000 0x00000000
 <11><0x00945f2c>                        DW_TAG_variable
-                                          DW_AT_location              <loclist at offset 0x0036af3b with 1 entries follows>
-                       [ 0]< offset pair low-off : 0x0000077c addr  0x0000077c high-off  0x000022ac addr 0x000022ac>DW_OP_breg6-64
+                                          DW_AT_location              <loclist at offset 0x0036af3a with 1 entries follows>
+                       [ 0]< offset pair low-off : 0x0000077c addr  0x0000077c high-off  0x000022ac addr 0x000022ac>DW_OP_breg6-96
                                           DW_AT_abstract_origin       <0x00942012>
 <11><0x00945f35>                        DW_TAG_variable
                                           DW_AT_location              <loclist at offset 0x0036af5f with 1 entries follows>
-                       [ 0]< offset pair low-off : 0x0000077c addr  0x0000077c high-off  0x000022ac addr 0x000022ac>DW_OP_breg6-96
+                       [ 0]< offset pair low-off : 0x0000077c addr  0x0000077c high-off  0x000022ac addr 0x000022ac>DW_OP_breg6-64
                                           DW_AT_abstract_origin       <0x0094201f>
 <11><0x00945f3e>                        DW_TAG_variable
                                           DW_AT_abstract_origin       <0x0094202c>
 <11><0x00945f43>                        DW_TAG_inlined_subroutine
                                           DW_AT_abstract_origin       <0x007fcfd8>
                                           DW_AT_low_pc                0x000007a9
                                           DW_AT_high_pc               <offset-from-lowpc>35
                                           DW_AT_call_file             0x00000087 /builds/worker/workspace/build/src/servo/components/selectors/parser.rs

Then at some point there's also:

                ranges: 3 at .debug_ranges offset 359984 (0x00057e30) (48 bytes)
                        [ 0] range entry    0x00000622 0x00000880
                        [ 1] range entry    0x00000e2e 0x00000e3a
                        [ 2] range end      0x00000000 0x00000000
 <29><0x009d0a25>                                                            DW_TAG_variable
-                                                                              DW_AT_location              <loclist at offset 0x00153187 with 5 entries follows>
+                                                                              DW_AT_location              <loclist at offset 0x00153172 with 6 entries follows>
                        [ 0]< offset pair low-off : 0x00000609 addr  0x00000609 high-off  0x00000616 addr 0x00000616>DW_OP_reg17
-                       [ 1]< offset pair low-off : 0x0000061b addr  0x0000061b high-off  0x0000062d addr 0x0000062d>DW_OP_reg17
-                       [ 2]< offset pair low-off : 0x0000062d addr  0x0000062d high-off  0x0000062d addr 0x0000062d>DW_OP_breg6-64
-                       [ 3]< offset pair low-off : 0x0000062d addr  0x0000062d high-off  0x0000062d addr 0x0000062d>DW_OP_reg17
-                       [ 4]< offset pair low-off : 0x0000062d addr  0x0000062d high-off  0x00000e46 addr 0x00000e46>DW_OP_breg6-64
+                       [ 1]< offset pair low-off : 0x00000616 addr  0x00000616 high-off  0x0000061b addr 0x0000061b>DW_OP_breg6-64
+                       [ 2]< offset pair low-off : 0x0000061b addr  0x0000061b high-off  0x0000062d addr 0x0000062d>DW_OP_reg17
+                       [ 3]< offset pair low-off : 0x0000062d addr  0x0000062d high-off  0x0000062d addr 0x0000062d>DW_OP_breg6-64
+                       [ 4]< offset pair low-off : 0x0000062d addr  0x0000062d high-off  0x0000062d addr 0x0000062d>DW_OP_reg17
+                       [ 5]< offset pair low-off : 0x0000062d addr  0x0000062d high-off  0x00000e46 addr 0x00000e46>DW_OP_breg6-64
                                                                               DW_AT_abstract_origin       <0x008f60de>
 <29><0x009d0a2e>                                                            DW_TAG_lexical_block
                                                                               DW_AT_ranges                0x00057e00

which is actually in .debug_ranges, not .debug_loc.

And

 < 5><0x00a3953b>            DW_TAG_lexical_block
                               DW_AT_low_pc                0x00000025
                               DW_AT_high_pc               <offset-from-lowpc>5736
 < 6><0x00a39548>              DW_TAG_variable
-                                DW_AT_location              <loclist at offset 0x00007053 with 9 entries follows>
+                                DW_AT_location              <loclist at offset 0x00007053 with 8 entries follows>
                        [ 0]< offset pair low-off : 0x00000014 addr  0x00000014 high-off  0x00000017 addr 0x00000017>DW_OP_reg4
                        [ 1]< offset pair low-off : 0x00000017 addr  0x00000017 high-off  0x00000065 addr 0x00000065>DW_OP_reg3
                        [ 2]< offset pair low-off : 0x000002b7 addr  0x000002b7 high-off  0x000002be addr 0x000002be>DW_OP_reg0
                        [ 3]< offset pair low-off : 0x000002be addr  0x000002be high-off  0x000004ac addr 0x000004ac>DW_OP_breg6-280
                        [ 4]< offset pair low-off : 0x000004ac addr  0x000004ac high-off  0x000004bb addr 0x000004bb>DW_OP_reg0
                        [ 5]< offset pair low-off : 0x000006b6 addr  0x000006b6 high-off  0x000006bd addr 0x000006bd>DW_OP_reg0
-                       [ 6]< offset pair low-off : 0x000006bd addr  0x000006bd high-off  0x00000a18 addr 0x00000a18>DW_OP_breg6-488
-                       [ 7]< offset pair low-off : 0x00000a18 addr  0x00000a18 high-off  0x00000be6 addr 0x00000be6>DW_OP_breg6-280
-                       [ 8]< offset pair low-off : 0x00000be6 addr  0x00000be6 high-off  0x0000168d addr 0x0000168d>DW_OP_breg6-488
+                       [ 6]< offset pair low-off : 0x000006bd addr  0x000006bd high-off  0x00000719 addr 0x00000719>DW_OP_breg6-488
+                       [ 7]< offset pair low-off : 0x00000719 addr  0x00000719 high-off  0x0000168d addr 0x0000168d>DW_OP_breg6-280
                                 DW_AT_name                  self
                                 DW_AT_alignment             0x00000001
                                 DW_AT_decl_file             0x0000005c /builds/worker/workspace/build/src/third_party/rust/cssparser/src/rules_and_declarations.rs
                                 DW_AT_decl_line             0x000000eb
                                 DW_AT_type                  <0x00bb479d>
 < 6><0x00a39558>              DW_TAG_inlined_subroutine
                                 DW_AT_abstract_origin       <0x00a09539>
                                 DW_AT_ranges                0x00002340
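The step of ignoring the DW_AT_location offset-only differences could be automated roughly as follows. This is a sketch under assumptions: the regular expression is modelled on the `<loclist at offset 0x... with N entries follows>` lines quoted above, and `normalize` and `filtered_diff` are hypothetical names. The entry count is deliberately kept, so a list that gains or loses entries still shows up in the diff.

```python
import difflib
import re

# Matches the loclist header format seen in the dwarfdump output above.
LOCLIST_OFFSET = re.compile(
    r"(<loclist at offset )0x[0-9a-fA-F]+( with \d+ entries follows>)"
)


def normalize(line):
    """Mask the loclist offset so offset-only changes compare equal."""
    return LOCLIST_OFFSET.sub(r"\g<1>0xXXXXXXXX\g<2>", line)


def filtered_diff(old_text, new_text):
    """Unified diff of two dwarfdump outputs with loclist offsets masked."""
    old = [normalize(line) for line in old_text.splitlines()]
    new = [normalize(line) for line in new_text.splitlines()]
    return list(difflib.unified_diff(old, new, lineterm="", n=0))
```

Feeding it the text output of dwarfdump for each object file should leave only differences such as the swapped variable names and the changed location expressions.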

@kennytm kennytm added the A-debuginfo Area: Debugging information in compiled programs (DWARF, PDB, etc.) label Oct 20, 2017
@michaelwoerister
Member

cc @tromey

@infinity0
Contributor

infinity0 commented Oct 20, 2017

@glandium Did you build in the same build path, or try giving -Zremap-path-prefix-{from,to}? That should eliminate some differences. There might have been some further reproducibility regressions between 1.18 and 1.20, but I haven't had a chance to investigate yet; a test case would be good eventually too.

edit: looks like from one of your traces that you did build in the same build path, so I suppose the differences are due to something else.

@glandium
Contributor Author

Same paths, same version of rust, same version of gcc, same version of everything. Essentially, I triggered the same Firefox CI build twice and compared the output.

@XAMPPRocky XAMPPRocky added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-bug Category: This is a bug. labels Feb 12, 2018
@jonas-schievink jonas-schievink added the A-reproducibility Area: Reproducible / deterministic builds label Aug 19, 2019
@fangism

fangism commented Oct 21, 2021

Today, I also observe non-repeatability in the .debug_loc section of rlibs.
Test: run once, backup outputs, run again, compare.

The example I was debugging happened to be https://crates.io/crates/miniz_oxide
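The "run once, backup outputs, run again, compare" test can be sketched as a pair of helpers that hash every file in an output directory and report which hashes changed between runs. This is an illustrative sketch, not the script used here; `snapshot` and `changed_files` are hypothetical names, and how the build is invoked and cleaned between runs is left to the caller.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before, after):
    """Return the sorted paths that were added, removed, or have a new hash."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

A typical use would be to build, call `snapshot` on the output directory, clean and rebuild, call `snapshot` again, and then inspect `changed_files(first, second)`; an empty list means the outputs were byte-for-byte identical.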

@fangism

fangism commented Oct 21, 2021

And here we see two pairs of symbols in different orders, from two runs.

[image: tkdiff-dwarfdump-1]
[image: tkdiff-dwarfdump-2]

@infinity0
Contributor

might be related to #89911; you could try with rust 1.55 / llvm 12 and see if it reproduces?

@fangism

fangism commented Oct 25, 2021

> might be related to #89911; you could try with rust 1.55 / llvm 12 and see if it reproduces?

I tried with rustc-1.55 and clang/llvm-11.1.0 (which is what I had handy), and the result is perfectly deterministic and reproducible (experimented 20 times), no differences related to debug info.

@fangism

fangism commented Oct 25, 2021

I ran the repeatability experiment on the libminiz_oxide crate with --emit=llvm-ir using rustc 1.58.0-nightly (1af55d19c 2021-10-19), and found that the .ll output is perfectly deterministic. Does this mean the culprit can be narrowed down to LLVM (codegen) itself?

clang version 14.0.0 (https://llvm.googlesource.com/a/llvm-project e9b1c974be272ca51800cff2cd561f9e53eb127e)

@Fuuzetsu

At least based on #90301, we can easily blame the LLVM update for some change.

@yanok
Contributor

yanok commented Nov 9, 2021

I have a fix in #90301. It would be nice to verify that it also fixes this bug.

@yanok
Contributor

yanok commented Nov 10, 2021

@fangism Your problem is likely fixed now (well, it will be in the next nightly build, otherwise you have to build rustc yourself), but please confirm.

@glandium Your original report predates the bug in LLVM I fixed, so most likely you will still see a problem. Could you make a reproducible case please?

@glandium
Contributor Author

My original bug is probably long gone, as rust hasn't been a problem for Firefox reproducibility for a while (although we also now do cross-language LTO, so the real compilation is handled by clang's llvm).

@yanok
Contributor

yanok commented Nov 11, 2021

So, should we close this bug then?

10 participants