gcc's lto breaks rust's jemalloc/llvm interfaces #30178

Closed
MagaTailor opened this issue Dec 2, 2015 · 26 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries

Comments

@MagaTailor

As an experiment I had -flto in my CFLAGS while building rustc, which didn't cause any problems during the build itself.
(Dec 2 nightly, i686-linux, gcc 4.9.3, local llvm 3.6)

It seems the compiler is fully functional but still I'm getting segfaults like this (in other build commands too):

 Compiling gl v0.0.12
failed to run custom build command for `gl v0.0.12`
Process didn't exit successfully: `/home/petevine/unpacked/rust-snake-master/target/release/build/gl-a90cb7fe89544061/build-script-build` (signal: 11)

Program received signal SIGSEGV, Segmentation fault.
0x8007d629 in je_mallocx ()
(gdb) bt
#0  0x8007d629 in je_mallocx ()
#1  0x800890f6 in __rust_allocate ()
#2  0x00000000 in ?? ()

If it's not a good idea to use lto flags at build-time, configure should definitely filter them out or at least warn the user.
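A minimal sketch of what such a filter could look like, assuming a POSIX shell. The function name `strip_lto_flags` is hypothetical and is not part of rustc's configure script; it only illustrates the "filter or warn" idea.

```shell
#!/bin/sh
# Hypothetical helper, not part of rustc's configure: drop LTO-related
# options from a flag string and warn on stderr when any were removed.
strip_lto_flags() {
    kept=""
    dropped=""
    # Intentionally unquoted so the flag string is split on whitespace.
    for flag in $1; do
        case "$flag" in
            -flto|-flto=*|-flto-*|-ffat-lto-objects) dropped="$dropped $flag" ;;
            *) kept="$kept $flag" ;;
        esac
    done
    if [ -n "$dropped" ]; then
        echo "warning: ignoring LTO flags:$dropped" >&2
    fi
    # Trim the leading space added by the loop.
    echo "${kept# }"
}

# Example use: sanitize whatever CFLAGS the user passed in.
CFLAGS=$(strip_lto_flags "${CFLAGS:-}")
```

Called with `"-O2 -flto -g"`, the function prints `-O2 -g` and emits one warning on stderr. A warn-only variant would simply echo the original string back unchanged.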

EDIT:
This issue is not about artifacts as it can be reproduced with empty .cargo and fresh project directories.

@MagaTailor
Author

After repeating the whole procedure with jemalloc disabled, the resulting compiler generates fully working code, meaning it was only jemalloc that got miscompiled with lto.

@MagaTailor changed the title from "Rustc built with gcc's lto causing runtime jemalloc segfaults" to "gcc's lto miscompiles jemalloc" on Dec 4, 2015

@sfackler
Member

sfackler commented Dec 4, 2015

This seems like a bug that should be filed on jemalloc?

@MagaTailor
Author

Probably yes; meanwhile, what about rust's build system? The compiler builds without the slightest problem and even generates a working hello world! (It also manages to build some crates.)

@MagaTailor
Author

The jemalloc maintainer didn't give a definitive answer to my last question, but it's almost certain now that the problem isn't in jemalloc per se. Back to you guys :)

@pnkfelix
Member

I suspect you're going to have issues with people saying "there's nothing actionable here" on both the jemalloc side and the Rust side...

@MagaTailor
Author

Let's recap:

  • jemalloc compiled separately with -flto passes all tests
  • disabling jemalloc allows -flto
  • not using -flto is necessary for a fully functional compiler w/jemalloc enabled.

At least that was the case two weeks ago.

@MagaTailor
Author

After the recent C purge I can no longer reproduce.
(or it could have been the -flto-partition=none flag I used this time)

@MagaTailor
Author

Well no, still happens sometimes:

failed to run custom build command for `kernel32-sys v0.2.1`
Process didn't exit successfully: `/home/petevine/exa-master/target/release/build/kernel32-sys-49d5fc958bd6fccd/build-script-build` (signal: 11)

or

failed to run custom build command for `openssl-sys v0.7.1`
Process didn't exit successfully: `/home/petevine/exa-master/target/release/build/openssl-sys-3f949b2c9b71dbaa/build-script-build` (signal: 11)

which yields the same trace:

Program received signal SIGSEGV, Segmentation fault.
0x800167ee in je_malloc_tsd_boot0.part ()
(gdb) bt
#0  0x800167ee in je_malloc_tsd_boot0.part ()
#1  0x80016657 in je_malloc_tsd_boot0 ()
#2  0x80039975 in malloc_init_hard.lto_priv ()
#3  0x800383fe in jemalloc_constructor ()
#4  0x8004f4f6 in __libc_csu_init ()
#5  0xb7dec237 in __libc_start_main () from /lib/libc.so.6
#6  0x8000aa81 in _start () at ../sysdeps/i386/start.S:102

@MagaTailor reopened this on Dec 25, 2015

@pnkfelix
Member

Can you provide a full transcript of the steps you are following to exhibit this behavior? It will probably need both your configure invocation (for rustc) and whatever remaining steps you used to exhibit the bug.

Independently, since I am inferring that these stack traces are coming from invocations of your locally built rustc, maybe you should run make check on your local build rather than just "hello world", in the hopes that we find a smaller test input that reproduces the problem.

@MagaTailor
Author

No, I'd already stated clearly it was all about CFLAGS (-flto) when building rustc. So it should either reproduce anywhere or it's a bug in gcc. The last part about hello world is rather funny as the gdb traces were coming from regular projects' cargo builds.

@pnkfelix
Member

@petevine I do not understand your response to my suggestion. You have reopened the issue here, so I infer that you do not actually believe this to be a bug in gcc. But continuing to do full cargo runs, rather than first double-checking against the Rust test suite, suggests you don't think the problem can be narrowed down?

@pnkfelix
Member

Oh I see, you really are not building rustc locally at all?

@MagaTailor
Author

I was, how else would I be testing a rustc compiled with those flags?

Rustc is able to compile crates just fine (and bootstrap itself) but the crashes happen in the build-script-build invocations. I doubt make check would yield any new info and I won't be doing another build for a while anyway.

@pnkfelix
Member

I am just trying to understand your comments and infer why you would not run make check

@MagaTailor
Author

I didn't keep the build directory and I'm not going to devote the old i686 hardware I rarely use to just building/testing rust. Use a buildbot, Luke!

@MagaTailor changed the title from "gcc's lto miscompiles jemalloc" to "gcc's lto breaks rust's jemalloc interface" on Jan 2, 2016

@MagaTailor
Author

After overcoming issue #30688 I was finally able to build rustc on arm (gcc 4.9, local llvm 3.6), and guess what @pnkfelix, make check is completely "useless" just as I'd predicted, yet here we are trying to use a newly-built bfc binary:

gdb --args target/release/bfc sample_programs/hello_world.bf

Program received signal SIGSEGV, Segmentation fault.
0x7f642784 in thread_rng::h22bece718b8c83adkgf ()
(gdb) bt
#0  0x7f642784 in thread_rng::h22bece718b8c83adkgf ()
#1  0x7f6422dc in util::tmpname::h63004644db7838bePja ()
#2  0x7f641a00 in named::NamedTempFile::new::hfa7543c1e647f61ecqa ()
#3  0x7f61c308 in compile_file::h04704d1240bcd17a3Fc ()
#4  0x7f62077c in main::h68a433d56cae34167Pc ()
#5  0x7f65ab4c in panic::recover::h13621087687085827578 ()
#6  0x7f65a68c in rt::lang_start::h5fc8517878d759f3Dky ()
#7  0xb6d61632 in __libc_start_main (main=0x7f621dbc <main>, argc=2, argv=0xbeffef74, init=<optimized out>, fini=0x8004b0f9 <__libc_csu_fini>, 
    rtld_fini=0xb6fea4c5 <_dl_fini>, stack_end=0xbeffef74) at libc-start.c:287
#8  0x7f61a8d8 in _start ()

Note the absence of anything related to jemalloc meaning other interfaces can be affected too.

@MagaTailor changed the title from "gcc's lto breaks rust's jemalloc interface" to "gcc's lto breaks rust's jemalloc/llvm interfaces" on Jan 4, 2016

@pnkfelix
Member

pnkfelix commented Jan 5, 2016

@petevine did you capture a transcript during any of these runs? It might help others who are trying to properly reproduce the problem

@MagaTailor
Author

You mean a build log or something else?

@pnkfelix
Member

pnkfelix commented Jan 5, 2016

@petevine something with the sequence of commands, starting from your configure invocation (and I guess the result of invoking env).

The output from the invocations is good to include too (which would indeed make it a build log). But the most important thing is to document all the environment options and configure switches.
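One way to capture such a transcript is sketched below, assuming a POSIX shell. The `log_run` helper and the `build-transcript.log` file name are hypothetical; the idea is only to record each command line, its output, and its exit status in one file that can be pasted into a gist.

```shell
#!/bin/sh
# Hypothetical helper: append a command line, its combined stdout/stderr,
# and its exit status to a log file.
LOG="${LOG:-build-transcript.log}"

log_run() {
    printf '$ %s\n' "$*" >> "$LOG"
    status=0
    "$@" >> "$LOG" 2>&1 || status=$?
    printf '[exit status: %s]\n' "$status" >> "$LOG"
    return "$status"
}

# Example session, commented out so nothing runs by accident.
# The flags mirror what was reported in this thread; adjust as needed.
# export CFLAGS="-flto"
# log_run env
# log_run gcc --version
# log_run ./configure
# log_run make
```

Recording `env` first documents the environment options, and each later step is logged in the order it was run.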

@pnkfelix
Member

pnkfelix commented Jan 5, 2016

@petevine (FYI, in case you were wondering: people often put such transcripts into a gist and then link to the gist from a comment, so that the comment thread does not get inundated with the wall of text that such a log can take up. That way there is also no problem with including the output from the command invocations, which is usually easier to grab via cut-and-paste...)

@MagaTailor
Author

Indeed - I have a little suspicion it could be related to linking against local llvm. The funny thing about this latest arm build is a few other binaries created with it seem to work flawlessly. In this case it must be really subtle.

I propose setting up a travis job that would build everything (llvm!) from scratch using -flto -ffat-lto-objects, preferably with gcc 5.3 or newer. We might immediately know if there's any use in further pursuing this.

@huonw added the A-build label on Jan 6, 2016

@pnkfelix
Member

pnkfelix commented Jan 6, 2016

Any chance of relation to #28066 ?

Perhaps not, since portions of this bug are gcc-specific, but I thought it best to cross-reference the two.

@pnkfelix
Member

see also #26541 which may or may not be related.

@pnkfelix added the A-linkage (Area: linking into static, shared libraries and binaries) label on Jan 12, 2016

@MagaTailor
Author

Even though llvm's LTO is disabled during rust builds, it's rather probable a similar problem could be exposed by LTO in either toolchain.

About the ARM build that almost works - the segfaulting binary is different from all the others in that it calls the system llvm tools. Does that give you any ideas?

EDIT:
What if the crash is unrelated and I'd actually found a new bug? Bingo! But it's still related to jemalloc.

The unofficial arm build from yesterday produces the crashing binary as well (I never build with jemalloc on ARM so my own build is free of this issue) which is why it was easy to blame -flto originally. Not on ARM it seems!

@MagaTailor
Author

There's an additional adverse effect of using LTO - rustc linked with a newer gcc >= 5 doesn't work on a system with gcc 4.9 due to link errors:

Compiling minifb v0.2.7
error: linking with `cc` failed: exit code: 1
note: lto1: fatal error: bytecode stream generated with LTO version 4.0 instead of the expected 3.0
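
The mismatch reported in that message can be pulled out of a build log mechanically. This is a sketch assuming a POSIX shell and `sed`; the function name `lto_version_mismatch` is hypothetical, and the pattern is based only on the wording of the error line quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read build output on stdin and print
# "<generated> <expected>" for each lto1 bytecode version mismatch found.
lto_version_mismatch() {
    sed -n 's/.*generated with LTO version \([0-9.]*\) instead of the expected \([0-9.]*\).*/\1 \2/p'
}

# Example (commented out): scan a saved build log.
# lto_version_mismatch < build.log
```

Fed the line above, it prints `4.0 3.0`: the objects carry bytecode version 4.0 while the gcc on the system expects 3.0.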

@MagaTailor
Author

The downgraded jemalloc seems to have fixed this issue on x86 as well.

4 participants