Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustc: build with system llvm and jemalloc #49557

Merged
merged 2 commits into from
Nov 12, 2018

Conversation

symphorien
Copy link
Member

Motivation for this change

We can use our llvm instead of the bundled one, just like fedora does: https://src.fedoraproject.org/rpms/rust/blob/master/f/rust.spec

The closure size goes down a little: 780MB -> 740MB, but now part of this closure is llvm and can be shared. This also benefits compilation time.

The commit also includes some cleanup of tests which no longer exist and don't need to be removed.

I tested compilation of alacritty, firefox, thunderbird and ripgrep on x86_64-linux. (on top of master, 9f88282)

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Fits CONTRIBUTING.md.

@@ -29,7 +29,7 @@ in rec {
./patches/disable-test-inherit-env.patch
];

forceBundledLLVM = true;
withBundledLLVM = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LnL7 @dtzWill anyone who does remember why we switched away from this option? I thing this option was true at some point.

Copy link
Member

@Mic92 Mic92 Nov 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It broke the tests back then 7bd191d
It could be still ok though:

Since you deal with rust backports sometimes for firefox what would you prefer @andir ?
I guess in the worst case it should be still possible to just re-enable the bundled version in backported version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really have no strong opinion here. I would certainly like to avoid a situation where we have to port LLVM whenever we have to back port rustc for Thunderbird/Firefox.

How much is the speedup of compilation time? If it is significant I could see this being a good thing in general.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compiling rustc with the bundled llvm takes about 1h05 for llvm and 55 min for rustc. So this halves compilation time.
Note that fedora disables codegen tests when using the bundled llvm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment we do not run tests anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is everything from their llvm fork upstream now? That was the main reason for this IIRC.

Copy link
Member

@Mic92 Mic92 Nov 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They only seem to merge pull requests from changes that have been before in llvm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, I tried to split this out into 2 separate derivations at some point but not rebuilding llvm is even better.

@dtzWill
Copy link
Member

dtzWill commented Nov 1, 2018 via email

@Mic92
Copy link
Member

Mic92 commented Nov 3, 2018

I will test the aarch64 build. I also would like to see macOS tested.

@Mic92
Copy link
Member

Mic92 commented Nov 4, 2018

@periklis It would be interesting to see if this pull request would solve #49708 in a more general way on macOS (or at least work around).

@periklis
Copy link
Contributor

periklis commented Nov 5, 2018

@Mic92 I started a build on my darwin builder. I will report the results when done

@Mic92
Copy link
Member

Mic92 commented Nov 6, 2018

@danieldk maybe you can also have look at this w.r.t. to macOS.

@danieldk
Copy link
Contributor

danieldk commented Nov 7, 2018

I'll give it a shot, my MacBook is chugging on the build now.

@danieldk
Copy link
Contributor

danieldk commented Nov 7, 2018

@symphorien
Copy link
Member Author

Can you try to remove the stdenv.isLinux && guard on line 45 then ? This should solve ___cxa_end_catch at least and a bunch of others.

@danieldk
Copy link
Contributor

danieldk commented Nov 7, 2018

I have added

++ optional (stdenv.isDarwin && !withBundledLLVM) "-lc++"

Since the C++ standard library is libc++ on Darwin and the --push-state and --as-needed options are not known by LLVM ld (?). Now linking fails on missing jemalloc symbols, which is interesting, since configure reports that it finds jemalloc.

@symphorien
Copy link
Member Author

Yes --push-state is a GNU extension, I forgot.

@danieldk
Copy link
Contributor

danieldk commented Nov 7, 2018

Does the removed comment about jemalloc prefixes from postPatch still apply?

$ nm /nix/store/42lsjkifndxxzn4i8c3iyl0rijrsan0v-jemalloc-5.1.0/lib/libjemalloc.dylib  | grep calloc
00000000000045a0 T _calloc
00000000000023b0 T _je_bootstrap_calloc
000000000004a4b0 t _zone_calloc

@symphorien
Copy link
Member Author

Ah maybe, I removed it because it was not needed on linux.

@danieldk
Copy link
Contributor

danieldk commented Nov 7, 2018

That was indeed it, it finished the stage0 compiler, will keep you posted.

@danieldk
Copy link
Contributor

danieldk commented Nov 7, 2018

Here is the modified diff that compiles on Darwin:

https://gist.github.com/danieldk/127bf63c7ebcff0d7b0e9c21d482dd04

Changes:

  • Add linker flag -lc++ to NIX_LDFLAGS on Darwin.
  • In postPatch, remove je_ prefix in jemalloc symbol names on Darwin.

Tested:

  • nix-index derivation: succeeds
  • fd derivation: fails in unit tests with:
    error: process didn't exit successfully: `/private/tmp/nix-build-fd-7.2.0.drv-0/source/target/debug/deps/tests-090b3e427c9f00ad` (signal: 11, SIGSEGV: invalid memory reference)
    
  • Some of my own crates (so far, so good).

This seems to solve the failing doctest problem.

@danieldk
Copy link
Contributor

danieldk commented Nov 7, 2018

The segmentation fault happens in free():

* thread #2, name = 'test_absolute_path', stop reason = EXC_BAD_ACCESS (code=1, address=0x190178c63c)
  * frame #0: 0x00000001000c6b0b tests-090b3e427c9f00ad`free + 347

Full stack trace: https://gist.github.com/danieldk/5ee20615aa7ad61cba2a460ea5d08ac7

@Mic92
Copy link
Member

Mic92 commented Nov 7, 2018

So it crashes in jemalloc for some reason. What I am surprised about is that I do not see any jemalloc dependencies in the homebrew build: https://github.com/Homebrew/homebrew-core/blob/76dbba870f4e74d76392d992d96bd044daf435a9/Formula/rust.rb

Their jemalloc also comes without a prefix: https://github.com/Homebrew/homebrew-core/blob/master/Formula/jemalloc.rb#L26

Do they use the system allocator on macos? I don't see any vendoring done by rust regarding jemalloc.

@alyssais do you know if they install a jemalloc by default in homebrew?

@symphorien
Copy link
Member Author

There is a bundled jemalloc, which I remove to reduce compilation time. This was commented out before the PR. So maybe we should stick to the bundled jemalloc on darwin for the two or so upcoming rustc versions.
Afterwards, the build system will change wrt jemalloc because of rust-lang/rust#55238 . At this point we may reevaluate unbundling jemalloc on darwin.

@danieldk
Copy link
Contributor

danieldk commented Nov 7, 2018

@Mic92 I think Rust imports jemalloc itself through jemalloc_sys.

I am now rebuilding jemalloc and rustc with the je_ prefixes to see if that helps (I guess there are symbol clashes between C [mc]alloc/free and jemalloc's equivalents without the prefix).

@danieldk
Copy link
Contributor

danieldk commented Nov 7, 2018

The fd unit tests pass without any segfaults when jemalloc is compiled with the je_ prefix for jemalloc functions:

diff --git a/pkgs/development/libraries/jemalloc/common.nix b/pkgs/development/libraries/jemalloc/common.nix
index 593f4411f19..6cff824061e 100644
--- a/pkgs/development/libraries/jemalloc/common.nix
+++ b/pkgs/development/libraries/jemalloc/common.nix
@@ -12,7 +12,7 @@ stdenv.mkDerivation (rec {
   # By default, jemalloc puts a je_ prefix onto all its symbols on OSX, which
   # then stops downstream builds (mariadb in particular) from detecting it. This
   # option should remove the prefix and give us a working jemalloc.
-  configureFlags = stdenv.lib.optional stdenv.isDarwin "--with-jemalloc-prefix=";
+  #configureFlags = stdenv.lib.optional stdenv.isDarwin "--with-jemalloc-prefix=";
   doCheck = true;
 
   enableParallelBuilding = true;

From the jemalloc documentation:

By default, the prefix is "", except on OS X, where it is "je_". On OS X,
jemalloc overlays the default malloc zone, but makes no attempt to actually
replace the "malloc", "calloc", etc. symbols.

Just like fedora does: https://src.fedoraproject.org/rpms/rust/blob/master/f/rust.spec

Also some cleanup of tests which were removed but no longer exist.
@GrahamcOfBorg GrahamcOfBorg added the 2.status: merge conflict This PR has merge conflicts with the target branch label Nov 8, 2018
jemalloc with stripped prefix would cause segfaults in free:
NixOS#49557 (comment)

Thanks to @danieldk for darwin testing/debugging.
@symphorien
Copy link
Member Author

Thanks @danieldk !
I rebased on a newer staging and added a commit taking into account what you found:

  • add -lc++ to NIX_LDFLAGS
  • make jemalloc parametric to strip or not the je_ prefix and forcibly use the prefixed version on rustc

This new commit does not change hashes on linux.

@GrahamcOfBorg GrahamcOfBorg removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Nov 8, 2018
@danieldk
Copy link
Contributor

danieldk commented Nov 8, 2018

Thanks! I'll do another test run.

@danieldk
Copy link
Contributor

danieldk commented Nov 8, 2018

Everything I tested before works well 👍 .

@andir
Copy link
Member

andir commented Nov 8, 2018

I am running a rebuild of this on x86_64-linux. It sounds really promising! :)

@alyssais
Copy link
Member

alyssais commented Nov 9, 2018

@alyssais do you know if they install a jemalloc by default in homebrew?

Homebrew doesn't install anything "by default". The only things that'll get a Homebrew jemalloc are formulae that have depends_on "jemalloc" (brew uses jemalloc).

I don't see anything Rust or LLVM-related.

@Mic92
Copy link
Member

Mic92 commented Nov 9, 2018

aarch64 build is scheduled.

@Mic92
Copy link
Member

Mic92 commented Nov 9, 2018

aarch64 build has been rescheduled (the community builder was rebooted)

@Mic92
Copy link
Member

Mic92 commented Nov 12, 2018

aarch64 successfully builded fd

@Mic92 Mic92 merged commit 3fd80c6 into NixOS:staging Nov 12, 2018
@periklis
Copy link
Contributor

@Mic92 can we enable tests for nix-index, alacritty and bat again?

@periklis
Copy link
Contributor

ic bat is enabled in #50236

@alyssais
Copy link
Member

can we enable tests for nix-index, alacritty and bat again?

Only in staging, presumably.

@symphorien symphorien mentioned this pull request Jan 20, 2019
10 tasks
andir pushed a commit to andir/nixpkgs that referenced this pull request Jan 23, 2019
jemalloc with stripped prefix would cause segfaults in free:
NixOS#49557 (comment)

This commit has been adjusted from the one on master to only carry the
relevant changes (the new optional `stripPrefix`) flag while still
keeping the other flags that were removed on master since they are no
loner supported by the version used there.

(cherry picked from commit 973eca7)
@symphorien symphorien deleted the rust-system-libs branch May 18, 2019 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants