llvm: build with PGO #79454

carlocab · 2021-06-16T16:34:55Z

Have you followed the guidelines for contributing?
Have you ensured that your commits follow the commit style guide?
Have you checked that there aren't other open pull requests for the same formula update/change?
Have you built your formula locally with brew install --build-from-source <formula>, where <formula> is the name of the formula you're submitting?
Is your test running fine brew test <formula>, where <formula> is the name of the formula you're submitting?
Does your build pass brew audit --strict <formula> (after doing brew install --build-from-source <formula>)? If this is a new formula, does it pass brew audit --new <formula>?

This builds LLVM with profile-guided optimisations. I've adapted the
build from upstream documentation. [1, 2]

This does mean that builds will now take 2-3x longer than before. LLVM
currently takes about four hours to build at most (including recursive
dependents), so the additional time isn't as bad as some other formulae
that require long CI runs. Moreover, the collective time this saves
users of the LLVM formula should make the additional build time worth
it.

Enabling LTO is another potential optimisation that I haven't enabled
here. This appears to be enabled in Apple's default build. [3] One
reason enabling this is complicated is that enabling LTO generates
bitcode instead of object files, which libtool will refuse to archive
into a static library. It might be possible to build all the static
libraries first and then build optimised binaries later with LTO, but
I'll leave that for another PR (if at all).

See #77975.

[1] https://llvm.org/docs/HowToBuildWithPGO.html#building-clang-with-pgo
[2] https://github.com/llvm/llvm-project/blob/33ba8bd2/llvm/utils/collect_and_build_with_pgo.py
[3] https://github.com/apple/llvm-project/blob/swift-5.4-RELEASE/clang/cmake/caches/Apple-stage2.cmake#L30

carlocab · 2021-06-16T19:09:30Z

So it looks like the build fails everywhere except where I tested it. Lovely. 😅

carlocab · 2021-06-24T23:13:55Z

Formula/llvm.rb

+      # Our just-built Clang needs a little help finding C++ headers
+      on_macos do
+        if MacOS.version <= :catalina && sdk
+          cxxflags << "-isystem#{MacOS::CLT::PKG_PATH}/usr/include/c++/v1"


This is needed because, on Catalina or earlier, the atomic and type_traits headers are not in the SDK, but at /Library/Developer/CommandLineTools/usr/include/c++/v1. We see this on Mojave:

Mojave:CommandLineTools brew$ pwd /Library/Developer/CommandLineTools Mojave:CommandLineTools brew$ find . -name 'atomic' -print ./usr/include/c++/v1/atomic Mojave:CommandLineTools brew$ find . -name 'type_traits' -print ./usr/include/c++/v1/experimental/type_traits ./usr/include/c++/v1/type_traits

On the other hand, on Big Sur:

❯ pwd /Library/Developer/CommandLineTools ❯ fd '^atomic$' usr/include/c++/v1/atomic SDKs/MacOSX11.3.sdk/usr/include/c++/v1/atomic ❯ fd '^type_traits$' SDKs/MacOSX11.3.sdk/usr/include/c++/v1/experimental/type_traits SDKs/MacOSX11.3.sdk/usr/include/c++/v1/type_traits usr/include/c++/v1/type_traits usr/include/c++/v1/experimental/type_traits

carlocab · 2021-06-25T01:17:14Z

Formula/llvm.rb

    ]
    runtimes = %w[
      compiler-rt
      libcxx
      libcxxabi
      libunwind
+      openmp


Couldn't get openmp to build on ARM while in the projects array. It doesn't seem like this should have negative consequences on macOS... We may want to keep it in the projects array on Linux.

Actually, never mind. It shouldn't matter on Linux either, as it gets built with Clang whether it's in projects or runtimes. It's just a difference of which Clang you use to build it: the throwaway one, or the one that eventually gets installed into the prefix.

I suspect the change is due to my limited choice of LLVM_TARGETS_TO_BUILD for bootstrapping since the build fails with a complaint about missing symbols for arm64 on ARM.

In particular, it looks like the openmp build needs to generate non-native code, so it doesn't quite work when you have LLVM_TARGETS_TO_BUILD=Native. This is why building it with the compiler we plan to install works.

carlocab · 2021-06-25T13:29:20Z

Formula/llvm.rb

+        instrumented_cflags = cflags + ["-Xclang -mllvm -Xclang -vp-counters-per-site=6"]
+        instrumented_cxxflags = cxxflags + ["-Xclang -mllvm -Xclang -vp-counters-per-site=6"]


Without this, the build produces many warnings of the form

LLVM Profile Warning: Unable to track new values: Running out of static counters. Consider using option -mllvm -vp-counters-per-site=<n> to allocate more value profile counters at compile time.

It is my understanding that when this occurs there is no benefit to generating additional profile data.

carlocab · 2021-06-26T20:04:30Z

An additional benchmark at building clang from

❯ git rev-parse HEAD
4506f614cb6983a16d117cf77a968608e66d7a5c

Using Apple Clang:

❯ cmake ../llvm -DCMAKE_C_COMPILER=/usr/bin/clang -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DLLVM_TARGETS_TO_BUILD=Native -DLLVM_ENABLE_PROJECTS=clang
[snip]
❯ time ( cmake --build . --target clang --parallel 5; )
[snip]
( cmake --build . --target clang --parallel 5; )  15590.55s user 922.92s system 301% cpu 1:31:18.66 total

Using bottled LLVM Clang:

❯ cmake ../llvm -DCMAKE_C_COMPILER=/Users/carlocab/homebrew/bin/clang -DCMAKE_CXX_COMPILER=/Users/carlocab/homebrew/bin/clang++ -DLLVM_TARGETS_TO_BUILD=Native -DLLVM_ENABLE_PROJECTS=clang
[snip]
❯ time ( cmake --build . --target clang --parallel 5; )
[snip]
( cmake --build . --target clang --parallel 5; )  19035.14s user 921.20s system 295% cpu 1:52:31.98 total

Using LLVM with PGO:

❯ cmake ../llvm -DCMAKE_C_COMPILER=$(brew --prefix llvm)/bin/clang -DCMAKE_CXX_COMPILER=$(brew --prefix llvm)/bin/clang++ -DLLVM_TARGETS_TO_BUILD=Native -DLLVM_ENABLE_PROJECTS=clang
[snip]
❯ time ( cmake --build . --target clang --parallel 5; )
[snip]
( cmake --build . --target clang --parallel 5; )  15366.74s user 886.66s system 292% cpu 1:32:37.65 total

The PGO build is still a bit slower than Apple's build, but I think that's because Apple enables LTO, which is a bit more challenging to do for our build.

Or, for Python3:

❯ git rev-parse HEAD
521ba8892ef367c45bf1647b04a726d3f553637c

❯ time ( CC=/usr/bin/clang ../configure && make ) # Apple Clang
( CC=/usr/bin/clang ../configure && make; )  177.68s user 41.80s system 89% cpu 4:06.42 total

❯ time ( CC=$(brew --prefix llvm)/bin/clang ../configure && make ) # PGOed LLVM
( CC=$(brew --prefix llvm)/bin/clang ../configure && make ; )  166.23s user 51.13s system 88% cpu 4:06.00 total

❯ time ( CC=/Users/carlocab/homebrew/bin/clang ../configure && make ) # LLVM bottle
( CC=/Users/carlocab/homebrew/bin/clang ../configure && make; )  195.33s user 47.49s system 90% cpu 4:27.00 total

This builds LLVM with profile-guided optimisations. I've adapted the build from upstream documentation. [1, 2] This does mean that builds will now take 2-3x longer than before. LLVM currently takes about four hours to build at most (including recursive dependents), so the additional time isn't as bad as some other formulae that require long CI runs. Moreover, the collective time this saves users of the LLVM formula should make the additional build time worth it. LTO is another potential optimisation that I haven't enabled here. This appears to be enabled in Apple's default build [3], but is a little complicated to implement for an LLVM distribution that includes static archives [4]. Closes Homebrew#77975 [1] https://llvm.org/docs/HowToBuildWithPGO.html#building-clang-with-pgo [2] https://github.com/llvm/llvm-project/blob/33ba8bd2/llvm/utils/collect_and_build_with_pgo.py [3] https://github.com/apple/llvm-project/blob/swift-5.4-RELEASE/clang/cmake/caches/Apple-stage2.cmake#L30 [4] https://llvm.org/docs/BuildingADistribution.html#options-for-optimizing-llvm

carlocab · 2021-07-05T18:30:56Z

All my testing shows that this leads to modest to significant performance gains, and should cause no regressions for any users. Merging this -- it should be easy to revert if we find that the added CI time is just too painful.

BrewTestBot · 2021-07-05T18:31:31Z

@carlocab has triggered a merge.

BrewTestBot added the python Python use is a significant feature of the PR or issue label Jun 16, 2021

carlocab force-pushed the llvm-pgo branch 2 times, most recently from 59b07d3 to a58c751 Compare June 17, 2021 13:00

carlocab mentioned this pull request Jun 17, 2021

building libomp for arm64 tpoechtrager/osxcross#267

Open

carlocab force-pushed the llvm-pgo branch 13 times, most recently from 5f8b54d to 5f696f3 Compare June 24, 2021 17:39

carlocab force-pushed the llvm-pgo branch 3 times, most recently from b437d4a to 6238d02 Compare June 24, 2021 20:38

carlocab commented Jun 24, 2021

View reviewed changes

carlocab commented Jun 25, 2021

View reviewed changes

carlocab force-pushed the llvm-pgo branch 2 times, most recently from 4b43252 to abb94f7 Compare June 28, 2021 00:56

carlocab mentioned this pull request Jun 30, 2021

root 6.24.02 #75166

Closed

carlocab force-pushed the llvm-pgo branch from abb94f7 to 6efc0b8 Compare July 4, 2021 02:06

carlocab added the in progress Stale bot should stay away label Jul 4, 2021

carlocab force-pushed the llvm-pgo branch 2 times, most recently from 49e71bb to e422ed8 Compare July 4, 2021 18:35

carlocab force-pushed the llvm-pgo branch from e422ed8 to 2166e91 Compare July 4, 2021 19:06

BrewTestBot closed this in d3e4515 Jul 5, 2021

github-actions bot added the outdated PR was locked due to age label Aug 5, 2021

github-actions bot locked as resolved and limited conversation to collaborators Aug 5, 2021

carlocab deleted the llvm-pgo branch September 17, 2022 15:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

llvm: build with PGO #79454

llvm: build with PGO #79454

carlocab commented Jun 16, 2021

carlocab commented Jun 16, 2021

carlocab Jun 24, 2021

carlocab Jun 25, 2021 •

edited

Loading

carlocab Jun 25, 2021 •

edited

Loading

carlocab Jun 25, 2021 •

edited

Loading

carlocab Jun 25, 2021

carlocab commented Jun 26, 2021 •

edited

Loading

carlocab commented Jul 5, 2021

BrewTestBot commented Jul 5, 2021

		instrumented_cflags = cflags + ["-Xclang -mllvm -Xclang -vp-counters-per-site=6"]
		instrumented_cxxflags = cxxflags + ["-Xclang -mllvm -Xclang -vp-counters-per-site=6"]

llvm: build with PGO #79454

llvm: build with PGO #79454

Conversation

carlocab commented Jun 16, 2021

carlocab commented Jun 16, 2021

carlocab Jun 24, 2021

Choose a reason for hiding this comment

carlocab Jun 25, 2021 • edited Loading

Choose a reason for hiding this comment

carlocab Jun 25, 2021 • edited Loading

Choose a reason for hiding this comment

carlocab Jun 25, 2021 • edited Loading

Choose a reason for hiding this comment

carlocab Jun 25, 2021

Choose a reason for hiding this comment

carlocab commented Jun 26, 2021 • edited Loading

carlocab commented Jul 5, 2021

BrewTestBot commented Jul 5, 2021

carlocab Jun 25, 2021 •

edited

Loading

carlocab Jun 25, 2021 •

edited

Loading

carlocab Jun 25, 2021 •

edited

Loading

carlocab commented Jun 26, 2021 •

edited

Loading