Parallel build support #329

Open
mattgodbolt opened this issue Oct 18, 2019 · 27 comments

Comments

@mattgodbolt

This might be a big ask, and may be a Bazel core project issue, but when consuming larger projects that need to be configure/make'd (e.g. OpenSSL, postgres, hdf5...) it would be great to be able to use parallel builds.

Of course, pragmatically one might override make_commands to have make -j12 in it... but that has disastrous consequences alongside other parallel builds (not least on machines with fewer than 12 cores...)

In the land of gumdrops and unicorns, one could imagine bazel being able to trick out make by passing it the same magic that $(MAKE) or +make does in a real makefile, along with a -j set from the --jobs setting. That way, the make process will play nicely with the bazel build.

I realise this is a big ask: but if one doesn't ask, then maybe it won't happen! Any ideas on improving this in general also welcomed!

Thanks all!

@mattgodbolt
Author

In case I was being incoherent, the POSIX jobserver for make is described in more detail here: https://www.gnu.org/software/make/manual/html_node/POSIX-Jobserver.html#POSIX-Jobserver
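
To make the jobserver idea concrete: a fixed pool of tokens is shared between cooperating processes, and a job may only start after taking a token, which it hands back when it finishes. The following is a toy sketch of that idea in plain sh, not GNU make's implementation. The FIFO location, the use of fd 9, the SLOTS value, and the helper names acquire, release, and run_job are all invented for illustration. GNU make's real protocol differs in details (for instance, each participating process also owns one implicit job slot).

```shell
#!/bin/sh
# Toy token pool in the spirit of the make jobserver (illustrative only;
# nothing here talks to make or to Bazel).

SLOTS=3                                # hypothetical global job limit
pool_dir="$(mktemp -d)"
log="$pool_dir/log"
mkfifo "$pool_dir/tokens"
exec 9<>"$pool_dir/tokens"             # read-write open, so the open itself does not block

# Pre-load the pool with one newline-terminated token per slot.
i=0
while [ "$i" -lt "$SLOTS" ]; do echo >&9; i=$((i + 1)); done

acquire() { read -r _token <&9; }      # blocks while the pool is empty
release() { echo >&9; }                # hand the token back to the pool

run_job() {
  acquire
  echo "start $1" >>"$log"
  sleep 0.2                            # stand-in for a compile step
  echo "end $1" >>"$log"
  release
}

# Six jobs are launched at once, but at most SLOTS of them hold a token
# (and therefore run their "compile step") at any moment.
for n in 1 2 3 4 5 6; do run_job "$n" & done
wait
cat "$log"
```

The point of sharing one pool is that every participant draws from the same budget, which is what would let a nested make cooperate with an outer scheduler instead of multiplying the process count.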

@irengrig
Contributor

irengrig commented Dec 3, 2019

Thank you for the feedback, I think it is worth doing.
The other problem is how to balance the number of Bazel's jobs with make's jobs (also, there may be several parallel rules_foreign_cc targets being processed.)

@HackAttack

I’d argue that this is vital to make this project usable. For example, running the example on my machine and building OpenBLAS takes 6 minutes, but when I manually build it with make it takes 13 seconds!

@lawsonAGMT

I will second this. We have a cmake project dependency that is significantly larger than any other component.

@rdelfin

rdelfin commented Sep 25, 2020

I'm finding myself with a cmake project whose build time goes up by almost 4x compared with a manual build. As a stopgap, what would be needed to simply add a fixed parameter to the target that lets you specify how many processes to create for the job server?

@dhbaird

dhbaird commented Oct 7, 2020

Not a solution yet: But if I could find a way to let cmake_external inherit file descriptors, then a wrapper Makefile would hypothetically get me to a usable workaround, where CPU over-commit would have a 1x upper bound:

build: ; +bazel build @all//... --action_env="MAKEFLAGS=${MAKEFLAGS}"
(where likely MAKEFLAGS=" -j --jobserver-fds=3,4" and so fds 3 and 4 need to be inherited by cmake_external)

Better yet, as mentioned above, would be to have Bazel itself implement the Make jobserver. Maybe this could be inspiration:
https://github.com/olsner/jobclient "GNU make jobserver and client for e.g. shell scripts"

@dhbaird

dhbaird commented Oct 7, 2020

Following up on my previous comment: I came up with this really sketchy workaround. Can't say I'm proud of it, but it "works" and puts a reasonable bound on CPU over-subscription without blocking parallelism.

  1. Create a home for jobserver fifos to live:
mkdir $HOME/.jobserver

# Create jmake (to use later instead of make):
cat > ~/.jobserver/jmake << EOF
#!/bin/bash
# Note: must expand $HOME here because Bazel removes $HOME from environment:
RD=$HOME/.jobserver/rd
WR=$HOME/.jobserver/wr 
exec 3<\$RD
exec 4>\$WR
MAKEFLAGS=" -j --jobserver-fds=3,4" make "\$@"
EOF
chmod a+rx ~/.jobserver/jmake
  2. Update make_commands in your BUILD file:
cmake_external(
    ...
    make_commands = ["/path/to/home/.jobserver/jmake", "/path/to/home/.jobserver/jmake install"]
)
  3. Start a jobserver and run Bazel (adjust NPROCS to whatever bound you want):
( NPROCS=32 ;
  RD=$HOME/.jobserver/rd ;
  WR=$HOME/.jobserver/wr ;
  rm -f $RD $WR ;
  mkfifo $RD $WR ;
  echo "jobserver: ; +( cat >$RD <&3 & ( while true; do cat <$WR; sleep 0.1; done ) >&4 )" | \
      make -f - -j ${NPROCS} ) &

bazel build @all//... --sandbox_writable_path=$HOME/.jobserver

@jsharpe
Member

jsharpe commented Oct 7, 2020

You could skip step 2 by creating a custom make toolchain for make that uses your make script.

@dhbaird

dhbaird commented Oct 8, 2020

Implemented @jsharpe's feedback (thanks!) and cleaned things up a bit, and then I turned it into a whole project that I swear I wasn't planning to do just one day ago. There is a questionable dependency on the "master" (or "main") branch of rules_foreign_cc that I'm not sure how to reconcile. Nevertheless here it is,

https://github.com/dhbaird/rules_foreign_cc_jobserver

@slsyy

slsyy commented Dec 9, 2020

Calling make with the --load-average option can be a good workaround, e.g.:

NUMBER_OF_CPUS="$(grep -c ^processor /proc/cpuinfo)"
make -j "$NUMBER_OF_CPUS" -l "$NUMBER_OF_CPUS"

@fuhailin

fuhailin commented Mar 4, 2021

That sounds like a good, concise solution. How should I integrate it with the rules_foreign_cc build script?
Could you please give an example?
And what about macOS?

@slsyy

slsyy commented Mar 4, 2021

@fuhailin I don't know; /proc/cpuinfo was just a quick thought. You can specify flags on the bazel command line to limit CPU usage, like --local_cpu_resources, or --jobs to reduce the number of concurrent tasks. I guess it would be best to extract this information from Bazel itself.

In my current build I just use hardcoded values like:

make -j 12 -l 12
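
On the macOS question: /proc/cpuinfo does not exist there, so a more portable way to get a CPU count is needed. A minimal sketch, assuming at least one of getconf, nproc, or sysctl is available on the build machine; the function name cpu_count and the variable ncpu are made up for this example, and the result is the machine's CPU count, not Bazel's --jobs value:

```shell
#!/bin/sh
# Print the number of online CPUs, trying a few common tools in turn.
cpu_count() {
  n="$(getconf _NPROCESSORS_ONLN 2>/dev/null)" ||   # Linux (glibc) and macOS
    n="$(nproc 2>/dev/null)" ||                     # GNU coreutils
    n="$(sysctl -n hw.ncpu 2>/dev/null)" ||         # macOS / BSD
    n=1
  # Fall back to 1 if a tool printed something that is not a plain number.
  case "$n" in
    '' | *[!0-9]*) n=1 ;;
  esac
  echo "$n"
}

ncpu="$(cpu_count)"
# Show the command that would be run, rather than running make here.
echo "make -j $ncpu -l $ncpu"
```

The -l (load average) limit is what keeps this from piling onto an already busy machine, since make holds back new jobs while the load is above the given value.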

@UebelAndre
Collaborator

You could also use something like

make(
    name = "make_lib",
    env = {
        "CLANG_WRAPPER": "$(execpath //make_simple/code:clang_wrapper.sh)",
    },
    lib_source = "//make_simple/code:srcs",
    make_commands = [
        "make -j `nproc`",
        "make install",
    ],
    static_libraries = ["liba.a"],
    tools_deps = ["//make_simple/code:clang_wrapper.sh"],
)

Which is a slight alteration to:
https://github.com/bazelbuild/rules_foreign_cc/blob/175b29c6f78cf3c78516836587c268f3d0690526/examples/make_simple/BUILD.bazel#L4-L12

@fuhailin

Overriding make_commands for the make rule with a jobs argument works fine. How should I do that for the cmake rule? Could I pass the jobs argument through an attribute such as build_args or generate_args?

@UebelAndre
Collaborator

I've added https://github.com/bazelbuild/rules_foreign_cc/blob/main/examples/cmake_with_target/BUILD.bazel to demonstrate how the build_args attribute might be used. Though if you're looking for stronger support for parallelization, you should check out CMAKE_BUILD_PARALLEL_LEVEL as an env argument.

@github-actions

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_foreign_cc!

@github-actions

This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"

@jwnimmer-tri

I think this is a relatively important feature for building CMake dependencies of any notable size. Can we keep the issue open as a feature request, please?

UebelAndre reopened this Oct 15, 2021
jsharpe added a commit to jsharpe/rules_foreign_cc that referenced this issue Jan 1, 2022
This can be used by adding
`--extra_toolchains=@rules_foreign_cc//toolchains:built_parallel_make` to your bazel command line

Addresses some of bazel-contrib#329
@jsharpe
Member

jsharpe commented Jan 1, 2022

I've added #848 as an example of how parallel make could be encapsulated in a toolchain. However, it can't currently be merged because of #433: without a fix for that issue, the runfiles for the shell wrapper script aren't present. I've added #849 as an outline of the fix that is needed, but that PR is not ready to be merged. Any help in fixing that issue would mean that we could provide a hermetic make toolchain wrapper that adds -j and -l automatically.

@piratf

piratf commented Apr 13, 2022

The make_commands option from @UebelAndre's example above was removed in this version.
Alternatively, the following two options can be combined:
args = ["-j `nproc`"],
targets = ["debug"],

docs: https://bazelbuild.github.io/rules_foreign_cc/main/make.html#make-out_static_libs

jsharpe added a commit to jsharpe/rules_foreign_cc that referenced this issue Aug 3, 2022
This can be used by adding
`--extra_toolchains=@rules_foreign_cc//toolchains:built_make_parallel_toolchain` to your bazel command line

Addresses some of bazel-contrib#329
@github-actions

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_foreign_cc!

@github-actions

This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"

@jwnimmer-tri

I think this is a relatively important feature for building CMake dependencies of any notable size. Can we keep the issue open as a feature request, please?

@zhmc

zhmc commented Dec 12, 2023

The env "CMAKE_BUILD_PARALLEL_LEVEL" for cmake can solve this problem.

cmake(
    name = "xgboost",
    cache_entries = {
        "BUILD_STATIC_LIB": "ON",
    },
    lib_source = "@xgboost//:all_srcs",
    env = {
        "CMAKE_BUILD_PARALLEL_LEVEL": "32",
    },
    out_lib_dir = "lib64",
    out_static_libs = ["libxgboost.a", "libdmlc.a"],
)

@jsharpe
Member

jsharpe commented Dec 13, 2023

The issue with this approach is that the number of CPUs the target was built with becomes part of the cache key for the artifacts. Using make -j `nproc` or using ninja is far better, as neither affects the cache key.

This also doesn't address the overall parallelism of the build: it only sets the level per target. If you need to build multiple such targets, Bazel can schedule them at the same time and over-provision the worker. It isn't RBE-friendly either: if your RBE worker runs in a limited cgroup, the added resource usage could actually make the build slower or trigger an OOM.
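
The over-provisioning concern comes down to simple multiplication. A back-of-the-envelope sketch follows; the function names and all numbers are hypothetical. They assume 8 foreign targets scheduled at once, each with a parallel level of 32, on a 32-core machine. Dividing the cores evenly is only one naive way to pick a per-target level, and a level hardcoded into a BUILD file still ends up in the cache key as described above.

```shell
#!/bin/sh
# Worst case: every concurrently scheduled target uses its full parallel level.
# Arguments: <concurrent targets> <per-target level>
worst_case_procs() {
  echo $(( $1 * $2 ))
}

# Naive even split of cores across concurrently scheduled targets (minimum 1).
# Arguments: <cores> <concurrent targets>
per_target_level() {
  level=$(( $1 / $2 ))
  [ "$level" -ge 1 ] || level=1
  echo "$level"
}

worst_case_procs 8 32    # prints 256: eight targets, each spawning 32 jobs
per_target_level 32 8    # prints 4: level that keeps eight targets within 32 cores
```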

atobiszei added a commit to openvinotoolkit/model_server that referenced this issue Feb 27, 2024
Additionally bump aws-sdk-cpp version to 1.11.279

Encountered issues with bazel. Non ASCII filenames don't work inside bazel. Workaround is to remove test file from aws-sdk-cpp.
bazelbuild/bazel#374

Building with rules_foreign cmake doesn't work with parallel builds. This is something we could optimize later, for now hardcode some low value:
bazel-contrib/rules_foreign_cc#329

On Redhat to properly build targets we now have to add:
--//:distro=redhat
By default distro is set to ubuntu. This is a workaround for Bazel not being able to differentiate between the two. There is a difference in the location of the aws-sdk-cpp static libraries after the build. Ideally we should find out how to use select in the aws-sdk-cpp repo BUILD file so that it relies on a main-repo flag, but I didn't find a way to support that there.

JIRA:CVS-130367
@mering

mering commented Mar 2, 2024

Another approach could be to generate Bazel BUILD files in a repository rule and perform the build with Bazel directly. This would require writing a Bazel generator into CMake (alongside Make/Ninja).
I outlined this approach in #1178.
