CMake build gets stuck on ProGraML bazel step #566

Closed
ChrisCummins opened this issue Feb 8, 2022 · 9 comments · Fixed by #698 or #703
Labels
Bug Something isn't working Testing & Tooling Tests, tooling, and build systems

Comments

@ChrisCummins
Contributor

🐛 Bug

I have encountered a strange error where the CMake build appears to get "stuck" after the bazel build step of ProGraML. The build appears to hang indefinitely. Cancelling and re-running the CMake build works fine.

To Reproduce

Steps to reproduce the behavior:

$ cmake ....... # lots of flags (see below)

# <snip>
[3/9] Performing build_programl step for 'programl'
Starting local Bazel server and connecting to it...
Loading:
Loading: 0 packages loaded
Analyzing: 12 targets (5 packages loaded, 0 targets configured)
Analyzing: 12 targets (35 packages loaded, 209 targets configured)
Analyzing: 12 targets (50 packages loaded, 3390 targets configured)
INFO: Analyzed 12 targets (50 packages loaded, 6035 targets configured).
INFO: Found 12 targets...
[3 / 157] [Prepa] BazelWorkspaceStatusAction stable-status.txt
[84 / 227] Compiling libs/exception/src/clone_current_exception_non_intrusive.cpp; 0s linux-sandbox ... (32 actions, 31 running)
[94 / 227] Compiling absl/base/internal/sysinfo.cc; 1s linux-sandbox ... (32 actions, 31 running)
[111 / 227] Compiling libs/regex/src/c_regex_traits.cpp; 2s linux-sandbox ... (32 actions, 31 running)
[115 / 227] Compiling libs/regex/src/c_regex_traits.cpp; 3s linux-sandbox ... (32 actions running)
[119 / 227] Compiling libs/regex/src/c_regex_traits.cpp; 5s linux-sandbox ... (32 actions running)
[140 / 237] Compiling libs/regex/src/c_regex_traits.cpp; 6s linux-sandbox ... (32 actions running)
[156 / 243] Compiling libs/regex/src/c_regex_traits.cpp; 8s linux-sandbox ... (32 actions, 31 running)
[174 / 250] Compiling libs/regex/src/cregex.cpp; 10s linux-sandbox ... (32 actions, 31 running)
[220 / 270] Compiling libs/regex/src/cregex.cpp; 12s linux-sandbox ... (32 actions, 31 running)
INFO: From Compiling labm8/cpp/string.cc:
external/labm8/labm8/cpp/string.cc:135:22: warning: implicit conversion from 'int' to 'char' changes value from 226 to -30 [-Wconstant-conversion]
        to.append(1, 226);
           ~~~~~~    ^~~
external/labm8/labm8/cpp/string.cc:136:22: warning: implicit conversion from 'int' to 'char' changes value from 130 to -126 [-Wconstant-conversion]
        to.append(1, 130);
           ~~~~~~    ^~~
external/labm8/labm8/cpp/string.cc:137:22: warning: implicit conversion from 'int' to 'char' changes value from 172 to -84 [-Wconstant-conversion]
        to.append(1, 172);
           ~~~~~~    ^~~
3 warnings generated.
[269 / 272] Compiling programl/ir/llvm/internal/text_encoder.cc; 5s linux-sandbox
INFO: Elapsed time: 24.341s, Critical Path: 14.47s
INFO: 146 processes: 1 internal, 145 linux-sandbox.
INFO: Build completed successfully, 146 total actions
INFO: Build completed successfully, 146 total actions

Cancel, re-run CMake and the build proceeds as normal:

$ cmake .... # same as before
# <snip>
[0/2] Performing build_programl step for 'programl'
Loading:
Loading: 0 packages loaded
Analyzing: 12 targets (0 packages loaded, 0 targets configured)
INFO: Analyzed 12 targets (0 packages loaded, 0 targets configured).
INFO: Found 12 targets...
[0 / 1] [Prepa] BazelWorkspaceStatusAction stable-status.txt
INFO: Elapsed time: 0.425s, Critical Path: 0.01s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Build completed successfully, 1 total action
[2/2] Completed 'programl'
# ....

Environment

Please fill in this checklist:

  • CompilerGym: development (0.2.2)
  • How you installed CompilerGym (conda, pip, source): source
  • OS: Ubuntu 20.04
  • Python version: 3.8
  • Build command you used (if compiling from source): cmake -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER_LAUNCHER=ccache -DCMAKE_CXX_COMPILER_LAUNCHER=ccache -DCMAKE_EXE_LINKER_FLAGS_INIT="-fuse-ld=lld" -DCMAKE_MODULE_LINKER_FLAGS_INIT="-fuse-ld=lld" -DCMAKE_SHARED_LINKER_FLAGS_INIT="-fuse-ld=lld" -DPython3_FIND_VIRTUALENV=FIRST -DCMAKE_BUILD_WITH_INSTALL_RPATH=true -S . -B build
  • GCC/clang version (if compiling from source): clang 10.0.0
  • Bazel version (if compiling from source): 4.2.2
  • Versions of any other relevant libraries:
@ChrisCummins ChrisCummins added Bug Something isn't working Testing & Tooling Tests, tooling, and build systems labels Feb 8, 2022
@ChrisCummins
Contributor Author

cc @sogartar have you run into this before?

@sogartar

sogartar commented Feb 8, 2022

@ChrisCummins, I have noticed that sometimes, if CMake needs to reconfigure before building in the same execution, it may get stuck after the configuration. When you run the build again it does only the build step, which suggests that the configure step completed successfully. I have not figured out why this happens.

The strange thing here is that in your case it got stuck during configuration, while the third-party dependencies were being built. Internally, ProGraML is built in a proxy/wrapper CMake process. It is possible that the same thing happens there. The difference is that there the configure command is invoked separately, and in that use case I have not seen CMake get stuck.

To debug this, it helps to call CMake with the --trace-expand option to see where it gets stuck during configuration. Also add verbose flags to Ninja to see whether it gets stuck during the build step. This should be done when executing the wrapper CMake processes.
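
As a rough sketch of the debugging invocation described above, the following Python helper assembles the two command lines: a configure run with --trace-expand and a verbose Ninja build. The function name, the directory arguments, and the trace file name are placeholders chosen for illustration and are not part of CompilerGym. --trace-redirect requires CMake 3.16 or newer.

```python
from pathlib import Path
from typing import List


def debug_commands(source_dir: str, build_dir: str) -> List[List[str]]:
    """Return [configure_cmd, build_cmd] with tracing and verbose output enabled.

    Hypothetical helper: the names and paths here are illustrative only.
    """
    trace_file = str(Path(build_dir) / "cmake-trace.log")
    configure = [
        "cmake",
        "-GNinja",
        "--trace-expand",                  # print each CMake command with variables expanded
        f"--trace-redirect={trace_file}",  # write the trace to a file (CMake >= 3.16)
        "-S", source_dir,
        "-B", build_dir,
    ]
    # Arguments after "--" are passed to the native build tool;
    # "-v" makes Ninja echo the full command line of every step.
    build = ["cmake", "--build", build_dir, "--", "-v"]
    return [configure, build]


if __name__ == "__main__":
    for cmd in debug_commands(".", "build"):
        print(" ".join(cmd))
```

If the configure step hangs, the last entries in the trace file should show which CMake command was executing at the time.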

If we are unable to fix this, we may resort to a generous timeout and a retry.
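
A minimal sketch of what such a timeout-and-retry wrapper could look like, assuming the build is driven from a Python script. The function name and its parameters are hypothetical. Note that subprocess.run only kills the direct child on timeout, so grandchild processes (for example a Bazel server) may survive and need separate cleanup.

```python
import subprocess
import sys
from typing import Sequence


def run_with_retry(cmd: Sequence[str], timeout_s: float, attempts: int = 2) -> int:
    """Run cmd, killing and re-running it if it exceeds timeout_s seconds.

    Returns the exit code of the first attempt that finishes in time.
    Raises subprocess.TimeoutExpired if every attempt times out.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            # subprocess.run kills the child process when the timeout expires.
            return subprocess.run(list(cmd), timeout=timeout_s).returncode
        except subprocess.TimeoutExpired:
            print(
                f"attempt {attempt}/{attempts} timed out after {timeout_s}s",
                file=sys.stderr,
            )
            if attempt == attempts:
                raise
    raise AssertionError("unreachable")
```

A call such as run_with_retry(["cmake", "--build", "build"], timeout_s=3600) would then re-run the build once if the first attempt hangs for an hour. The timeout value is arbitrary and would need tuning to the real build time.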

@ChrisCummins
Contributor Author

Thanks for the debugging tips! It is definitely getting stuck during the configuration stage. It's not a particularly large problem for now, and may be a total non-issue if we end up porting ProGraML's build to CMake to achieve #568.

Cheers,
Chris

@mostafaelhoushi
Contributor

According to this, the frequency of hanging might reduce tenfold by upgrading to bazel 5.1.0:
bazelbuild/bazel#15094

@ChrisCummins
Contributor Author

Maybe, but that bug doesn't look like it has quite the same symptoms. In our case, bazel reports that everything has finished building.

Cheers,
Chris

@mostafaelhoushi
Contributor

Continuing the discussion from the PR: #697 (comment)

Yes, I can easily reproduce the problem on my Linux VM. I am trying to get a more verbose log or debug CMake but without success.

I tried:

  • --log-level=VERBOSE
  • -DCMAKE_VERBOSE_MAKEFILE=ON
  • --debug-output
  • --trace

but didn't really get much information.

@mostafaelhoushi
Contributor

--log-level=VERBOSE does indicate that the hang seems to happen during (or after?) running this step:

cmake --build ~/cmake_build/external/programl

Then I looked at the CMake-generated files in ~/cmake_build/external/programl, but I am not sure what to make of them.

@mostafaelhoushi
Contributor

In PR #703, I removed the USES_TERMINAL TRUE option from the bazel build command in line 47, and the linux-cmake-build job took 34 minutes!

externalproject_add_step(
  programl
  build_programl
  ALWAYS TRUE
  COMMAND
    "${CMAKE_COMMAND}" -E env "CC=${CMAKE_C_COMPILER}"
    "CXX=${CMAKE_CXX_COMPILER}" "${Bazel_EXECUTABLE}" build
    --verbose_failures "--cxxopt=-std=c++${CMAKE_CXX_STANDARD}" --
    //programl/graph:features //programl/graph:program_graph_builder
    //programl/graph/format:node_link_graph //programl/ir/llvm:llvm-10
    //programl/proto:programl_cc //programl/proto:programl
    @labm8//labm8/cpp:logging @labm8//labm8/cpp:status
    @labm8//labm8/cpp:status_macros @labm8//labm8/cpp:statusor
    @labm8//labm8/cpp:string @labm8//labm8/cpp:stringpiece
  DEPENDEES update
  WORKING_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/programl/src/programl"
  USES_TERMINAL TRUE
)

@ChrisCummins
Contributor Author

Ah amazing, well done!!

@ChrisCummins ChrisCummins self-assigned this Nov 29, 2022
3 participants