Bazel build missing dep declaration with transitive dependency #15359

Open
yeounoh opened this issue Apr 27, 2022 · 18 comments
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Rules-CPP Issues for C++ rules type: bug

Comments

@yeounoh

yeounoh commented Apr 27, 2022

Description of the bug:

I am building the TensorFlow project (commit: 2c6d3ed00f16838831aa460c5668a8466b9f3649) and running into errors about missing dependency declarations.

For instance, here is one of the errors:

ERROR: /usr/local/google/home/yeounoh/.cache/bazel/_bazel_yeounoh/dc4cfe365eb5fc5c1bdcf9c9346373b9/external/llvm-project/mlir/BUILD.bazel:3073:11: Compiling mlir/lib/Support/IndentedOstream.cpp failed: undeclared inclusion(s) in rule '@llvm-project//mlir:Support':
this rule is missing dependency declarations for the following files included by 'mlir/lib/Support/IndentedOstream.cpp':
  'bazel-out/k8-opt-exec-50AE0418/bin/external/llvm-project/llvm/config.cppmap'
  'bazel-out/k8-opt-exec-50AE0418/bin/external/llvm-project/llvm/Demangle.cppmap'
  'bazel-out/k8-opt-exec-50AE0418/bin/external/llvm_terminfo/terminfo.cppmap'
  'bazel-out/k8-opt-exec-50AE0418/bin/external/llvm_zlib/zlib.cppmap'
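Since the same error tends to recur for one rule after another, it can help to pull the rule label and the list of missing files out of a build log mechanically. The following is a minimal sketch, assuming the message always has the shape quoted above; the regular expressions and the function name `parse_undeclared_inclusions` are illustrative assumptions, not part of Bazel, and the cache path in the sample is shortened to a hypothetical `/path/to` prefix.

```python
import re

# Sample copied from the error above; "/path/to" is a hypothetical
# stand-in for the long output-base path.
SAMPLE = """\
ERROR: /path/to/external/llvm-project/mlir/BUILD.bazel:3073:11: Compiling mlir/lib/Support/IndentedOstream.cpp failed: undeclared inclusion(s) in rule '@llvm-project//mlir:Support':
this rule is missing dependency declarations for the following files included by 'mlir/lib/Support/IndentedOstream.cpp':
  'bazel-out/k8-opt-exec-50AE0418/bin/external/llvm-project/llvm/config.cppmap'
  'bazel-out/k8-opt-exec-50AE0418/bin/external/llvm-project/llvm/Demangle.cppmap'
  'bazel-out/k8-opt-exec-50AE0418/bin/external/llvm_terminfo/terminfo.cppmap'
  'bazel-out/k8-opt-exec-50AE0418/bin/external/llvm_zlib/zlib.cppmap'
"""

# Assumption: the rule label is the single-quoted text after this phrase.
RULE_RE = re.compile(r"undeclared inclusion\(s\) in rule '([^']+)'")
# Assumption: each missing file is an indented, single-quoted line.
FILE_RE = re.compile(r"^[ \t]+'([^']+)'[ \t]*$", re.MULTILINE)


def parse_undeclared_inclusions(log_text):
    """Return (rule_label, [missing_file, ...]) for the first error, or None."""
    rule = RULE_RE.search(log_text)
    if rule is None:
        return None
    # Only scan the text that follows the rule name.
    files = FILE_RE.findall(log_text, rule.end())
    return rule.group(1), files


if __name__ == "__main__":
    label, missing = parse_undeclared_inclusions(SAMPLE)
    print(label)
    for path in missing:
        print("  " + path)
```

The sketch only handles the first error in the text; a log with several failures would need to be split per `ERROR:` block first.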

And here is the corresponding build def (from the build cache):

# /usr/local/google/home/yeounoh/.cache/bazel/_bazel_yeounoh/dc4cfe365eb5fc5c1bdcf9c9346373b9/external/llvm-project/mlir/BUILD.bazel:3073:11
cc_library(
  name = "Support",
  deps = ["@llvm-project//llvm:Support"],
  includes = ["include"],
  srcs = ["@llvm-project//mlir:lib/Support/DebugCounter.cpp", "@llvm-project//mlir:lib/Support/FileUtilities.cpp", "@llvm-project//mlir:lib/Support/IndentedOstream.cpp", "@llvm-project//mlir:lib/Support/InterfaceSupport.cpp", "@llvm-project//mlir:lib/Support/StorageUniquer.cpp", "@llvm-project//mlir:lib/Support/Timing.cpp", "@llvm-project//mlir:lib/Support/ToolUtilities.cpp", "@llvm-project//mlir:lib/Support/TypeID.cpp"],
  hdrs = ["@llvm-project//mlir:include/mlir/Support/DebugAction.h", "@llvm-project//mlir:include/mlir/Support/DebugCounter.h", "@llvm-project//mlir:include/mlir/Support/DebugStringHelper.h", "@llvm-project//mlir:include/mlir/Support/FileUtilities.h", "@llvm-project//mlir:include/mlir/Support/IndentedOstream.h", "@llvm-project//mlir:include/mlir/Support/InterfaceSupport.h", "@llvm-project//mlir:include/mlir/Support/LLVM.h", "@llvm-project//mlir:include/mlir/Support/LogicalResult.h", "@llvm-project//mlir:include/mlir/Support/MathExtras.h", "@llvm-project//mlir:include/mlir/Support/StorageUniquer.h", "@llvm-project//mlir:include/mlir/Support/ThreadLocalCache.h", "@llvm-project//mlir:include/mlir/Support/Timing.h", "@llvm-project//mlir:include/mlir/Support/ToolUtilities.h", "@llvm-project//mlir:include/mlir/Support/TypeID.h"],
)
# Rule Support instantiated at (most recent call last):
#   /usr/local/google/home/yeounoh/.cache/bazel/_bazel_yeounoh/dc4cfe365eb5fc5c1bdcf9c9346373b9/external/llvm-project/mlir/BUILD.bazel:3073:11 in <toplevel>

It doesn't include the missing dependencies, just @llvm-project//llvm:Support. However, the build def of @llvm-project//llvm:Support does contain the missing dependency declarations (and it built successfully, too):

# /usr/local/google/home/yeounoh/.cache/bazel/_bazel_yeounoh/dc4cfe365eb5fc5c1bdcf9c9346373b9/external/llvm-project/llvm/BUILD.bazel:181:11
cc_library(
  name = "Support",
  deps = ["@llvm-project//llvm:config", "@llvm-project//llvm:Demangle", "@llvm_terminfo//:terminfo", "@llvm_zlib//:zlib"],
  ...
  ... (there is a long list of other build attributes)

If I manually add the missing deps directly to the @llvm-project//mlir:Support build def, I can make it work (but the build then runs into other, similar errors, and I have to repeat the process). I think something in my setup is preventing transitive dependencies from being picked up.
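To see which labels the reported .cppmap files correspond to, the output paths can be mapped back to target labels. This is a heuristic sketch only: it assumes each module map is written to `<output dir>/bin/external/<repo>/<package>/<target>.cppmap`, which matches the four paths in the error and the deps of @llvm-project//llvm:Support shown above, but is not a documented guarantee. The function name `cppmap_to_label` is made up for illustration.

```python
import re

# Assumed layout: .../bin/external/<repo>/<package path>/<target>.cppmap
# The package path is optional (targets in the repository root).
CPPMAP_RE = re.compile(
    r"/bin/external/(?P<repo>[^/]+)/(?:(?P<pkg>.+)/)?(?P<name>[^/]+)\.cppmap$"
)


def cppmap_to_label(path):
    """Guess the label of the target that produced a .cppmap output path."""
    m = CPPMAP_RE.search(path)
    if m is None:
        return None
    return "@{}//{}:{}".format(m["repo"], m["pkg"] or "", m["name"])


if __name__ == "__main__":
    paths = [
        "bazel-out/k8-opt-exec-50AE0418/bin/external/llvm-project/llvm/config.cppmap",
        "bazel-out/k8-opt-exec-50AE0418/bin/external/llvm-project/llvm/Demangle.cppmap",
        "bazel-out/k8-opt-exec-50AE0418/bin/external/llvm_terminfo/terminfo.cppmap",
        "bazel-out/k8-opt-exec-50AE0418/bin/external/llvm_zlib/zlib.cppmap",
    ]
    for p in paths:
        print(cppmap_to_label(p))
```

For the four paths in the error, this yields exactly the four labels listed in the deps of @llvm-project//llvm:Support, which is consistent with the files being reachable only transitively from @llvm-project//mlir:Support.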

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Check out https://github.com/tensorflow/tensorflow.git, commit: 2c6d3ed00f16838831aa460c5668a8466b9f3649.

Try:
bazel build //tensorflow/tools/pip_package:build_pip_package

I asked my colleagues to try it; some of them hit the issue and some don't.

Which operating system are you running Bazel on?

Debian GNU/Linux rodete, Linux 5.15.15-1rodete2-amd64, x86-64

What is the output of bazel info release?

INFO: Options provided by the client: Inherited 'common' options: --isatty=1 --terminal_columns=90
INFO: Reading rc options for 'info' from /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'info' from /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: Inherited 'build' options: --define framework_shared_object=true --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --enable_platform_specific_config --define=with_xla_support=true --config=short_logs --config=v2 --define=no_aws_support=true --define=no_hdfs_support=true --experimental_cc_shared_library
INFO: Reading rc options for 'info' from /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.tf_configure.bazelrc: Inherited 'build' options: --action_env PYTHON_BIN_PATH=/usr/local/google/home/yeounoh/anaconda3/envs/torch-xla-1.11/bin/python3 --action_env PYTHON_LIB_PATH=/usr/local/google/home/yeounoh/anaconda3/envs/torch-xla-1.11/lib/python3.8/site-packages --python_path=/usr/local/google/home/yeounoh/anaconda3/envs/torch-xla-1.11/bin/python3
INFO: Reading rc options for 'info' from /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: Inherited 'build' options: --deleted_packages=tensorflow/compiler/mlir/tfrt,tensorflow/compiler/mlir/tfrt/benchmarks,tensorflow/compiler/mlir/tfrt/jit/python_binding,tensorflow/compiler/mlir/tfrt/jit/transforms,tensorflow/compiler/mlir/tfrt/python_tests,tensorflow/compiler/mlir/tfrt/tests,tensorflow/compiler/mlir/tfrt/tests/ir,tensorflow/compiler/mlir/tfrt/tests/analysis,tensorflow/compiler/mlir/tfrt/tests/jit,tensorflow/compiler/mlir/tfrt/tests/lhlo_to_tfrt,tensorflow/compiler/mlir/tfrt/tests/tf_to_corert,tensorflow/compiler/mlir/tfrt/tests/tf_to_tfrt_data,tensorflow/compiler/mlir/tfrt/tests/saved_model,tensorflow/compiler/mlir/tfrt/transforms/lhlo_gpu_to_tfrt_gpu,tensorflow/core/runtime_fallback,tensorflow/core/runtime_fallback/conversion,tensorflow/core/runtime_fallback/kernel,tensorflow/core/runtime_fallback/opdefs,tensorflow/core/runtime_fallback/runtime,tensorflow/core/runtime_fallback/util,tensorflow/core/tfrt/common,tensorflow/core/tfrt/eager,tensorflow/core/tfrt/eager/backends/cpu,tensorflow/core/tfrt/eager/backends/gpu,tensorflow/core/tfrt/eager/core_runtime,tensorflow/core/tfrt/eager/cpp_tests/core_runtime,tensorflow/core/tfrt/gpu,tensorflow/core/tfrt/run_handler_thread_pool,tensorflow/core/tfrt/runtime,tensorflow/core/tfrt/saved_model,tensorflow/core/tfrt/graph_executor,tensorflow/core/tfrt/saved_model/tests,tensorflow/core/tfrt/tpu,tensorflow/core/tfrt/utils
INFO: Found applicable config definition build:short_logs in file /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:v2 in file /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:linux in file /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: --copt=-w --host_copt=-w --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include --define=PROTOBUF_INCLUDE_PATH=$(PREFIX)/include --cxxopt=-std=c++14 --host_cxxopt=-std=c++14 --config=dynamic_kernels --distinct_host_configuration=false --experimental_guard_against_concurrent_changes
INFO: Found applicable config definition build:dynamic_kernels in file /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
release 5.1.1

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

https://github.com/tensorflow/tensorflow.git
cb8193c5d2dd82b0a1ecaf78d37392cae8e05582
2c6d3ed00f16838831aa460c5668a8466b9f3649

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@sgowroji sgowroji added untriaged team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website type: bug labels Apr 28, 2022
@meteorcloudy
Member

@yeounoh Sorry, we don't have the capacity to help you debug this since we cannot reproduce it. (bazel build //tensorflow/tools/pip_package:build_pip_package works for me locally). Would bazel clean --expunge help you?

@yeounoh
Author

yeounoh commented Apr 28, 2022

Hi @meteorcloudy, thanks. I've tried bazel clean --expunge, but the result is the same. One thing I noticed is that if I change spawn_strategy to sandboxed, it works on my local machine (but it still fails in our CI/VM build). Could this be a useful hint?

@meteorcloudy
Member

You should probably check where files like 'bazel-out/k8-opt-exec-50AE0418/bin/external/llvm-project/llvm/config.cppmap' come from, and use bazel cquery to check why they are not in the dependencies.
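One possible way to follow this suggestion is to ask cquery for a dependency path between the failing rule and the target that appears to own one of the missing files. The sketch below only assembles and prints the command rather than running it. It assumes `somepath` (a standard Bazel query function) is the right tool here and that @llvm-project//llvm:config is the owner of config.cppmap, which is inferred from the deps listed earlier in this issue; the helper name is hypothetical.

```python
import shlex


def cquery_somepath_command(from_label, to_label):
    """Build a `bazel cquery` invocation that asks for one dependency path."""
    query = "somepath({}, {})".format(from_label, to_label)
    return ["bazel", "cquery", query]


if __name__ == "__main__":
    cmd = cquery_somepath_command(
        "@llvm-project//mlir:Support",  # rule named in the error
        "@llvm-project//llvm:config",   # assumed owner of config.cppmap
    )
    # Print a shell-ready command line; run it from the workspace root
    # with the same flags as the failing build.
    print(shlex.join(cmd))
```

If the query prints a path, the target is reachable transitively, and the remaining question is why the compile action's inclusion check still rejects the file.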

@bjacob

bjacob commented May 11, 2022

I'm running into this too.

I have the following environment variables set: CC=/usr/bin/clang, CXX=/usr/bin/clang++.

Unsetting these two environment variables removed the issue.

@meteorcloudy
Member

FYI @oquenchil Seems like Bazel has some C++ module issue when building with clang.

@meteorcloudy meteorcloudy added team-Rules-CPP Issues for C++ rules P2 We'll consider working on this in future. (Assignee optional) and removed under investigation team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website labels May 12, 2022
@yeounoh
Author

yeounoh commented May 16, 2022

Using the nightly (pre-release) resolved the issue for me.

@yeounoh
Author

yeounoh commented May 17, 2022

I am seeing the failure again, will reopen the issue.

@bjacob it worked for me as well. Not sure why setting CC & CXX would cause the issue, though 🤷

@jpienaar

At one point (#13135), --spawn_strategy=sandboxed was required because of zombie state hanging around (to paraphrase that issue). Setting environment flags may simply have avoided some of that.

@adam-azarchs
Contributor

adam-azarchs commented Jul 2, 2022

The issue seems to be that while the .cppmap files are actually included in the dependencies of the compile action (as seen via aquery), something else in there is upset about it. I think what's going on here is that the strict check is not recognizing those files as being headers.

@samuela
Contributor

samuela commented Aug 25, 2022

This bug is also breaking the bazel build of jaxlib on x86_64-darwin with clang (NixOS/nixpkgs#183051 (comment)).

@uri-canva
Contributor

Does anyone have a small repro I can run through the debugger? TensorFlow is a bit big and I'm not very familiar with C++, so I'm not sure I would be able to extract a small repro from it.

@lockmatrix

Adding --spawn_strategy=sandboxed solved this problem for me.

@yyyokata

yyyokata commented Mar 9, 2023

In my experience, clang versions <= 12 avoid this issue, while clang versions >= 15 reproduce it.
Maybe this offers a hint.

Update: removing the Bazel feature layering_check also works.

@hypdeb

hypdeb commented Aug 2, 2023

Same issue trying to depend on boost using https://github.com/nelhage/rules_boost and --spawn_strategy=sandboxed does not help.

@oquenchil
Contributor

These look like undeclared inclusions thrown intentionally by the layering check. If you are affected you should either fix those errors by adding the required dependencies to the cc_library target or disable the layering check with --features=-layering_check passed on the command line. This is not a Bazel bug as far as I can tell.

Please feel free to reopen providing the exact compilation error, the Bazel build target as listed on the BUILD file and the contents of the *.cc file whose compilation is throwing the error. I'd expect that there is an #include header in the source file for which there isn't a direct dependency in the build target providing that header.
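For anyone who wants to try the workarounds reported in this thread side by side, here is a rough sketch that assembles the corresponding invocations. It only uses flags and variables that appear in the comments above (--spawn_strategy=sandboxed, --features=-layering_check, and unsetting CC and CXX); the helper names are hypothetical, and none of these is confirmed as a fix for every setup.

```python
import os

# Target taken from the reproduction steps in the issue description.
TARGET = "//tensorflow/tools/pip_package:build_pip_package"


def build_command(target, sandboxed=False, disable_layering_check=False):
    """Assemble a `bazel build` command with the optional workaround flags."""
    cmd = ["bazel", "build"]
    if sandboxed:
        cmd.append("--spawn_strategy=sandboxed")
    if disable_layering_check:
        cmd.append("--features=-layering_check")
    cmd.append(target)
    return cmd


def env_without_compiler_overrides(env):
    """Return a copy of env with CC and CXX removed (one reported workaround)."""
    return {k: v for k, v in env.items() if k not in ("CC", "CXX")}


if __name__ == "__main__":
    env = env_without_compiler_overrides(os.environ)
    for kwargs in ({"sandboxed": True}, {"disable_layering_check": True}):
        # To actually run a variant, pass the list to
        # subprocess.run(cmd, env=env, check=True).
        print(" ".join(build_command(TARGET, **kwargs)))
```

Trying one variant at a time makes it easier to report which change, if any, affects the error.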

@hypdeb

hypdeb commented Aug 4, 2023

I just faced the same issue with gtest and wrote a minimal repro: https://github.com/hypdeb/missing-deps.
Looking at the BUILD file in gtest we can see that the headers are in fact included: https://github.com/google/googletest/blob/455fcb7773dedc70ab489109fb12d8abc7fd59b6/BUILD.bazel#L86
and exist:
https://github.com/google/googletest/tree/main/googletest/include/gtest/internal

@oquenchil Removing layering check does not solve the issue.

@hypdeb

hypdeb commented Aug 5, 2023

I ran a further experiment and building gtest itself with my toolchain fails. This means the issue I'm facing is a different one as it's not related to transitive dependencies. Please disregard my comments above.

If anyone ends up here with my issue anyways, it was solved by adding the following linker flags:

"-no-canonical-prefixes",
"-L/usr/local/llvm/lib",

to my toolchain.

@keith
Member

keith commented Aug 2, 2024

Original issue here likely fixed by #21832, please verify with 7.3.0rc1
