bug: --remote_cache causes test rules for C++ and Java to fail even on first run #2560

Closed
mwitkow opened this issue Feb 21, 2017 · 4 comments

@mwitkow

mwitkow commented Feb 21, 2017

Please provide the following information. The more we know about your system and use case, the more easily and likely we can help.

Description of the problem

While implementing a proof of concept of a distributed cache for Bazel builds (see mwitkow/bazel-distcache), I found that Bazel's implementation of remote_protocol.proto appears to report the wrong result for test rules, for both Java and C++.

Repro steps

Building and running localcache

localcache is a proof of concept implementation of the gRPC remote_cache for bazel.

Building (Go >=1.7)

mkdir gopath
cd gopath
export GOPATH=$(pwd)
go get github.com/mwitkow/bazel-distcache/cmd/localcache

Running (with local caches):

mkdir -p /tmp/localcache-blobstore /tmp/localcache-actionstore
bin/localcache

Java rule failures

grpc-ecosystem/java-grpc-prometheus is a simple Java project that builds with Bazel and has tests. Note that the project's .bazelrc uses the legacy test runner and shows errors on the console.

Checking out:

git clone git@github.com:grpc-ecosystem/java-grpc-prometheus.git
cd java-grpc-prometheus

Running the tests passes without remote_cache:

bazel --host_jvm_args=-Dbazel.DigestFunction=SHA1 test //...
zsh: correct '//...' to '//..' [nyae]? n
INFO: Found 18 targets and 1 test target...
INFO: Elapsed time: 18.674s, Critical Path: 14.51s
//src/test/java/me/dinowernli/grpc/prometheus/integration:tests          PASSED in 11.4s
Executed 1 out of 1 test: 1 test passes.

You need to run bazel --host_jvm_args=-Dbazel.DigestFunction=SHA1 clean afterwards to avoid local test caching.

Running the tests with the remote cache fails, even though the localcache logs show only misses, meaning that everything was executed locally:

bazel --host_jvm_args=-Dbazel.DigestFunction=SHA1 test  --spawn_strategy=remote --remote_cache=localhost:10101 //...
INFO: Found 18 targets and 1 test target...
FAIL: //src/test/java/me/dinowernli/grpc/prometheus/integration:tests (see /home/michal/.cache/bazel/_bazel_michal/ab1f581b155b0fe49c830d89ce783951/execroot/java-grpc-prometheus/bazel-out/local-fastbuild/testlogs/src/test/java/me/dinowernli/grpc/prometheus/integration/tests/test.log).
INFO: From Testing //src/test/java/me/dinowernli/grpc/prometheus/integration:tests:
==================== Test output for //src/test/java/me/dinowernli/grpc/prometheus/integration:tests:
JUnit version 4.10
<truncated>

Time: 10.628

OK (15 tests)

================================================================================
INFO: Elapsed time: 17.279s, Critical Path: 15.36s
//src/test/java/me/dinowernli/grpc/prometheus/integration:tests          FAILED in 1 out of 2 in 11.1s
  /home/michal/.cache/bazel/_bazel_michal/ab1f581b155b0fe49c830d89ce783951/execroot/java-grpc-prometheus/bazel-out/local-fastbuild/testlogs/src/test/java/me/dinowernli/grpc/prometheus/integration/tests/test.log

Executed 1 out of 1 test: 1 fails locally.

Please note that the JUnit output says OK, but the Bazel rule output says FAILED.

Re-running with cached results:

  • successfully fetches test results from the cache (localcache indicates hits, and no tests are JUnit-run)
  • fails for all subsequent calls
  • the stdout of tests stored in the cache is the same as running locally
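
The reproduction steps above can be collected into one script. This is a minimal sketch, not part of the original report: the function names run_once and repro are hypothetical, the flags are copied from the commands shown above, and the script assumes a POSIX sh (the transcripts above were captured in zsh). It takes the target as an argument, so it is not specific to the Java project.

```shell
#!/bin/sh
# Sketch of the repro sequence: clean, then test through the remote cache, twice.
# BAZEL can be overridden, e.g. to point at a specific Bazel binary.
BAZEL="${BAZEL:-bazel}"
STARTUP="--host_jvm_args=-Dbazel.DigestFunction=SHA1"
REMOTE="--spawn_strategy=remote --remote_cache=localhost:10101"

# Hypothetical helper. $1 = label for the run, $2 = target pattern.
run_once() {
    # Clean first so Bazel's local test caching does not hide the problem.
    "$BAZEL" $STARTUP clean || return 1
    "$BAZEL" $STARTUP test $REMOTE "$2"
    echo "$1 run: bazel exit code $?"
}

# Hypothetical helper. $1 = target pattern, e.g. //... or protobuf_test
repro() {
    run_once first "$1"    # the localcache logs should show misses
    run_once second "$1"   # the localcache logs should show hits
}
```

Calling repro //... from the project checkout, with localcache running, prints the exit code of each bazel test invocation; a non-zero code on both runs corresponds to the failure described here.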

C++ rule failures

We'll use the protobuf repo.

git clone git@github.com:google/protobuf.git
cd protobuf

Again, the tests pass without the --remote_cache flag:

bazel --host_jvm_args=-Dbazel.DigestFunction=SHA1 test protobuf_test
INFO: Found 1 test target...
INFO: From Linking protobuf_test:
bazel-out/local-fastbuild/bin/_objs/protobuf_test/src/google/protobuf/testing/googletest.pic.o:googletest.cc:function google::protobuf::(anonymous namespace)::GetTemporaryDirectoryName(): warning: the use of `tmpnam' is dangerous, better use `mkstemp'
Target //:protobuf_test up-to-date:
  bazel-bin/protobuf_test
INFO: Elapsed time: 286.397s, Critical Path: 281.15s
//:protobuf_test                                                         PASSED in 22.5s

Executed 1 out of 1 test: 1 test passes.

You need to run bazel --host_jvm_args=-Dbazel.DigestFunction=SHA1 clean afterwards to avoid local test caching.

Running the tests with the remote cache fails, even though the localcache logs show only misses, meaning that everything was executed locally:

bazel --host_jvm_args=-Dbazel.DigestFunction=SHA1 test  --spawn_strategy=remote --remote_cache=localhost:10101 protobuf_test
INFO: Found 1 test target...
INFO: From Linking protobuf_test:
bazel-out/local-fastbuild/bin/_objs/protobuf_test/src/google/protobuf/testing/googletest.pic.o:googletest.cc:function google::protobuf::(anonymous namespace)::GetTemporaryDirectoryName(): warning: the use of `tmpnam' is dangerous, better use `mkstemp'
FAIL: //:protobuf_test (see /home/michal/.cache/bazel/_bazel_michal/32e3a2704ea1e3a80569712a587a1a3a/execroot/protobuf/bazel-out/local-fastbuild/testlogs/protobuf_test/test.log).
Target //:protobuf_test up-to-date:
  bazel-bin/protobuf_test
INFO: Elapsed time: 301.284s, Critical Path: 296.80s
//:protobuf_test                                                         FAILED in 1 out of 2 in 22.0s
  /home/michal/.cache/bazel/_bazel_michal/32e3a2704ea1e3a80569712a587a1a3a/execroot/protobuf/bazel-out/local-fastbuild/testlogs/protobuf_test/test.log

However:

tail -n 5 /home/michal/.cache/bazel/_bazel_michal/32e3a2704ea1e3a80569712a587a1a3a/execroot/protobuf/bazel-out/local-fastbuild/testlogs/protobuf_test/test.log
[----------] 10 tests from DifferentTypeInfoSourceTest/ProtoStreamObjectWriterOneOfsTest (2 ms total)

[----------] Global test environment tear-down
[==========] 1932 tests from 187 test cases ran. (21900 ms total)
[  PASSED  ] 1932 tests.

Re-running the tests using the cache still returns a failure, even though the results are successfully cached.
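
Both failures share the same symptom: the test runner's own log reports success while Bazel reports FAILED. The following is a minimal sketch of how that mismatch could be checked mechanically. It is not part of the original report: the function names log_reports_pass and check_mismatch are hypothetical, and the success markers are simply the summary lines visible in the logs above (JUnit's "OK (15 tests)" and googletest's "[  PASSED  ] 1932 tests.").

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the test log itself claims success.
# $1 = path to test.log
log_reports_pass() {
    grep -Eq '^OK \([0-9]+ tests?\)|^\[  PASSED  \] [0-9]+ tests?\.' "$1"
}

# Hypothetical helper: compares Bazel's verdict with the log's verdict.
# $1 = exit code of the bazel test invocation, $2 = path to test.log
check_mismatch() {
    if [ "$1" -eq 0 ]; then bazel_ok=yes; else bazel_ok=no; fi
    if log_reports_pass "$2"; then log_ok=yes; else log_ok=no; fi
    if [ "$bazel_ok" = "$log_ok" ]; then
        echo "consistent: bazel_ok=$bazel_ok log_ok=$log_ok"
    else
        echo "MISMATCH: bazel_ok=$bazel_ok log_ok=$log_ok"
    fi
}
```

Run it right after a bazel test invocation, passing "$?" and the test.log path that Bazel prints next to the FAIL line. For the runs shown above, the expected result would be a MISMATCH line with bazel_ok=no and log_ok=yes.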

Hypothesis for failure

It is worth noting that localcache doesn't mangle the ActionResult at all; it returns it verbatim. The fact that the builds fail on the first remote build indicates that the problem is in RemoteActionCache in general.

It may have to do with Bazel not obeying the return_code of the ActionResult, or with stderr being present (but empty) causing the tests to be treated as failed.
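
To make the two hypotheses concrete, here is a toy model of the decision rules involved. It is purely illustrative and is not Bazel's actual logic: both function names are hypothetical, and the "suspected" rule only encodes the guess above that a present-but-empty stderr might be read as a failure.

```shell
#!/bin/sh
# Expected rule: the verdict depends only on the return code.
# $1 = return code, $2 = path to the captured stderr (ignored here)
expected_verdict() {
    if [ "$1" -eq 0 ]; then echo PASSED; else echo FAILED; fi
}

# Suspected faulty rule (an assumption, not confirmed anywhere in this thread):
# the mere presence of a stderr file, even an empty one, counts as a failure.
suspected_verdict() {
    if [ "$1" -ne 0 ] || [ -e "$2" ]; then echo FAILED; else echo PASSED; fi
}
```

With a return code of 0 and an empty stderr file, the first rule yields PASSED and the second yields FAILED, which is the kind of disagreement that would match the observed output.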

Environment info

  • Operating System: Ubuntu 16.04

  • Bazel version (output of bazel info release):

bazel --host_jvm_args=-Dbazel.DigestFunction=SHA1 info release
release 0.4.4

CC
@ola-rozenfeld - author of RemoteCache
@dinowernli - java-grpc-prometheus author and my go-to bazel dude.

@ola-rozenfeld
Contributor

Thank you for the report, I will look into it!

@hhclam
Contributor

hhclam commented Feb 21, 2017

Maybe related to this: #1413

@ola-rozenfeld
Contributor

Oh, thank you, yes, this seems to be a duplicate of #1413.

@damienmg
Contributor

damienmg commented Mar 7, 2017

Closing as duplicate

@damienmg damienmg closed this as completed Mar 7, 2017