Bazel crashes on remote cache/ex error status during exec #7856

Closed
werkt opened this issue Mar 27, 2019 · 0 comments
Labels: team-Remote-Exec (Issues and PRs for the Execution (Remote) team), untriaged

werkt commented Mar 27, 2019

Description of the problem / feature request:

Bazel crashes with an internal error when a gRPC remote executor responds with errors that are thrown as StatusRuntimeExceptions and propagate up to the exec method of RemoteSpawnRunner.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Start a remote executor that, after delivering an ActionResult from the ActionCache for an ActionKey, responds to bytestream downloads with DEADLINE_EXCEEDED. The error is caught at the Skyframe evaluator level, which prints a less-than-useful stack trace:

Internal error thrown during build. Printing stack trace: java.lang.RuntimeException: Unrecoverable error while evaluating node 'ActionLookupData{actionLookupKey=@redacted//:REDACTED BuildConfigurationValue.Key[063aafdab8a2b820ba285eacafa0c65b] false, actionIndex=7}' (requested by nodes 'File:[[<execution_root>]REDACTED')
  at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:515)
  at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:368)
  at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
  at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
  at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 59999979815ns
  at io.grpc.Status.asRuntimeException(Status.java:526)
  at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:419)
  at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:41)
  at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:684)
  at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:41)
  at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:391)
  at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:475)
  at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63)
  at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:557)
  at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:478)
  at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:590)
  at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
  at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
  ... 3 more
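
The failure mode in the trace above can be shown in miniature. This is only a sketch, not Bazel code: `FakeStatusRuntimeException` is a hypothetical stand-in for `io.grpc.StatusRuntimeException` (so the example needs no gRPC dependency), and `download`/`exec` are made-up names for the download step and the layer that calls it. The point is that a handler written for `IOException` never sees an unchecked exception, so it keeps travelling up the stack.

```java
import java.io.IOException;

// Hypothetical stand-in for io.grpc.StatusRuntimeException (unchecked).
class FakeStatusRuntimeException extends RuntimeException {
    FakeStatusRuntimeException(String message) {
        super(message);
    }
}

class UncheckedEscapeDemo {
    // Simulates a blob download whose transport fails with an unchecked exception.
    static byte[] download(String digest) throws IOException {
        throw new FakeStatusRuntimeException("DEADLINE_EXCEEDED: deadline exceeded");
    }

    // Simulates a caller that only anticipates IOException from remote interactions.
    static String exec(String digest) {
        try {
            download(digest);
            return "ok";
        } catch (IOException e) {
            return "handled: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println(exec("abc123"));
        } catch (RuntimeException e) {
            // The IOException handler in exec() never saw the failure.
            System.out.println("escaped: " + e.getMessage());
        }
    }
}
```

Running `main` takes the `escaped:` branch, which corresponds to the exception reaching the evaluator in the trace above.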

The intended behavior is to (a) fall back to local execution if configured, or (b) print a usable error that traces back to the actual problem, but definitely (c) not crash the build, and certainly (d) not crash the daemon.
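
A rough sketch of what (a) and (b) could look like, assuming remote failures arrive as `IOException`. The `Runner` interface, `execWithFallback`, and the `localFallback` flag are hypothetical names for illustration only; Bazel's real spawn runner API is richer than this.

```java
import java.io.IOException;

class FallbackSketch {
    // Hypothetical, simplified view of something that can run an action.
    interface Runner {
        String run(String action) throws IOException;
    }

    static String execWithFallback(
            Runner remote, Runner local, boolean localFallback, String action)
            throws IOException {
        try {
            return remote.run(action);
        } catch (IOException e) {
            if (localFallback) {
                // (a) fall back to local execution if configured
                return local.run(action);
            }
            // (b) otherwise report an error that names the action and keeps the cause
            throw new IOException(
                "remote execution of '" + action + "' failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) throws IOException {
        Runner failingRemote = action -> {
            throw new IOException("DEADLINE_EXCEEDED");
        };
        Runner local = action -> "local:" + action;
        System.out.println(execWithFallback(failingRemote, local, true, "compile"));
    }
}
```

With fallback enabled the call returns the local result; with it disabled the caller gets a checked exception it can turn into a normal build failure, rather than a crash.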

What operating system are you running Bazel on?

Ubuntu 18.04

What's the output of bazel info release?

0.24.0

jin added the untriaged and team-Remote-Exec (Issues and PRs for the Execution (Remote) team) labels on Mar 27, 2019
werkt pushed a commit to werkt/bazel that referenced this issue Mar 27, 2019
Exceptions that occur during remote interactions are expected to be
wrapped in IOException for observation by the RemoteSpawn{Runner,Cache}
layers.

Fixes bazelbuild#7856
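
The wrapping described in the commit message might look roughly like the following. This is not the actual patch: `StubStatusException` is a hypothetical stand-in for `io.grpc.StatusRuntimeException`, and `Call` and `wrapRemoteErrors` are invented helper names, used here only to show the idea of converting the unchecked exception into an `IOException` at the remote boundary.

```java
import java.io.IOException;

// Hypothetical stand-in for io.grpc.StatusRuntimeException (unchecked).
class StubStatusException extends RuntimeException {
    StubStatusException(String message) {
        super(message);
    }
}

class WrapSketch {
    // Hypothetical shape of a remote interaction that may fail with a status exception.
    interface Call<T> {
        T get();
    }

    // Converts the unchecked status exception into a checked IOException,
    // keeping the original as the cause so the status is not lost.
    static <T> T wrapRemoteErrors(Call<T> call) throws IOException {
        try {
            return call.get();
        } catch (StubStatusException e) {
            throw new IOException(e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        Call<String> failing = () -> {
            throw new StubStatusException("DEADLINE_EXCEEDED");
        };
        try {
            wrapRemoteErrors(failing);
        } catch (IOException e) {
            System.out.println("caught IOException: " + e.getMessage());
        }
    }
}
```

Once the failure is an `IOException`, layers that already handle that type can observe it without any new catch clauses.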
werkt pushed a commit to werkt/bazel that referenced this issue Mar 29, 2019
Exceptions that occur during remote interactions are expected to be
wrapped in IOException for observation by the RemoteSpawn{Runner,Cache}
layers.

Fixes bazelbuild#7856

Closes bazelbuild#7860.

PiperOrigin-RevId: 240793745
katre pushed a commit that referenced this issue Mar 29, 2019
Exceptions that occur during remote interactions are expected to be
wrapped in IOException for observation by the RemoteSpawn{Runner,Cache}
layers.

Fixes #7856

Closes #7860.

PiperOrigin-RevId: 240793745