Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ClientCall.cancel and ClientCallListener.onClose can be called multiple times causing RejectedExecutionException #7921

Closed
dapengzhang0 opened this issue Feb 25, 2021 · 5 comments
Assignees
Labels

Comments

@dapengzhang0
Copy link
Member

Example stacktrace

Caused by: java.util.concurrent.RejectedExecutionException
	at io.grpc.stub.ClientCalls$ThreadlessExecutor.execute(ClientCalls.java:783)
	at io.grpc.internal.SerializingExecutor.schedule(SerializingExecutor.java:93)
	at io.grpc.internal.SerializingExecutor.execute(SerializingExecutor.java:86)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.closedInternal(ClientCallImpl.java:747)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.closed(ClientCallImpl.java:683)
	at io.grpc.internal.ForwardingClientStreamListener.closed(ForwardingClientStreamListener.java:39)
	at io.grpc.internal.InternalSubchannel$CallTracingTransport$1$1.closed(InternalSubchannel.java:697)
	at io.grpc.internal.AbstractClientStream$TransportState.closeListener(AbstractClientStream.java:459)
	at io.grpc.internal.AbstractClientStream$TransportState.access$400(AbstractClientStream.java:221)
	at io.grpc.internal.AbstractClientStream$TransportState$1.run(AbstractClientStream.java:442)
	at io.grpc.internal.AbstractClientStream$TransportState.deframerClosed(AbstractClientStream.java:278)
	at io.grpc.internal.Http2ClientStreamTransportState.deframerClosed(Http2ClientStreamTransportState.java:31)
	at io.grpc.netty.NettyClientStream$TransportState.deframerClosed(NettyClientStream.java:223)
	at io.grpc.internal.MessageDeframer.close(MessageDeframer.java:233)
	at io.grpc.internal.MessageDeframer.closeWhenComplete(MessageDeframer.java:191)
	at io.grpc.internal.AbstractStream$TransportState.closeDeframer(AbstractStream.java:203)
	at io.grpc.internal.AbstractClientStream$TransportState.transportReportStatus(AbstractClientStream.java:445)
	at io.grpc.internal.AbstractClientStream$TransportState.transportReportStatus(AbstractClientStream.java:401)
	at io.grpc.internal.AbstractClientStream$TransportState.inboundTrailersReceived(AbstractClientStream.java:384)
	at io.grpc.internal.Http2ClientStreamTransportState.transportTrailersReceived(Http2ClientStreamTransportState.java:183)
	at io.grpc.netty.NettyClientStream$TransportState.transportHeadersReceived(NettyClientStream.java:334)
	at io.grpc.netty.NettyClientHandler.onHeadersRead(NettyClientHandler.java:376)
	at io.grpc.netty.NettyClientHandler.access$1200(NettyClientHandler.java:91)
	at io.grpc.netty.NettyClientHandler$FrameListener.onHeadersRead(NettyClientHandler.java:928)
	at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onHeadersRead(DefaultHttp2ConnectionDecoder.java:373)
	at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onHeadersRead(DefaultHttp2ConnectionDecoder.java:321)
	at io.netty.handler.codec.http2.Http2InboundFrameLogger$1.onHeadersRead(Http2InboundFrameLogger.java:56)
	at io.netty.handler.codec.http2.DefaultHttp2FrameReader$2.processFragment(DefaultHttp2FrameReader.java:483)
	at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readHeadersFrame(DefaultHttp2FrameReader.java:491)
	at io.netty.handler.codec.http2.DefaultHttp2FrameReader.processPayloadState(DefaultHttp2FrameReader.java:254)
	at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readFrame(DefaultHttp2FrameReader.java:160)
	at io.netty.handler.codec.http2.Http2InboundFrameLogger.readFrame(Http2InboundFrameLogger.java:41)
	at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder.decodeFrame(DefaultHttp2ConnectionDecoder.java:174)
	at io.netty.handler.codec.http2.Http2ConnectionHandler$FrameDecoder.decode(Http2ConnectionHandler.java:378)
	at io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:438)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795)
	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:835)
@ejona86 ejona86 added this to the Next milestone Feb 25, 2021
dapengzhang0 added a commit that referenced this issue Feb 25, 2021
…ter end of RPC" (#7920)

This reverts commit 61e0f30 (#7798).

Our stub/core implementation had a bug (#7921) that might make it possible to leak cancellation through to the executor multiple times, typically when a custom interceptor is used. Revert this commit because import of grpc to google internal fails. We think the bug found in the import is legit and in our code. So we revert this change to avoid hurting users until the underlying issue is fixed.
@dapengzhang0 dapengzhang0 self-assigned this Mar 29, 2021
dapengzhang0 added a commit that referenced this issue Jun 29, 2021
We used to have two ClientStreamListener.closed() methods. One is simply calling the other with default arg. This doubles debugging (e.g. #7921) and sometimes unit testing work. Deleting the 2-arg method to cleanup.

This PR is purely refactoring.
YifeiZhuang pushed a commit to YifeiZhuang/grpc-java that referenced this issue Jul 25, 2021
We used to have two ClientStreamListener.closed() methods. One is simply calling the other with default arg. This doubles debugging (e.g. grpc#7921) and sometimes unit testing work. Deleting the 2-arg method to cleanup.

This PR is purely refactoring.
@YifeiZhuang YifeiZhuang self-assigned this Jan 18, 2022
@YifeiZhuang
Copy link
Contributor

YifeiZhuang commented Feb 3, 2022

We can consistently reproduce failures from global presubmit, e.g. blaze test --test_output=all --test_arg=--verbosity=1 //javatests/com/google/frameworks/client/data/android/cache:CachingClientInterceptorTest --test_filter=CachingClientInterceptorTest.testValidCacheOnlyDirective

From this very test case, during ThreadlessExecutor shutdown, the application is still scheduling tasks on the executor:
https://source.corp.google.com/piper///depot/google3/java/com/google/frameworks/client/data/android/interceptor/AsyncInterceptorsClientCall.java;l=519-531
because of wrapping between AsyncInterceptorsClientCall and OrderVerifyingClientCall.

debug logs: https://paste.googleplex.com/4655899021934592

I think the original change #7798 is problematic, any asynchronous interceptor call of cancel can trigger the issue. During shutdown executor, it should not throw RejectedExecutionException but keep silent?
@ejona86

@ejona86
Copy link
Member

ejona86 commented Feb 4, 2022

Once ClientCall.Listener.onClose() has been called, we should not schedule anything more onto the executor. So I think the approach would be to plumb the clientCall.cancel() such that it avoids enqueuing on the executor if the call is already dead.

@YifeiZhuang
Copy link
Contributor

fixing google3 tests to synchronize executor on ClientCall.Listener.onClose(): cl/427613846

@YifeiZhuang
Copy link
Contributor

Redoing #7798 (#8847) seems good. Closing.
The stack trace in the description looks different from the reproduction google3 failures, won't fix until we see them again.

@dapengzhang0 dapengzhang0 removed this from the Next milestone Feb 18, 2022
@dapengzhang0
Copy link
Member Author

The stacktrace in the description is caused by wrong usage in the test in google3. Clarified in document by #8913.

Use #8928 to track a similar issue that should be a grpc bug, and it might be triggered when #7798 is merged.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 20, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

3 participants