Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Java Client] Avoid IllegalStateException in ClientCnx debug logs #12899

Merged

Conversation

michaeljmarshall
Copy link
Member

Motivation

When running with debug logging, the Java Client can fail when it receives a valid response from the server. The Here is the stack trace. Note that we were running a custom build of the project, so the line numbers do not necessarily line up exactly. The issue is obvious though. If you look at the logic following each of the modified debug logs, you'll notice branching logic based on whether or not some fields are set. This change simply adds guards to ensure that we do not try to get fields that are not set.

14:38:58.824 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.ClientCnx - [pulsar-proxy.mypulsar.svc.cluster.local/10.100.138.0:6650] Got exception java.lang.IllegalStateException: Field 'partitions' is not set
     at org.apache.pulsar.common.api.proto.CommandPartitionedTopicMetadataResponse.getPartitions(CommandPartitionedTopicMetadataResponse.java:38)                                                                 
     at org.apache.pulsar.client.impl.ClientCnx.handlePartitionResponse(ClientCnx.java:569)                                                                                                                       
     at org.apache.pulsar.common.protocol.PulsarDecoder.channelRead(PulsarDecoder.java:129)                                                                                                                       
     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)                                                                                                  
     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)                                                                                                  
     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)                                                                                                    
     at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)                                                                                                                
     at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)                                                                                                                    
     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)                                                                                                  
     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)                                                                                                  
     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)                                                                                                    
     at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)                                                                                                         
     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)                                                                                                  
     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)                                                                                                  
     at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)                                                                                                                  
     at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795)                                                                                     
     at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480)                                                                                                                               
     at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)                                                                                                                                        
     at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)                                                                                                              
     at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)                                                                                                                                 
     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)                                                                                                                  
     at java.base/java.lang.Thread.run(Thread.java:829)

Modifications

  • Update the ClientCnx class to check if fields are set before calling the associated get method for the field.

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

This is not a breaking change.

Documentation

  • no-need-doc

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Nov 19, 2021
@michaeljmarshall
Copy link
Member Author

@eolivelli - PTAL

@lhotari lhotari requested a review from merlimat November 19, 2021 18:08
Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@michaeljmarshall
Copy link
Member Author

/pulsarbot run-failure-checks

@merlimat merlimat added this to the 2.10.0 milestone Nov 19, 2021
@merlimat merlimat added the type/bug The PR fixed a bug or issue reported a bug label Nov 19, 2021
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

But I saw that problem without enabling DEBUG logging

@michaeljmarshall
Copy link
Member Author

/pulsarbot run-failure-checks

@michaeljmarshall
Copy link
Member Author

@eolivelli
Copy link
Contributor

I can add more debug on the debug.
Is it possible that isDebugEnabled returns 'true'?

I using the standard pulsar-client tool packaged in the docker image, with default configuration files

@lhotari
Copy link
Member

lhotari commented Nov 22, 2021

I can add more debug on the debug. Is it possible that isDebugEnabled returns 'true'?

I using the standard pulsar-client tool packaged in the docker image, with default configuration files

Yes, It is likely that isDebugEnabled returns true in the default configuration for bin/pulsar-client and bin/pulsar-perf.
I fixed this issue for the bin/pulsar script in PR #8908 .
It looks like the same problem exists for bin/pulsar-client and bin/pulsar-perf scripts. I can create a PR to fix that.

@lhotari
Copy link
Member

lhotari commented Nov 22, 2021

Is it possible that isDebugEnabled returns 'true'?
@eolivelli I created #12915 to address that issue.

@eolivelli eolivelli merged commit 32b697d into apache:master Nov 22, 2021
@eolivelli
Copy link
Contributor

thank you @lhotari !

@eolivelli
Copy link
Contributor

After porting @lhotari change to my local branch I saw the real error
So your explanation @michaeljmarshall is good !

thank you @michaeljmarshall and @lhotari

:26:53.328 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.BinaryProtoLookupService - [persistent://public/default/PAGEVIEWS_JSON] failed to get Partitioned metadata : org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: io.netty.channel.AbstractChannel$AnnotatedConnectException: connect(..) failed: Cannot assign requested address: pulsar-broker.mypulsar.svc.cluster.local/10.99.138.52:6650
org.apache.pulsar.client.api.PulsarClientException$LookupException: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: io.netty.channel.AbstractChannel$AnnotatedConnectException: connect(..) failed: Cannot assign requested address: pulsar-broker.mypulsar.svc.cluster.local/10.99.138.52:6650
	at org.apache.pulsar.client.impl.ClientCnx.getPulsarClientException(ClientCnx.java:1096) ~[com.datastax.oss-pulsar-client-original-2.8.0.1.1.7-SNAPSHOT.jar:2.8.0.1.1.7-SNAPSHOT]
	at org.apache.pulsar.client.impl.ClientCnx.handlePartitionResponse(ClientCnx.java:589) [com.datastax.oss-pulsar-client-original-2.8.0.1.1.7-SNAPSHOT.jar:2.8.0.1.1.7-SNAPSHOT]
	at org.apache.pulsar.common.protocol.PulsarDecoder.channelRead(PulsarDecoder.java:129) [com.datastax.oss-pulsar-common-2.8.0.1.1.7-SNAPSHOT.jar:2.8.0.1.1.7-SNAPSHOT]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [io.netty-netty-transport-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [io.netty-netty-transport-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [io.netty-netty-transport-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) [io.netty-netty-codec-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) [io.netty-netty-codec-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [io.netty-netty-transport-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [io.netty-netty-transport-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [io.netty-netty-transport-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [io.netty-netty-transport-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [io.netty-netty-transport-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [io.netty-netty-transport-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [io.netty-netty-transport-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) [io.netty-netty-transport-native-epoll-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480) [io.netty-netty-transport-native-epoll-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) [io.netty-netty-transport-native-epoll-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
	at java.lang.Thread.run(Thread.java:829) [?:?]

@michaeljmarshall michaeljmarshall deleted the fix-client-cnx-debug-log-errors branch November 23, 2021 18:09
codelipenghui pushed a commit that referenced this pull request Nov 26, 2021
@codelipenghui codelipenghui added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Nov 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/client cherry-picked/branch-2.8 Archived: 2.8 is end of life cherry-picked/branch-2.9 Archived: 2.9 is end of life doc-not-needed Your PR changes do not impact docs release/2.8.3 release/2.9.2 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants