[CI] TransportServiceHandshakeTests testConnectToNodeLight failing #85156

Closed
pgomulka opened this issue Mar 21, 2022 · 4 comments
Labels
:Distributed Coordination/Network (Http and internode communication implementations)
Team:Distributed (Obsolete) (Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.)
>test-failure (Triaged test failures from CI)

Comments

@pgomulka
Contributor

Build scan:
https://gradle-enterprise.elastic.co/s/ry5q53b56iugw/tests/:server:test/org.elasticsearch.transport.TransportServiceHandshakeTests/testConnectToNodeLight

Reproduction line:
./gradlew ':server:test' --tests "org.elasticsearch.transport.TransportServiceHandshakeTests.testConnectToNodeLight" -Dtests.seed=F102EBFC20BCB1FB -Dtests.locale=et -Dtests.timezone=Africa/Mogadishu -Druntime.java=17

Applicable branches:
master

Reproduces locally?:
No

Failure history:
https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.transport.TransportServiceHandshakeTests&tests.test=testConnectToNodeLight

Failure excerpt:

com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=820, name=Thread-89, state=RUNNABLE, group=TGRP-TransportServiceHandshakeTests]

  at __randomizedtesting.SeedInfo.seed([F102EBFC20BCB1FB:4BFF542966FA3AD0]:0)

  Caused by: java.lang.AssertionError: cluster:monitor/state

    at __randomizedtesting.SeedInfo.seed([F102EBFC20BCB1FB]:0)
    at org.elasticsearch.transport.InboundAggregator.lambda$new$0(InboundAggregator.java:46)
    at org.elasticsearch.transport.InboundAggregator.initializeRequestState(InboundAggregator.java:197)
    at org.elasticsearch.transport.InboundAggregator.headerReceived(InboundAggregator.java:66)
    at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:138)
    at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:121)
    at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:86)
    at org.elasticsearch.transport.netty4.Netty4MessageInboundHandler.channelRead(Netty4MessageInboundHandler.java:63)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:280)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:623)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:586)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.lang.Thread.run(Thread.java:833)

@pgomulka added the :Core/Infra/Transport API and >test-failure labels on Mar 21, 2022
@elasticmachine added the Team:Core/Infra label on Mar 21, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@DaveCTurner
Contributor

This is likely a duplicate of #85589, and I believe the problem is environmental. Relabelling this as :Distributed/Network to collect them together.

@DaveCTurner added the :Distributed Coordination/Network label and removed the :Core/Infra/Transport API and Team:Core/Infra labels on Apr 1, 2022
@elasticmachine added the Team:Distributed (Obsolete) label on Apr 1, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner self-assigned this on Apr 1, 2022
@DaveCTurner
Contributor

This looks to be caused by different test workers using the same port ranges. #85777 will mostly mitigate this, and #85786 tracks work towards a more robust fix. I'm closing this as the mitigation is now in place, and there's nothing particularly special about the tests involved in this failure that needs further attention.
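
For readers unfamiliar with this class of failure, here is a minimal sketch of how port-range collisions between test workers can be avoided in principle: each worker derives its own slice of the port space from its worker ID. This is an illustration only, not the change made in #85777. The class `WorkerPortRanges`, the record `PortRange`, the method `rangeFor`, and the numbers in the usage note are all hypothetical.

```java
// Hypothetical illustration: give every test worker a disjoint port range.
// None of these names come from the Elasticsearch code base.
public final class WorkerPortRanges {

    private WorkerPortRanges() {}

    /** An inclusive range of TCP ports, [first, last]. */
    public record PortRange(int first, int last) {
        /** True if the two inclusive ranges share at least one port. */
        public boolean overlaps(PortRange other) {
            return first <= other.last && other.first <= last;
        }
    }

    /**
     * Computes the port range for one worker.
     *
     * @param workerId       zero-based ID of the test worker (assumed unique per worker)
     * @param basePort       first port of worker 0's range
     * @param portsPerWorker number of ports reserved for each worker
     */
    public static PortRange rangeFor(int workerId, int basePort, int portsPerWorker) {
        if (workerId < 0 || basePort < 1 || portsPerWorker < 1) {
            throw new IllegalArgumentException("workerId must be >= 0, basePort and portsPerWorker >= 1");
        }
        // Use long arithmetic so a large worker ID cannot overflow silently.
        long first = (long) basePort + (long) workerId * portsPerWorker;
        long last = first + portsPerWorker - 1;
        if (last > 65535) {
            throw new IllegalArgumentException("port range exceeds 65535 for worker " + workerId);
        }
        return new PortRange((int) first, (int) last);
    }
}
```

With an assumed base port of 10300 and 100 ports per worker, worker 0 would bind within 10300 to 10399 and worker 1 within 10400 to 10499, so a node started by one worker never receives a request such as `cluster:monitor/state` that was meant for a node in another worker's test. Under Gradle, the worker ID could plausibly be taken from the `org.gradle.test.worker` system property, though how the real test framework derives it is not stated in this thread.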
