Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flakey Test: ITElasticsearchHealthCheck.oneHealthy #3197

Closed
codefromthecrypt opened this issue Sep 4, 2020 · 7 comments · Fixed by #3236 or #3257
Closed

Flakey Test: ITElasticsearchHealthCheck.oneHealthy #3197

codefromthecrypt opened this issue Sep 4, 2020 · 7 comments · Fixed by #3236 or #3257

Comments

@codefromthecrypt
Copy link
Member

[ERROR] zipkin2.server.internal.elasticsearch.ITElasticsearchHealthCheck.oneHealthy  Time elapsed: 0.078 s  <<< FAILURE!

org.opentest4j.AssertionFailedError: 

Expecting:

 <false>

to be equal to:

 <true>

but was not.

	at zipkin2.server.internal.elasticsearch.ITElasticsearchHealthCheck.oneHealthy(ITElasticsearchHealthCheck.java:106)
@codefromthecrypt
Copy link
Member Author

I added info to the assertion on fail. there are similar flakes recently in actual ES tests, with an AbortedStreamException also. I didn't notice these until 0.99.9 or 1.0.0 I forget... either these will go away magically or we will have to perhaps use tracing inside the IT to print a trace when there was an exception and/or use the -Dcom.linecorp.armeria.verboseExceptions=always or https://javadoc.io/doc/com.linecorp.armeria/armeria-javadoc/latest/com/linecorp/armeria/common/util/Sampler.html#of(java.lang.String)

thx for hints @trustin

@codefromthecrypt
Copy link
Member Author

Will apply the things as master build is erring

[ERROR] zipkin2.elasticsearch.integration.ITElasticsearchStorage$ITServiceAndSpanNames.getRemoteServiceNames  Time elapsed: 0.035 s  <<< ERROR!

com.linecorp.armeria.common.stream.AbortedStreamException

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Sep 18, 2020

hmm even after changing to use system property, there's no causal chain here:

java.lang.AssertionError: Health check failed with message: null

	at zipkin2.server.internal.elasticsearch.ITElasticsearchHealthCheck.assertOk(ITElasticsearchHealthCheck.java:203)

	at zipkin2.server.internal.elasticsearch.ITElasticsearchHealthCheck.healthyThenNotHealthyThenHealthy(ITElasticsearchHealthCheck.java:141)

Caused by: com.linecorp.armeria.common.stream.AbortedStreamException

https://travis-ci.org/github/openzipkin/zipkin/builds/728217084

cc @ikhoon @minwoox @trustin

@codefromthecrypt
Copy link
Member Author

@minwoox ps if you have time, our elasticsearch tests fail on almost every PR since something around 0.99 It bounces between a couple tests on AbortedStreamException. If you have any cycles or know someone to help with this, it would be nice to have green builds again

@minwoox
Copy link
Contributor

minwoox commented Oct 9, 2020

@adriancole I have looked through the code, but I couldn't find any clue. 😅
@trustin How about making the AbortedStreamException has the stacktrace back?

@codefromthecrypt
Copy link
Member Author

when setting this finally got more details..

      <plugin>
        <artifactId>maven-surefire-plugin</artifactId>
        <version>${maven-failsafe-plugin.version}</version>
        <configuration>
          <!-- Ensures root cause ends up in the console -->
          <trimStackTrace>false</trimStackTrace>
        </configuration>
      </plugin>
[ERROR] zipkin2.server.internal.elasticsearch.ITElasticsearchHealthCheck.allHealthy  Time elapsed: 0.139 s  <<< FAILURE!

java.lang.AssertionError: Health check failed with message: null

	at zipkin2.server.internal.elasticsearch.ITElasticsearchHealthCheck.assertOk(ITElasticsearchHealthCheck.java:203)

	at zipkin2.server.internal.elasticsearch.ITElasticsearchHealthCheck.allHealthy(ITElasticsearchHealthCheck.java:101)

	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

	at java.base/java.lang.reflect.Method.invoke(Method.java:566)

	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)

	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)

	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)

	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)

	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)

	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)

	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)

	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)

	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)

	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)

	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)

	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)

	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)

	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)

	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)

	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)

	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)

	at org.junit.rules.RunRules.evaluate(RunRules.java:20)

	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)

	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)

	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)

	at org.junit.runner.JUnitCore.run(JUnitCore.java:115)

	at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)

	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)

	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)

	at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)

	at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)

	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)

	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)

	at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)

	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)

	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)

	at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)

	at org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82)

	at org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73)

	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108)

	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)

	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)

	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)

	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)

	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96)

	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75)

	at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.execute(JUnitPlatformProvider.java:188)

	at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invokeAllTests(JUnitPlatformProvider.java:154)

	at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invoke(JUnitPlatformProvider.java:128)

	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:428)

	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)

	at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:562)

	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:548)

Caused by: com.linecorp.armeria.common.stream.AbortedStreamException

	at com.linecorp.armeria.common.stream.AbortedStreamException.get(AbortedStreamException.java:39)

	at com.linecorp.armeria.common.stream.DefaultStreamMessage.abort(DefaultStreamMessage.java:154)

	at com.linecorp.armeria.common.HeaderOverridingHttpRequest.abort(HeaderOverridingHttpRequest.java:99)

	at com.linecorp.armeria.client.HttpResponseDecoder$HttpResponseWrapper.close(HttpResponseDecoder.java:299)

	at com.linecorp.armeria.client.HttpResponseDecoder$HttpResponseWrapper.close(HttpResponseDecoder.java:285)

	at com.linecorp.armeria.client.Http2ResponseDecoder.onDataRead(Http2ResponseDecoder.java:259)

	at io.netty.handler.codec.http2.Http2FrameListenerDecorator.onDataRead(Http2FrameListenerDecorator.java:36)

	at io.netty.handler.codec.http2.Http2EmptyDataFrameListener.onDataRead(Http2EmptyDataFrameListener.java:49)

	at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onDataRead(DefaultHttp2ConnectionDecoder.java:292)

	at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readDataFrame(DefaultHttp2FrameReader.java:422)

	at io.netty.handler.codec.http2.DefaultHttp2FrameReader.processPayloadState(DefaultHttp2FrameReader.java:251)

	at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readFrame(DefaultHttp2FrameReader.java:160)

	at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder.decodeFrame(DefaultHttp2ConnectionDecoder.java:174)

	at io.netty.handler.codec.http2.DecoratingHttp2ConnectionDecoder.decodeFrame(DecoratingHttp2ConnectionDecoder.java:63)

	at io.netty.handler.codec.http2.Http2ConnectionHandler$FrameDecoder.decode(Http2ConnectionHandler.java:378)

	at io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:438)

	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)

	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)

	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)

	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)

	at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)

	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)

	at io.netty.handler.flush.FlushConsolidationHandler.channelRead(FlushConsolidationHandler.java:152)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)

	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)

	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)

	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)

	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795)

	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475)

	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)

	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)

	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)

	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)

	at java.base/java.lang.Thread.run(Thread.java:834)

@minwoox
Copy link
Contributor

minwoox commented Oct 15, 2020

Oops, the response is sent before the request is fully sent. 😅
Let me change to send the response after the service gets the request fully. 😅

minwoox added a commit to minwoox/zipkin that referenced this issue Oct 15, 2020
The `HttpResponse` was sent before the request are fully sent.
So the request was aborted after getting the response.

We should change to send the response after the request is fully received.
- close openzipkin#3197
minwoox added a commit to minwoox/zipkin that referenced this issue Oct 22, 2020
This will also fix the flaky test in ITElasticsearchHealthCheck hopefully.

- close openzipkin#3197
codefromthecrypt pushed a commit that referenced this issue Oct 22, 2020
This will also fix the flaky test in ITElasticsearchHealthCheck hopefully.

- close #3197
minwoox added a commit to minwoox/armeria that referenced this issue Nov 11, 2020
Motivation:
`AbortedStreamException.get()` returns the singleton or new instance depending on the `Flags.verboseExceptionSampler()`'s decision.
However, if there's a problem in certain circumstances, it is very hard to find the cause with the singleton.
Especially, this one. openzipkin/zipkin#3197 (comment)

Modification:
- Changed to created `AbortedStreamException` instances whenvere `get()` is called.

Result:
- You now get the stack trace of `AbortedStreamExcepetion` everytime.
codefromthecrypt pushed a commit that referenced this issue Dec 4, 2023
The `HttpResponse` was sent before the request are fully sent.
So the request was aborted after getting the response.

We should change to send the response after the request is fully received.
- close #3197

Signed-off-by: Adrian Cole <adrian@tetrate.io>
codefromthecrypt pushed a commit that referenced this issue Dec 4, 2023
…3236)

The `HttpResponse` was sent before the request are fully sent.
So the request was aborted after getting the response.

We should change to send the response after the request is fully received.
- close #3197

Signed-off-by: Adrian Cole <adrian@tetrate.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants