UI Search stops working #3487

Open
jeff-lemos opened this issue Dec 5, 2022 · 4 comments

@jeff-lemos

Describe the Bug

I'm running Zipkin in a Kubernetes deployment and storing traces in Elasticsearch. It mostly works, but it sometimes misbehaves, for example:

  • Today, running some basic searches to test query performance seemed to make the application stop working. I did nothing except run searches, yet it stopped working.
  • If I redeploy my deployment, it loses the "link" between Zipkin and Elasticsearch, so I have to change the ES_INDEX variable to make it work again, but that creates a new index from scratch.

Response error and UI error

image
image

The query

curl -X GET "http://zipkin.url/zipkin/api/v2/traces?serviceName=SERVICE_NAME&spanName=%2Fv6%2Fblock-account%2F%3Cstring%3Auser_ting%3E%2Fconfirm-block&limit=10" -H "accept: application/json"
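
For reference, the query string above can be rebuilt from its decoded parameters. This is a minimal Python sketch, assuming the decoded span name is `/v6/block-account/<string:user_ting>/confirm-block` (which is what the percent-encoded value decodes to); `traces_url` and `ZIPKIN_BASE` are illustrative names, not part of Zipkin.

```python
from urllib.parse import urlencode

# Placeholder host copied from the report; replace with your own Zipkin URL.
ZIPKIN_BASE = "http://zipkin.url/zipkin"


def traces_url(service_name: str, span_name: str, limit: int = 10) -> str:
    """Build a /api/v2/traces query URL with percent-encoded parameters."""
    # urlencode escapes "/", "<", ":" and ">" so the span name survives
    # as a single query parameter.
    query = urlencode(
        {"serviceName": service_name, "spanName": span_name, "limit": limit}
    )
    return f"{ZIPKIN_BASE}/api/v2/traces?{query}"


url = traces_url(
    "SERVICE_NAME",
    "/v6/block-account/<string:user_ting>/confirm-block",
)
print(url)
```

Building the URL this way avoids hand-escaping the span name when trying other services or span names.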

The log error

It's the same in every case where it stops working:

2022-12-05 22:19:17.942  WARN [/] 1 --- [orker-epoll-2-3] z.s.i.BodyIsExceptionMessage             : Unexpected error handling request.

com.linecorp.armeria.server.RequestTimeoutException: null
        at com.linecorp.armeria.server.RequestTimeoutException.get(RequestTimeoutException.java:36) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:467) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$setTimeoutNanosFromNow0$13(CancellationScheduler.java:293) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
        at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:391) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at java.lang.Thread.run(Unknown Source) [?:?]

Steps to Reproduce

Deploy Zipkin through Kubernetes, get it working, and then redeploy only the Zipkin part. It won't be able to search the same index; you'll have to set the ES_INDEX variable to a completely different index name to get it working again.

Expected Behaviour

Sometimes we have to redeploy the Zipkin part, for example to increase heap memory. If I do that, Zipkin stops working. I should be able to delete the deployment, recreate it, and have everything work properly, unless I change the ES_INDEX variable on purpose.

@jeff-lemos jeff-lemos added the bug label Dec 5, 2022
@jeff-lemos
Author

Complete error log

  • Index size: 380GB
  • 5 shards, 5 replicas (1 per primary)

2022-12-06 20:27:02.117  WARN [6290136d5c370272/6290136d5c370272] 1 --- [orker-epoll-2-4] z.s.i.BodyIsExceptionMessage             : Unexpected error handling request.

com.linecorp.armeria.client.ResponseTimeoutException: null
        at com.linecorp.armeria.client.ResponseTimeoutException.get(ResponseTimeoutException.java:38) ~[armeria-1.17.2.jar:?]
com.linecorp.armeria.client.ResponseTimeoutException: null
        at com.linecorp.armeria.client.ResponseTimeoutException.get(ResponseTimeoutException.java:38) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:469) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$init0$2(CancellationScheduler.java:123) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
        at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:394) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at java.lang.Thread.run(Unknown Source) [?:?]

        at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:469) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$init0$2(CancellationScheduler.java:123) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
        at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:394) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at java.lang.Thread.run(Unknown Source) [?:?]

2022-12-06 20:27:02.695  WARN [/] 1 --- [orker-epoll-2-2] z.s.i.BodyIsExceptionMessage             : Unexpected error handling request.

com.linecorp.armeria.client.ResponseTimeoutException: null
        at com.linecorp.armeria.client.ResponseTimeoutException.get(ResponseTimeoutException.java:38) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:469) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$init0$2(CancellationScheduler.java:123) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
        at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:394) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at java.lang.Thread.run(Unknown Source) [?:?]

2022-12-06 20:27:12.939  WARN [/] 1 --- [orker-epoll-2-1] z.s.i.BodyIsExceptionMessage             : Unexpected error handling request.

com.linecorp.armeria.client.ResponseTimeoutException: null
        at com.linecorp.armeria.client.ResponseTimeoutException.get(ResponseTimeoutException.java:38) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:469) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$init0$2(CancellationScheduler.java:123) ~[armeria-1.17.2.jar:?]
        at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:555) ~[armeria-1.17.2.jar:?]
        at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:394) [netty-transport-classes-epoll-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.78.Final.jar:4.1.78.Final]
        at java.lang.Thread.run(Unknown Source) [?:?]


@jeff-lemos
Author

It can't even list the service names. Connectivity is fine: I've removed all network policies, and I can find the data in the index through Kibana.

image

@taragurung

I am having the same issue. Are there any updates on this?

@LiRuihaoA
Contributor

I've run into the same issue. The Armeria client's default timeout is 10s, so the error occurs when the ES response takes longer than 10s.
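
One way to check this explanation is to time the same request with a client whose own timeout is well above 10s and see whether the measured latency crosses that threshold. The following is a rough Python sketch under that assumption; `http_get`, `time_call`, `exceeds`, and `TIMEOUT_UNDER_TEST_S` are hypothetical helper names, and the 10s value is taken from the comment above rather than verified against the deployment.

```python
import time
import urllib.request
from typing import Callable, Tuple

# Assumption: 10 s is the timeout named in the comment above.
TIMEOUT_UNDER_TEST_S = 10.0


def http_get(url: str, client_timeout_s: float = 60.0) -> bytes:
    """Fetch a URL with a client timeout well above the one under test."""
    req = urllib.request.Request(url, headers={"accept": "application/json"})
    with urllib.request.urlopen(req, timeout=client_timeout_s) as resp:
        return resp.read()


def time_call(fetch: Callable[[], bytes]) -> Tuple[bytes, float]:
    """Run fetch() and return its result with the elapsed wall-clock seconds."""
    start = time.monotonic()
    body = fetch()
    return body, time.monotonic() - start


def exceeds(elapsed_s: float, limit_s: float = TIMEOUT_UNDER_TEST_S) -> bool:
    """True when a measured latency is longer than the timeout being checked."""
    return elapsed_s > limit_s
```

For example, `body, elapsed = time_call(lambda: http_get(url))`, followed by `exceeds(elapsed)`, where `url` points directly at the Elasticsearch search endpoint for the Zipkin indices (the exact URL depends on the cluster and is not given in this thread). If the latency is consistently above 10s, Zipkin's server documentation lists an `ES_TIMEOUT` setting (in milliseconds) for the Elasticsearch client; whether raising it resolves this issue is not confirmed here.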
