Monitoring indices fails #84041
Pinging @elastic/stack-monitoring (Team:Monitoring)
I'd suggest opening this ticket in the Elasticsearch repo, as these errors are not Kibana-related.
Opened a ticket in the Elasticsearch repo:
I would like to reopen this issue. I would appreciate clarification from the Kibana team or the Monitoring team on the following:
After further research, this is most likely related to #76015. In preparation for #73864, we are now reading from a different set of indices; my assumption here is that the size of those indices is causing the problem. @mayya-sharipova Is there any way to confirm or deny this with the existing use cases?
There is something we can do on our side to help, and that is migrating to server-side pagination, which should greatly improve the load time of this page. I opened #87159 to track this. FWIW, we made this change for the ES nodes listing and Logstash pipelines listing pages and saw substantial improvements in load time.
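(Not from the thread, for illustration only: "server-side pagination" here means Kibana asks Elasticsearch for a single page of results per request instead of loading the whole index list at once. A minimal sketch using the standard from/size search parameters against legacy monitoring data; the index pattern, field names, and page size are assumptions, not the actual Kibana implementation.)

# Dev Tools: fetch the third page (20 results per page) of index_stats documents
GET .monitoring-es-*/_search
{
  "query": { "term": { "type": "index_stats" } },
  "sort": [{ "timestamp": "desc" }],
  "from": 40,
  "size": 20
}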
@chrisronline Thank you for your update. For the OOM case in elastic/sdh-elasticsearch#3680, the request was for a single index. For https://github.com/elastic/sdh-elasticsearch/issues/3681, I have asked @jasonyoum to confirm the size of the relevant indices.
That's right. Can we also find out the approximate size of any metricbeat indices in the cluster?
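(Not from the thread, for reference: a standard way to get approximate index sizes is the _cat/indices API; the metricbeat-* and .monitoring-* patterns below are assumptions based on the setup described in this issue.)

# Dev Tools: list metricbeat and monitoring indices, largest first
GET _cat/indices/metricbeat-*,.monitoring-*?v&h=index,pri,rep,docs.count,pri.store.size,store.size&s=store.size:desc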
For the size of metricbeat indices in the cluster of https://github.com/elastic/sdh-elasticsearch/issues/3681, I am copying Jason's response here: Based on the previously provided diagnostic result, total shard number of
The size of the metricbeat indices for the elastic/sdh-elasticsearch#3680 cluster is as follows:
One thing we can have the customer do as a temporary measure, to test the root cause, is to manually configure the relevant setting. I wouldn't recommend keeping the config around long-term because it will affect how the Stack Monitoring UI works in the near future, but we can at least use it for testing for now.
Thank you for all the updates, @chrisronline @mayya-sharipova. The case was closed after the customer agreed to upgrade to the latest version for the fix. I am closing this issue. Thanks again!
Kibana version: 7.10
Elasticsearch version: 7.10
Server OS version: Debian Buster
Browser version: Firefox 83.0
Browser OS version: Ubuntu Xenial
Original install method (e.g. download page, yum, from source, etc.): apt
Describe the bug: Trying to view Management -> Stack Monitoring -> Indices fails and leads to a lot of entries in the Elasticsearch logs since the change from legacy monitoring to monitoring with Metricbeat (7.9.1).
Steps to reproduce:
After some time I then get the following entries in the Elasticsearch logs (on several nodes):
[
Expected behavior: Show an overview of the indices.
Screenshots (if relevant):
Errors in browser console (if relevant):
Provide logs and/or server output (if relevant):
Output of one elasticsearch log:
[2020-11-23T07:40:17,219][WARN ][o.e.t.InboundHandler ] [node-2-a] handling inbound transport message [InboundMessage{Header{3377158}{7.10.0}{33130325}{false}{false}{false}{false}{NO_ACTION_NAME_FOR_RESPONSES}}] took [7750ms] which is above the warn threshold of [5000ms]
[2020-11-23T07:41:49,122][INFO ][o.e.m.j.JvmGcMonitorService] [node-2-a] [gc][243275] overhead, spent [339ms] collecting in the last [1s]
[2020-11-23T07:41:55,957][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [node-2-a] uncaught exception in thread [elasticsearch[node-2-a][search][T#393]]
org.elasticsearch.tasks.TaskCancelledException: The parent task was cancelled, shouldn't start any child tasks
at org.elasticsearch.tasks.TaskManager$CancellableTaskHolder.registerChildNode(TaskManager.java:522) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.tasks.TaskManager.registerChildNode(TaskManager.java:213) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.action.support.TransportAction.registerChildNode(TransportAction.java:56) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:86) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:75) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:412) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:545) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.action.search.TransportMultiSearchAction.executeSearch(TransportMultiSearchAction.java:149) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.action.search.TransportMultiSearchAction$1.handleResponse(TransportMultiSearchAction.java:172) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.action.search.TransportMultiSearchAction$1.onFailure(TransportMultiSearchAction.java:157) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:98) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.action.support.ContextPreservingActionListener.onFailure(ContextPreservingActionListener.java:50) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.action.ActionListener$5.onFailure(ActionListener.java:258) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.raisePhaseFailure(AbstractSearchAsyncAction.java:594) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:568) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.action.search.FetchSearchPhase$1.onFailure(FetchSearchPhase.java:100) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:39) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737) ~[elasticsearch-7.10.0.jar:7.10.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.10.0.jar:7.10.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]
at java.lang.Thread.run(Thread.java:832) [?:?]
last message repeated about 80-100 times...
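(Not part of the original report: the TaskCancelledException above indicates that the parent _msearch task issued for the Monitoring UI was cancelled, for example after the HTTP client gave up, so its child search tasks refuse to start. For anyone debugging a similar flood of these errors, the standard task management API shows which search tasks are still running and lets you cancel them; the task id below is a placeholder.)

# Dev Tools: show in-flight search tasks grouped by their parent task
GET _tasks?actions=*search*&detailed&group_by=parents

# Cancel a specific runaway task if needed (id copied from the output above)
POST _tasks/<task_id>/_cancel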
Any additional context:
Sometimes the nodes then get disconnected from the cluster. One time I had to do a full restart of the cluster.
Cluster has three nodes, 972 indices, 1944 shards, and 12 TB of data.