[ResponseOps][Task Manager] "elastic-product" not present in some task manager requests #189306
Comments
Pinging @elastic/response-ops (Team:ResponseOps)
Noting there are more of these:
These seem to be different places in TM where we do I/O and this can happen. We're going to need some interesting solution for dealing with code running after "shutdown", since it does seem like we're still running tasks, etc. at that time.
Another ... just noticed this, not sure how often it happens - maybe just if we exit Kibana during a claim cycle?
There's no need for the log message
Towards: #189306. This PR fixes the `Deleting current node has failed.` errors mentioned in the above issue.
"Deleting current node" logs have been fixed with #191218
Moving to backlog given we've fixed the newly introduced problem, but eventually we'll want to investigate the other sources related to stopping tasks that are currently running.
Just a note and a link to the PR "Stop polling on Kibana shutdown" - I suspect most of the cases of seeing the message about the missing header will go away when this PR is merged. I think the message was caused by some Kibana plugins removing, during their shutdown, the http context bits that add the header. With the PR we should see task manager itself stop making ES calls, but I'm guessing we will see some stragglers:
The first - and other cases of task manager making ES calls after shutdown - I'm guessing we can fix once we see them. The second is harder, but we could probably NOT log errors like this after shutdown, if we know they are ES errors. Or maybe log them at debug level. Or just live with it - it could be interesting diagnostic info. So ... I suggest we do a before/after comparison once the PR merges, then figure out whether we want to do some more work and leave this issue open to track that - or decide the volume is low enough that it's good enough for now.
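To make the "don't log, or log in debug" idea concrete, here is a minimal sketch. It is not task manager code: `logLevelFor`, `isProductCheckError`, and the `shutdownStarted` flag are hypothetical names, and matching on the message text is an assumption based only on the strings quoted in this issue.

```typescript
// Hypothetical helper; names are illustrative and not part of task manager.
type LogLevel = "error" | "debug";

// Substring shared by the messages quoted in this issue (assumed stable).
const PRODUCT_CHECK_TEXT = "elastic-product not present";

function isProductCheckError(err: Error): boolean {
  return err.message.includes(PRODUCT_CHECK_TEXT);
}

// Demote product-check failures to debug only once shutdown has started;
// every other error, and every error before shutdown, stays at error level.
function logLevelFor(err: Error, shutdownStarted: boolean): LogLevel {
  return shutdownStarted && isProductCheckError(err) ? "debug" : "error";
}

const sample = new Error("x-elastic-product not present or not recognized: Not Found");
console.log(logLevelFor(sample, true)); // prints "debug"
console.log(logLevelFor(sample, false)); // prints "error"
```

Keeping the demotion conditional on shutdown means the same message would still show up as an error during normal operation, where it likely points at a real problem.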
Closing in favor of: #195817
Trolling the serverless logs, I came across some messages from - I think - the new discovery service mostly, and some coming through running tasks (I assume those may be from some unknown ES requests the task is making):
Deleting current node has failed. error: x-elastic-product not present or not recognized: Saved object [background-task-node/b89846bd-5560-45f6-9a11-1a46df30c279] not found
Task endpoint:user-artifact-packager "endpoint:user-artifact-packager:1.0.0" failed in attempt to run: x-elastic-product not present or not recognized: Not Found
Some telemetry code is also generating error messages with `elastic-product not present`, but there are windows of time (like 2 days, earlier this week) where the messages were not being generated.

For the "Deleting current node" message, I noticed it is generated here:
kibana/x-pack/plugins/task_manager/server/kibana_discovery_service/kibana_discovery_service.ts
Lines 102 to 109 in 7db2868
which is only called from here:
kibana/x-pack/plugins/task_manager/server/plugin.ts
Lines 398 to 403 in 7db2868
I'm thinking the problem is that we're running after Kibana has basically shut down. Here's what we should do instead: `plugin::stop` can be async, to make Kibana wait for this to finish before completing shutdown.

More importantly, what happens when Kibana crashes and doesn't run this code? Should we have it try to clean up old docs still hanging around?
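A rough sketch of both ideas, under stated assumptions: `PluginSketch`, `DiscoverySketch`, `NodeDocSketch`, and `findStaleNodes` are hypothetical stand-ins rather than the actual plugin or discovery service types, and the `lastSeenMs` heartbeat field is assumed for illustration. The first part shows `stop` returning a promise that settles only after the node document delete finishes or fails. The second shows how a later pass might pick out documents left behind by a crash.

```typescript
// Hypothetical stand-ins for the real services; names are illustrative only.
interface DiscoverySketch {
  deleteCurrentNode(): Promise<void>;
}

class PluginSketch {
  private readonly discovery: DiscoverySketch;
  private readonly logError: (msg: string) => void;

  constructor(discovery: DiscoverySketch, logError: (msg: string) => void) {
    this.discovery = discovery;
    this.logError = logError;
  }

  // Returning a promise means a host that awaits stop() does not finish
  // shutting down until the delete has settled, successfully or not.
  async stop(): Promise<void> {
    try {
      await this.discovery.deleteCurrentNode();
    } catch (e) {
      const message = e instanceof Error ? e.message : String(e);
      this.logError(`Deleting current node has failed. error: ${message}`);
    }
  }
}

// Crash case: stop() never runs, so leftovers have to be found some other way.
interface NodeDocSketch {
  id: string;
  lastSeenMs: number; // assumed heartbeat timestamp, epoch milliseconds
}

// Select node documents whose last heartbeat is older than maxAgeMs.
function findStaleNodes(
  nodes: NodeDocSketch[],
  nowMs: number,
  maxAgeMs: number
): NodeDocSketch[] {
  return nodes.filter((n) => nowMs - n.lastSeenMs > maxAgeMs);
}
```

Catching inside `stop` keeps a failed delete from rejecting the shutdown promise. Whether a surviving node or the next startup should run the stale-document pass is left open here.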