[Fleet]: Unhealthy agent output badge is not removed on editing incorrect output when agent is not connected. #3334
Comments
Pinging @elastic/fleet (Team:Fleet)
@manishgupta-qasource Please review.
Secondary review for this ticket is done.
…pdated time (#177685)

## Summary

Closes https://github.com/elastic/kibana/issues/174008

Added a filter when querying the remote ES output health status, so that only results after the last update time of the output (`updated_at` field of the SO) are returned. This makes the health status reporting more accurate: old statuses no longer stay on the UI, and only the latest status after the last update is shown. If the output query errors out or the `updated_at` field is not present, the filter is omitted.

To verify:
- create a remote ES output (can be the same as the local ES), use it as monitoring output of an agent policy
- enroll an agent to this agent policy
- update the output to use an invalid host url
- wait until the remote ES output shows up with error state on the UI
- stop the Fleet-server
- update the remote ES output to use a correct host url
- wait until the remote ES output status is cleared on the UI
- start Fleet-server, wait until the agent checks in again (can be a few minutes)
- verify that the remote ES output status shows up as healthy on the UI

Invalid url:
<img width="581" alt="image" src="https://github.com/elastic/kibana/assets/90178898/b8a98cb1-4a1b-4d74-b260-b95bf8eaac62">

Fleet-server stopped and updated to valid url:
<img width="1133" alt="image" src="https://github.com/elastic/kibana/assets/90178898/0e8a047f-48d8-4a3e-90e5-9a2ae1c2f874">

Fleet-server restarted:
<img width="1131" alt="image" src="https://github.com/elastic/kibana/assets/90178898/0cf642e5-b26f-41d7-ad45-acc2c6c6111f">

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
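The filter described in the summary can be pictured with a minimal sketch. This is not the code from the PR: the field names `output` and `@timestamp`, the helper names, the synchronous lookup, and the inclusive `gte` comparison are all assumptions made for illustration. It only shows the stated behaviour, namely that the time filter is added when `updated_at` is known and omitted when the output lookup fails or the field is missing.

```typescript
// Hypothetical shapes: `output` and `@timestamp` are assumed field names,
// not taken from the Kibana source.
interface OutputSavedObject {
  id: string;
  updated_at?: string; // ISO timestamp of the last edit, may be absent
}

type QueryClause =
  | { term: Record<string, string> }
  | { range: Record<string, { gte: string }> };

interface HealthQuery {
  bool: { filter: QueryClause[] };
}

// Build the query for the health documents of one output.
// The range clause is added only when a last-update time is known.
function buildHealthQuery(outputId: string, updatedAt?: string): HealthQuery {
  const filter: QueryClause[] = [{ term: { output: outputId } }];
  if (updatedAt) {
    filter.push({ range: { '@timestamp': { gte: updatedAt } } });
  }
  return { bool: { filter } };
}

// Look up the output; if the lookup throws, fall back to the unfiltered query.
function queryForOutput(
  outputId: string,
  getOutput: (id: string) => OutputSavedObject
): HealthQuery {
  let updatedAt: string | undefined;
  try {
    updatedAt = getOutput(outputId).updated_at;
  } catch {
    updatedAt = undefined;
  }
  return buildHealthQuery(outputId, updatedAt);
}
```

With a query of this shape, a health document written before the output was last edited falls outside the range and is not returned, so the UI has no stale status to show until a newer document arrives.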
…pdated time (elastic#177685)

(cherry picked from commit 2005cef)
… last updated time (#177685) (#177711)

# Backport

This will backport the following commits from `main` to `8.13`:
- [[Fleet] only show remote ES output health status if later than last updated time (#177685)](#177685)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Julia Bardi <90178898+juliaElastic@users.noreply.github.com>
We have revalidated this issue on the latest 8.13.0 BC3 Kibana cloud environment and found it still reproducible.
Observations:
Build details:
Screen Recording: Agents.-.Fleet.-.Elastic.-.Google.Chrome.2024-03-05.17-50-57.mp4
Hence we are reopening this issue. Thanks!
I had a look, and the issue is that the remote ES config is only updated on agent checkin/ack. If the agent is stopped, the fleet-server monitor doesn't receive the new (correct) config and incorrectly keeps doing the health check with the old (incorrect) config. I'll look into how to fix this.
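The behaviour described in this comment can be modelled with a small toy example. It is not fleet-server code, and every name in it (`RemoteOutputMonitor`, `onAgentCheckin`, `healthCheckTarget`, the example hosts) is hypothetical. It only illustrates why a health check keeps targeting the old host when the config it uses is refreshed exclusively on agent checkin.

```typescript
// Toy model, not fleet-server code: all names here are hypothetical.
interface OutputConfig {
  hosts: string[];
}

class RemoteOutputMonitor {
  // Latest config as saved by the user (for example after editing the output).
  private latest: Map<string, OutputConfig>;
  // Config the health check actually uses; refreshed only on agent checkin.
  private cached: Map<string, OutputConfig> = new Map();

  constructor(latest: Map<string, OutputConfig>) {
    this.latest = latest;
  }

  // Called on agent checkin/ack: copy the latest config into the cache.
  onAgentCheckin(outputId: string): void {
    const cfg = this.latest.get(outputId);
    if (cfg) {
      this.cached.set(outputId, cfg);
    }
  }

  // Host the periodic health check would contact for this output.
  healthCheckTarget(outputId: string): string | undefined {
    return this.cached.get(outputId)?.hosts[0];
  }
}

// Usage: the output is corrected while no agent checks in.
const latestConfigs = new Map<string, OutputConfig>([
  ['remote1', { hosts: ['https://invalid.example:9200'] }],
]);
const monitor = new RemoteOutputMonitor(latestConfigs);
monitor.onAgentCheckin('remote1');
latestConfigs.set('remote1', { hosts: ['https://valid.example:9200'] });
const targetWhileAgentStopped = monitor.healthCheckTarget('remote1');
monitor.onAgentCheckin('remote1');
const targetAfterCheckin = monitor.healthCheckTarget('remote1');
```

In this model the health check keeps using the invalid host until the next checkin, which matches the reported symptom that the unhealthy status persists while the agent is not connected.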
Hi Team,
We have revalidated this issue on the latest 8.13.0 BC7 Kibana cloud environment and found it fixed now.
Observations:
Screen Recording: Agents.-.Fleet.-.Elastic.-.Google.Chrome.2024-03-26.12-32-13.mp4
Build details:
Hence, we are marking this issue as QA:Validated.
Kibana Build details:
Host OS: All
Preconditions:
Steps to reproduce:
Screen Recording:
Agents.-.Fleet.-.Elastic.-.Google.Chrome.2023-12-27.19-07-53.mp4
Settings.-.Fleet.-.Elastic.-.Google.Chrome.2023-12-27.19-09-27.mp4
Expected Result:
The unhealthy agent output badge should be removed when the incorrect output is edited while the agent is not connected, and the new status should be shown once the agent connects again.
Feature:
elastic/kibana#104986