[Fleet] Fix inability to upgrade agents from 8.10.4 -> 8.11 #170974
Pinging @elastic/fleet (Team:Fleet)
@elasticmachine merge upstream
even checking whether |
Code LGTM 🚀
Yeah, I think we might want to consider introducing an explicit airgapped setting to help us "fail fast" on various network calls around the app. I can look for an issue or create one to capture that particular idea, but it's not something we need to solve in this PR.
@elasticmachine merge upstream
💛 Build succeeded, but was flaky
💔 All backports failed
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI.
…170974)

## Summary

Closes elastic#169825

This PR adds logic to Fleet's `/api/agents/available_versions` endpoint that will ensure we periodically try to fetch from the live product versions API at https://www.elastic.co/api/product_versions to make sure we have eventual consistency in the list of available agent versions.

Currently, Kibana relies entirely on a static file generated at build time from the above API. If the API isn't up-to-date with the latest agent version (e.g. kibana completed its build before agent), then that build of Kibana will never "see" the corresponding build of agent.

This API endpoint is cached for two hours to prevent overfetching from this external API, and from constantly going out to disk to read from the agent versions file.

## To do

- [x] Update unit tests
- [x] Consider airgapped environments

## On airgapped environments

In airgapped environments, we're going to try and fetch from the `product_versions` API and that request is going to fail. What we've seen happen in some environments is that these requests do not "fail fast" and instead wait until a network timeout is reached.

I'd love to avoid that timeout case and somehow detect airgapped environments and avoid calling this API at all. However, we don't have a great deterministic way to know if someone is in an airgapped environment. The best guess I think we can make is by checking whether `xpack.fleet.registryUrl` is set to something other than `https://epr.elastic.co`. Curious if anyone has thoughts on this.

## Screenshots

![image](https://github.com/elastic/kibana/assets/6766512/0906817c-0098-4b67-8791-d06730f450f6)

![image](https://github.com/elastic/kibana/assets/6766512/59e7c132-f568-470f-b48d-53761ddc2fde)

![image](https://github.com/elastic/kibana/assets/6766512/986372df-a90f-48c3-ae24-c3012e8f7730)

## To test

1. Set up Fleet Server + ES + Kibana
2. Spin up a Fleet Server running Agent v8.11.0
3. Enroll an agent running v8.10.4 (I used multipass)
4. Verify the agent can be upgraded from the UI

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>

(cherry picked from commit cd909f0)

# Conflicts:
# x-pack/plugins/fleet/server/services/agents/versions.ts
…170974) (#171039)

# Backport

This will backport the following commits from `main` to `8.11`:

- [[Fleet] Fix inability to upgrade agents from 8.10.4 -> 8.11 (#170974)](#170974)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
This PR hasn't made it into the latest BC of 8.11.1. Updating the labels.
## Summary

The 8.11.1 release notes included #170974 which didn't actually land in 8.11.1. We shipped BC2 of 8.11.1 which was built from this Kibana commit: https://github.com/elastic/kibana/commits/09feaf416f986b239b8e8ad95ecdda0f9d56ebec. The PR was not merged until after this commit, so the bug is still present (though [mitigated slightly](#169825 (comment))) in 8.11.1.

This PR removes the erroneous release note from the 8.11.1 release notes. How can we make sure the fix _does_ get included in the eventual 8.11.2 release notes?
…71200) (#171249)

# Backport

This will backport the following commits from `main` to `8.11`:

- [[Fleet] Remove agent upgrade fix from 8.11.1 release notes (#171200)](#171200)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Kyle Pollich <kyle.pollich@elastic.co>
## Summary

Closes #169825

This PR adds logic to Fleet's `/api/agents/available_versions` endpoint that will ensure we periodically try to fetch from the live product versions API at https://www.elastic.co/api/product_versions to make sure we have eventual consistency in the list of available agent versions.

Currently, Kibana relies entirely on a static file generated at build time from the above API. If the API isn't up-to-date with the latest agent version (e.g. kibana completed its build before agent), then that build of Kibana will never "see" the corresponding build of agent.

This API endpoint is cached for two hours to prevent overfetching from this external API, and from constantly going out to disk to read from the agent versions file.
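To make the caching and "eventual consistency" idea concrete, here is a minimal sketch. It is not the code from this PR: `mergeVersions`, `cached`, `compareDesc`, and `CACHE_TTL_MS` are hypothetical names, the version lists are assumed to be plain `major.minor.patch` strings, and the clock is injectable purely so the behavior can be checked without waiting two hours.

```typescript
// Hypothetical sketch, not the PR's implementation.
// Two hours, matching the cache duration described above.
const CACHE_TTL_MS = 2 * 60 * 60 * 1000;

// Order "major.minor.patch" strings newest first.
// Assumption: any suffix such as "-SNAPSHOT" is ignored by parseInt.
function compareDesc(a: string, b: string): number {
  const pa = a.split('.').map((part) => parseInt(part, 10));
  const pb = b.split('.').map((part) => parseInt(part, 10));
  for (let i = 0; i < 3; i++) {
    const diff = (pb[i] ?? 0) - (pa[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Union of the build-time list and the live list, de-duplicated, newest first.
// If the live fetch failed, pass an empty array and the static list is used as-is.
function mergeVersions(fromFile: string[], fromApi: string[]): string[] {
  return Array.from(new Set([...fromFile, ...fromApi])).sort(compareDesc);
}

// Wrap a loader so it runs at most once per TTL window.
// Note: if `load` returns a Promise, the Promise itself is cached,
// including a rejected one, until the window expires.
function cached<T>(
  load: () => T,
  ttlMs: number,
  now: () => number = Date.now
): () => T {
  let entry: { value: T; loadedAt: number } | undefined;
  return () => {
    const t = now();
    if (entry === undefined || t - entry.loadedAt >= ttlMs) {
      entry = { value: load(), loadedAt: t };
    }
    return entry.value;
  };
}
```

In a handler, one might wrap a loader that reads the static file, attempts the live fetch, and returns `mergeVersions(fileVersions, apiVersions)`. Both the disk read and the external call would then happen at most once per two-hour window.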
## To do

- [x] Update unit tests
- [x] Consider airgapped environments
## On airgapped environments

In airgapped environments, we're going to try and fetch from the `product_versions` API and that request is going to fail. What we've seen happen in some environments is that these requests do not "fail fast" and instead wait until a network timeout is reached.

I'd love to avoid that timeout case and somehow detect airgapped environments and avoid calling this API at all. However, we don't have a great deterministic way to know if someone is in an airgapped environment. The best guess I think we can make is by checking whether `xpack.fleet.registryUrl` is set to something other than `https://epr.elastic.co`. Curious if anyone has thoughts on this.

## Screenshots

![image](https://github.com/elastic/kibana/assets/6766512/0906817c-0098-4b67-8791-d06730f450f6)

![image](https://github.com/elastic/kibana/assets/6766512/59e7c132-f568-470f-b48d-53761ddc2fde)

![image](https://github.com/elastic/kibana/assets/6766512/986372df-a90f-48c3-ae24-c3012e8f7730)
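## To test

1. Set up Fleet Server + ES + Kibana
2. Spin up a Fleet Server running Agent v8.11.0
3. Enroll an agent running v8.10.4 (I used multipass)
4. Verify the agent can be upgraded from the UI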
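For reference, the "best guess" airgap check from the "On airgapped environments" section could look roughly like the sketch below. It is only an illustration under stated assumptions: `looksAirgapped`, `shouldFetchLiveVersions`, and `DEFAULT_REGISTRY_URL` are hypothetical names, an unset `xpack.fleet.registryUrl` is assumed to mean the default registry is in use, and a custom registry URL is treated as a hint rather than proof of an airgapped deployment.

```typescript
// Hypothetical sketch of the heuristic discussed above, not the PR's code.
const DEFAULT_REGISTRY_URL = 'https://epr.elastic.co';

// Ignore surrounding whitespace and trailing slashes when comparing URLs.
function normalizeUrl(url: string): string {
  return url.trim().replace(/\/+$/, '');
}

// Guess whether this deployment is airgapped from the configured registry URL.
// Assumption: undefined or empty means the setting was not overridden.
function looksAirgapped(registryUrl: string | undefined): boolean {
  if (registryUrl === undefined || registryUrl.trim() === '') {
    return false;
  }
  return normalizeUrl(registryUrl) !== normalizeUrl(DEFAULT_REGISTRY_URL);
}

// Only call the live product versions API when the heuristic says it is
// likely reachable; otherwise rely on the static build-time list alone.
function shouldFetchLiveVersions(registryUrl: string | undefined): boolean {
  return !looksAirgapped(registryUrl);
}
```

The trade-off is the one raised in the thread: this avoids a slow network timeout in most airgapped setups, but it can misclassify an online deployment that simply uses a custom registry, which is why an explicit airgapped setting was suggested as a follow-up.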