[Fleet] Installing an uninstalled bundled package should not require a reachable EPR in integrations UI #136617

Closed
axw opened this issue Jul 19, 2022 · 11 comments · Fixed by #202435
Assignees
Labels
bug Fixes for quality problems that affect the customer experience Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@axw
Member

axw commented Jul 19, 2022

Some integration packages, like APM, are bundled with Kibana. These can be installed using the Fleet preconfiguration settings. I would also expect them to be usable without preconfiguration, by installing the package via the Integrations UI.

When navigating to the Integrations UI with EPR inaccessible, the UI presents a spinner and the well-defined Elastic integrations at the top:

image

If you click on Elastic APM, it takes you to the APM tutorial, and you would have to navigate back to the Integrations UI to install the package. The link back to the Integrations UI spins, because EPR is inaccessible:

image

Eventually it loads, and then you get taken to the APM integration, which fails to load:

image

@axw axw added bug Fixes for quality problems that affect the customer experience Team:Fleet Team label for Observability Data Collection Fleet team labels Jul 19, 2022
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@jen-huang
Contributor

@axw Thanks for filing! We actually had this tracked in #134326 but I'm going to close that one in favor of yours as yours also includes the problem on the package detail page.

@axw
Member Author

axw commented Jul 22, 2022

@jen-huang not sure if it warrants two separate issues or not, but I think they're a bit different. #134326 is about listing installed integrations without EPR access, whereas this issue is about listing uninstalled but bundled packages without EPR access.

@juliaElastic
Contributor

juliaElastic commented Dec 8, 2022

I found a similar issue mentioned here: package assets (icons, readme) do not load on the Integration Overview page if the package is not in EPR, e.g. apm-8.6.0.

@criamico
Contributor

criamico commented Feb 5, 2024

I tried to reproduce locally with the following procedure on Kibana 8.12.0 (dev).

  • Set up kibana.dev.yml:
    • Configure xpack.fleet.registryUrl to an unreachable host
    • Make sure that APM and other bundled packages are not present in preconfiguration
  • Downloaded apm and endpoint from EPR and copied them into kibana/x-pack/plugins/fleet/target/bundled_packages. This should ensure that the bundled packages are present.
  • Go to an uninstalled bundled package page, in my case endpoint (apm was already installed). The page app/integrations/detail/endpoint-8.12.0/overview loads properly with no errors.
  • Try to add the integration. The integration policy editor loads properly:
    Screenshot 2024-02-05 at 16 47 56
  • Click Save and Continue. Here's where the process fails, but in a different way than what's described above. The page spins for a while and in the logs I can see Failed to fetch latest version of endpoint from registry: Error connecting to package registry: request to http://localhost:8080/search?package=endpoint&prerelease=false&kibana.version=8.13.0 failed, reason:. Then it seems to install but fails after a while with:
[2024-02-05T16:26:06.561+01:00][DEBUG][plugins.fleet] retrieved installed package endpoint-8.12.0 from ES
[2024-02-05T16:26:06.561+01:00][DEBUG][plugins.fleet] retrieved installed package endpoint-8.12.0 from ES
[2024-02-05T16:26:06.565+01:00][DEBUG][plugins.fleet] Creating new package policy
[2024-02-05T16:26:06.565+01:00][DEBUG][plugins.fleet] Creating new package policy
[2024-02-05T16:26:06.565+01:00][DEBUG][plugins.fleet] Running 6 external callbacks for packagePolicyCreate
[2024-02-05T16:26:06.565+01:00][DEBUG][plugins.fleet] Running 6 external callbacks for packagePolicyCreate
[2024-02-05T16:26:07.090+01:00][ERROR][plugins.fleet] An integration policy with the name New already exists. Please rename it or choose a different name.
[2024-02-05T16:26:07.090+01:00][ERROR][plugins.fleet] An integration policy with the name New already exists. Please rename it or choose a different name.

This is also visible in the page:
Screenshot 2024-02-05 at 16 26 16

So it seems that we get past the error described above, which could be explained by the fact that this issue is quite old and this area of the code has changed since. The error is now about the package policy name already existing, but I'm not sure where it's coming from. I'll dig a little bit further.

@kpollich Are my assumptions about the configuration correct? I'm not sure if my results are different because of different configuration or this code path just changed.

@criamico
Contributor

criamico commented Feb 5, 2024

Also, by reading the related issues it seems that a very good enhancement could be done by implementing what's described here. This would allow the requests to be shortened and getting to the error much faster. In my test it took at least a couple of minutes to get to the above error, and the whole navigation was really slow.

@kpollich
Member

kpollich commented Feb 5, 2024

> Also, by reading the related issues it seems that a very good enhancement could be done by implementing what's described here. This would allow the requests to be shortened and getting to the error much faster. In my test it took at least a couple of minutes to get to the above error, and the whole navigation was really slow.

Yep, this is accurate. We do have the xpack.fleet.isAirGapped flag now, which should be usable for the same purpose.

> Try to add the integration. The integration policy editor loads properly:

I think the only difference between your steps to reproduce and what was initially reported here is that other parts of the Integrations app were causing issues, namely the integration details page. I think this may have changed recently too, as we now use the packageInfo object sourced from the package info as the "source of truth" for much of this UI.

@criamico
Contributor

criamico commented Feb 7, 2024

Summarizing here what I found so far. With the setup described in #136617 (comment), I tried to install apm via API on an existing agent policy:

POST kbn:/api/fleet/package_policies
{
  "name": "apm-new-8",
  "policy_id": "4a38a321-7020-4cc2-ad9a-ac57ac33c499",
  "package": {
    "name": "apm",
    "version": "8.12.0-preview-1701143518"
  }
}

The request fails with 409 conflict:

 An integration policy with the name apm-new-8 already exists. Please rename it or choose a different name.
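
To repeat the request above outside the Dev Tools console, something like the following sketch could be used. The helper name `buildCreatePackagePolicyRequest` and the base URL `http://localhost:5601` are assumptions for illustration. Authentication is deliberately left out and would need to be added for a real cluster. The `kbn-xsrf` header is the one Kibana expects on non-GET API calls.

```typescript
// Body shape taken from the console request above.
interface NewPackagePolicy {
  name: string;
  policy_id: string;
  package: { name: string; version: string };
}

// Hypothetical helper: builds the URL and fetch options without sending anything.
function buildCreatePackagePolicyRequest(kibanaUrl: string, policy: NewPackagePolicy) {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${kibanaUrl.replace(/\/+$/, '')}/api/fleet/package_policies`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        // Kibana rejects non-GET API requests that lack this header.
        'kbn-xsrf': 'true',
      },
      body: JSON.stringify(policy),
    },
  };
}

// Assumed local dev URL; the policy values are the ones from the request above.
const createRequest = buildCreatePackagePolicyRequest('http://localhost:5601/', {
  name: 'apm-new-8',
  policy_id: '4a38a321-7020-4cc2-ad9a-ac57ac33c499',
  package: { name: 'apm', version: '8.12.0-preview-1701143518' },
});

console.log(createRequest.url);
```

The returned `url` and `init` can be passed to `fetch(url, init)`. Checking `response.status` for 409 would then surface the same conflict.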

Screenshot 2024-02-07 at 17 28 36

It's interesting that the package policy appears as installed in the UI, even though the request completes with the error.
Screenshot 2024-02-07 at 17 38 53

Investigation

The reason seems to be that PackagePolicyService.create gets called at least twice with the same package policy name, hence causing the conflict.
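
If the double call is indeed the cause, the symptom can be reproduced in miniature. The sketch below is not Fleet's code: `InMemoryPolicyStore` is a hypothetical stand-in that only enforces unique names. It shows why the caller sees a 409 while the policy is still persisted by the first call and therefore visible afterwards.

```typescript
interface PackagePolicy {
  name: string;
  packageName: string;
}

type CreateResult =
  | { statusCode: 200; item: PackagePolicy }
  | { statusCode: 409; message: string };

// Hypothetical store that rejects a second policy with an existing name.
class InMemoryPolicyStore {
  private policies: { [name: string]: PackagePolicy } = {};

  create(policy: PackagePolicy): CreateResult {
    if (Object.prototype.hasOwnProperty.call(this.policies, policy.name)) {
      return {
        statusCode: 409,
        message: `An integration policy with the name ${policy.name} already exists.`,
      };
    }
    this.policies[policy.name] = policy;
    return { statusCode: 200, item: policy };
  }

  names(): string[] {
    return Object.keys(this.policies);
  }
}

const policyStore = new InMemoryPolicyStore();
const apmPolicy: PackagePolicy = { name: 'apm-new-8', packageName: 'apm' };

// First call persists the policy; the second call with the same name conflicts.
const firstCreate = policyStore.create(apmPolicy);
const secondCreate = policyStore.create(apmPolicy);

console.log(firstCreate.statusCode, secondCreate.statusCode, policyStore.names());
```

If the second result is the one reported to the caller, the request appears to fail although exactly one policy was created.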

I checked the external callbacks visible in the logs:
Screenshot 2024-02-07 at 12 27 48

@kpollich pointed me to the apm callbacks that trigger a second update of the package policy after creation, but it doesn't explain why the whole creation process happens again after the first time.

I'm attaching Kibana logs below since they're too long to be pasted here. The log line "Creating new package policy" (pointing here) is emitted in two distinct places, and the second time the policy creation fails.

kibana_logs.json

Currently I'm not sure of the root cause of this error. The same process happens when the registry is reachable, but no conflict error occurs.

@kpollich
Member

kpollich commented Feb 9, 2024

@criamico - Thanks for investigating this. I'm trying to get a repro locally but running into some trouble. Even with the bundled packages set up, a basic GET request for a package spins for several minutes until I eventually get a 502, e.g.

GET kbn:/api/fleet/epm/packages/apm/8.13.0
{
  "statusCode": 502,
  "error": "Bad Gateway",
  "message": "Error connecting to package registry: request to http://localhost:8080/package/apm/8.13.0/ failed, reason: "
}

image

The bundled packages seem to come back properly, so I'm not sure what's breaking. I'll continue to troubleshoot, but curious if maybe I've missed a step.

@kpollich
Member

I think this is worth looking into again with fresh eyes. Moving out of blocked.

kpollich added a commit that referenced this issue Aug 20, 2024
…that depend on EPR (#190722)

## Summary

Relates to #136617

For APIs that depend on Fleet connecting to Elastic Package Registry,
Fleet already retries the connections to EPR on the server side. This
results in a situation where, when EPR is unreachable, the request is
retried several times on the server side, and then the request is
retried again on the client-side by react-query. This results in very
long running API requests.

Since the server-side retries generally cover any kind of flakiness
here, disabling the retry logic on the front-end seems sensible. ~I've
also reduced the number of retries on the server side from 5 to 3 to
help fail faster here.~ I walked back the retry change after some test
failures, and I don't think it makes a big enough impact to justify
changing.
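
A rough sketch of why the two retry layers compound. This is not Fleet's code: `withRetries` and `callRegistry` are hypothetical helpers. The figure of 5 server-side retries comes from the description above, and 3 client-side retries is an assumed value for illustration.

```typescript
// Hypothetical helper: runs `fn` once, then retries up to `retries` more times on failure.
function withRetries<T>(fn: () => T, retries: number): T {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

let registryCalls = 0;

// Stand-in for a request to an unreachable package registry: it always fails.
function callRegistry(): never {
  registryCalls++;
  throw new Error('Error connecting to package registry');
}

// Counts registry calls made for one user-visible request under nested retries.
function countRegistryCalls(serverRetries: number, clientRetries: number): number {
  registryCalls = 0;
  try {
    // Outer layer: the client retrying the API request.
    // Inner layer: the server retrying its connection to the registry.
    withRetries(() => withRetries(callRegistry, serverRetries), clientRetries);
  } catch {
    // Expected: every attempt fails while the registry is unreachable.
  }
  return registryCalls;
}

console.log(countRegistryCalls(5, 3)); // both layers retry: 24
console.log(countRegistryCalls(5, 0)); // client-side retries disabled: 6
```

With both layers active, the attempt counts multiply: (5 + 1) × (3 + 1) = 24 registry calls, against 6 when only the server retries.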

## To test

Set `xpack.fleet.registryUrl: 127.0.0.1:8080` with nothing running

## Before

The requests spin for a very long time.


https://github.com/user-attachments/assets/e4fd77ee-b36c-4965-9f71-e5b3e195f75e

## After

The requests stop spinning after a few seconds as the retries won't
loop.


https://github.com/user-attachments/assets/82adc595-1bc4-4269-8501-2eb83525ad15

cc @shahzad31
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Aug 20, 2024
…that depend on EPR (elastic#190722)

(cherry picked from commit cf3149e)
kibanamachine referenced this issue Aug 20, 2024
…o APIs that depend on EPR (#190722) (#190816)

# Backport

This will backport the following commits from `main` to `8.15`:
- [[Fleet] Remove duplicative retries from client-side requests to APIs
that depend on EPR
(#190722)](#190722)

<!--- Backport version: 9.4.3 -->


Co-authored-by: Kyle Pollich <kyle.pollich@elastic.co>
@criamico criamico self-assigned this Dec 2, 2024
@criamico
Contributor

criamico commented Dec 2, 2024

@kpollich I took a look at this for ON week. I have a PR that uses the xpack.fleet.isAirGapped configuration key to allow installing bundled packages even when the registry is unreachable.

CAWilson94 pushed a commit to CAWilson94/kibana that referenced this issue Dec 12, 2024
…lastic#202435)

Closes elastic#136617
Closes elastic#167195

## Summary
[Spacetime] Improving Integrations experience on airgapped envs using
the existing `xpack.fleet.isAirGapped` configuration key:
- Loading integrations is now much faster and doesn't attempt to contact
the registry at all
- Installing uninstalled bundled packages should now be possible

NOTE: Setting `isAirGapped` skips the calls to the registry altogether
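
A minimal sketch of the behavior described above, assuming a lookup that consults only bundled packages when the flag is set. `resolvePackage`, `ResolveOptions`, and the bundled version `8.12.0` are hypothetical names and values for illustration, not the actual Fleet implementation.

```typescript
interface PackageInfo {
  name: string;
  version: string;
  source: 'bundled' | 'registry';
}

interface ResolveOptions {
  isAirGapped: boolean;
  // Hypothetical map of bundled package name to bundled version.
  bundledVersions: { [packageName: string]: string };
  // Hypothetical registry lookup; may throw when the registry is unreachable.
  fetchFromRegistry: (packageName: string) => PackageInfo;
}

function resolvePackage(packageName: string, options: ResolveOptions): PackageInfo | undefined {
  if (options.isAirGapped) {
    // Air-gapped: never contact the registry, only bundled packages are available.
    if (!Object.prototype.hasOwnProperty.call(options.bundledVersions, packageName)) {
      return undefined;
    }
    return {
      name: packageName,
      version: options.bundledVersions[packageName],
      source: 'bundled',
    };
  }
  return options.fetchFromRegistry(packageName);
}

let registryLookups = 0;

const airGappedOptions: ResolveOptions = {
  isAirGapped: true,
  bundledVersions: { apm: '8.12.0' },
  fetchFromRegistry: (packageName) => {
    registryLookups++;
    throw new Error(`Error connecting to package registry while fetching ${packageName}`);
  },
};

const resolvedApm = resolvePackage('apm', airGappedOptions);
console.log(resolvedApm, registryLookups);
```

Because the registry function is never invoked in air-gapped mode, no retries or timeouts can slow the lookup down.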

### Testing
- In `kibana.dev.yml`:
  - Make sure that APM and other bundled packages are not present in
    preconfiguration
  - Configure `xpack.fleet.registryUrl` to an unreachable host, e.g.
    `xpack.fleet.registryUrl: http://notworking`
  - Configure `xpack.fleet.isAirGapped: true`
- Copy the zip files from the EPR for some common bundled packages to
  `kibana/x-pack/plugins/fleet/target/bundled_packages`
- Navigate to Integrations and verify that the page loads faster than in
  the past (it used to take a long time because of retries)
- Navigate to the apm page or to `app/integrations/detail/apm-8.4.2/overview`
- Try to install it; it should succeed


### Checklist

- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this issue Jan 13, 2025
…lastic#202435)
