[Fleet][EPM] Don't roll back on saved objects conflict errors. #85131

skh · 2020-12-07T12:29:08Z

Summary

This refines the implementation of #84190 and implements #84651. See also #84656 for some discussion.

This changes the behavior of _installPackage() as follows:

- When a concurrent installation is detected, a ConcurrentInstallOperationError is thrown (instead of returning a list of installed assets which may or may not be complete).
- When a version conflict on a saved object write operation is thrown in any of the install*() methods called by _installPackage(), it is also wrapped in a ConcurrentInstallOperationError.
- All other errors are thrown as before.
- Higher up in the call chain, ConcurrentInstallOperationError does not trigger a rollback. This fixes a bug where a second installation/upgrade operation aborts because of a saved object version conflict and then rolls back the installation that the first operation just completed successfully, potentially resulting in follow-up errors and a broken installation.
- ConcurrentInstallOperationError causes the handler to return a 409 HTTP response with a message stating on which package the concurrent installation was detected, and that the operation was aborted.
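The error handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class body, the helper names (`wrapInstallError`, `shouldRollBack`, `toResponse`), the `statusCode` property used to recognize a version conflict, the message wording, and the 500 fallback are all assumptions made for the example.

```typescript
// Hypothetical stand-in for the error class named in this PR; not the Kibana source.
class ConcurrentInstallOperationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'ConcurrentInstallOperationError';
  }
}

// Assumption: a saved object version conflict surfaces as an error with statusCode 409.
interface ErrorWithStatus extends Error {
  statusCode?: number;
}

function isVersionConflict(err: Error): boolean {
  return (err as ErrorWithStatus).statusCode === 409;
}

// Checked by name so the test also holds when classes are compiled down to ES5.
function isConcurrentInstallError(err: Error): boolean {
  return err.name === 'ConcurrentInstallOperationError';
}

// Inside the install flow: wrap version conflicts, pass every other error through unchanged.
function wrapInstallError(err: Error, pkgName: string): Error {
  if (isVersionConflict(err)) {
    return new ConcurrentInstallOperationError(
      `Concurrent installation or upgrade of ${pkgName} detected, aborting.`
    );
  }
  return err;
}

// Higher up in the call chain: a concurrent-install error must not trigger a rollback.
function shouldRollBack(err: Error): boolean {
  return !isConcurrentInstallError(err);
}

// In the handler: concurrent-install errors map to 409; 500 is a placeholder for the rest.
function toResponse(err: Error): { statusCode: number; message: string } {
  return { statusCode: isConcurrentInstallError(err) ? 409 : 500, message: err.message };
}
```

The point of keeping the rollback decision separate from the wrapping is that only the caller knows whether a previous version exists to roll back to, while only the install flow knows which failures indicate a concurrent operation.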

This is still a rather optimistic way of handling this situation: when a concurrent installation is detected, the running installation is aborted and no attempt is made to clean up after it. This is possible because the install*() methods (installing Kibana assets, pipelines, templates etc.) are idempotent. It is still perfectly possible that two parallel installations both run successfully, installing everything twice, or that they only run into the saved object conflict at the very end, after almost everything was installed twice.
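To make the late-conflict scenario concrete, here is a small in-memory simulation. It is only a sketch under stated assumptions: `InMemoryStore`, `VersionConflictError`, `installAsset`, and `simulateRace` are hypothetical names invented for the example and do not correspond to the saved objects client or to the install*() methods.

```typescript
// Hypothetical in-memory stand-in for a versioned saved object; not the Kibana API.
interface VersionedDoc {
  version: number;
  installed: string[];
}

class VersionConflictError extends Error {
  statusCode = 409;
}

class InMemoryStore {
  private doc: VersionedDoc = { version: 1, installed: [] };

  read(): VersionedDoc {
    return { version: this.doc.version, installed: this.doc.installed.slice() };
  }

  // Rejects the write if the document was updated since the caller read it.
  write(installed: string[], expectedVersion: number): void {
    if (expectedVersion !== this.doc.version) {
      throw new VersionConflictError('version conflict');
    }
    this.doc = { version: this.doc.version + 1, installed: installed.slice() };
  }
}

// Idempotent asset install: writing the same asset twice leaves the same end state.
function installAsset(assets: Record<string, string>, id: string, body: string): void {
  assets[id] = body;
}

// Installs A and B interleave: both read version 1, both install the asset,
// A records its result first, and B's final write then hits the conflict.
function simulateRace(): { assets: Record<string, string>; bConflicted: boolean } {
  const store = new InMemoryStore();
  const assets: Record<string, string> = {};
  const a = store.read();
  const b = store.read();
  installAsset(assets, 'pipeline-1', '{}'); // done by A
  installAsset(assets, 'pipeline-1', '{}'); // repeated by B, same end state
  store.write(['pipeline-1'], a.version);
  let bConflicted = false;
  try {
    store.write(['pipeline-1'], b.version);
  } catch (err) {
    bConflicted = (err as { statusCode?: number }).statusCode === 409;
  }
  return { assets, bConflicted };
}
```

In this simulation B has already done all of its work by the time the conflict appears, and because the asset write is idempotent there is nothing to undo: aborting B leaves the state that A produced.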

This may affect other users of the install package code flow, namely endpoint security. (cc @jonathan-buttner)

How to test this

- Try to get the installed package into a broken state: trigger a race condition by installing the same package several times at once, and check whether the race condition is handled correctly. The script at https://gist.github.com/skh/cc695952031c9e349874b898c7066e42 may be helpful for this; I had to set WAIT_TIME_REINSTALL in that script to 0 and run it a few times.
- Try to break it in any other way.
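For anyone who prefers a programmatic check over the linked script, the race can be provoked along these lines. This is a hedged sketch, not the gist's contents: `installOnce` is a hypothetical callback that performs one install request with whatever HTTP client you use and resolves with the response status code, and `tallyStatuses` and `raceInstalls` are names made up for the example.

```typescript
// Hypothetical callback: performs one install request and resolves with the HTTP status.
type InstallOnce = () => Promise<number>;

// Counts how often each status code occurred.
function tallyStatuses(statuses: number[]): Record<number, number> {
  const tally: Record<number, number> = {};
  for (const status of statuses) {
    tally[status] = (tally[status] || 0) + 1;
  }
  return tally;
}

// Starts `attempts` installs without waiting between them, then summarizes the results.
async function raceInstalls(
  installOnce: InstallOnce,
  attempts: number
): Promise<Record<number, number>> {
  const pending: Array<Promise<number>> = [];
  for (let i = 0; i < attempts; i++) {
    pending.push(installOnce());
  }
  return tallyStatuses(await Promise.all(pending));
}
```

With this change in place, one would expect the tally to contain successful responses and possibly some 409 responses, and the package to still be correctly installed afterwards. Since the race is timing dependent, a single run without any 409 does not prove much; repeat it a few times.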

elasticmachine · 2020-12-07T13:20:47Z

Pinging @elastic/ingest-management (Feature:EPM)

skh · 2020-12-07T13:28:29Z

@elasticmachine merge upstream

kibanamachine · 2021-01-18T11:33:59Z

⏳ Build in-progress, with failures

continuous-integration/kibana-ci/pull-request
Commit: 69d125c
This comment will update when the build is complete

Failed CI Steps

History

💚 Build #94004 succeeded 69d125c
💚 Build #92330 succeeded 846781cf735fdf641803a536e8da96dbdbc41224
💚 Build #92321 succeeded a66d2c3527df6bffe921bdfd1bae1d65a8560c32

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

skh self-assigned this Dec 7, 2020

skh added the Feature:EPM, release_note:skip, Team:Fleet, v7.11.0, and v8.0.0 labels Dec 7, 2020

skh marked this pull request as ready for review December 7, 2020 13:20

skh requested a review from a team December 7, 2020 13:20

skh changed the title ~~Don't rollback on saved objects conflict errors.~~ [Fleet][EPM] Don't rollback on saved objects conflict errors. Dec 7, 2020

skh changed the title ~~[Fleet][EPM] Don't rollback on saved objects conflict errors.~~ [Fleet][EPM] Don't roll back on saved objects conflict errors. Dec 7, 2020

skh commented Dec 7, 2020

skh requested review from neptunian and jonathan-buttner December 7, 2020 13:39

jonathan-buttner approved these changes Dec 7, 2020

neptunian requested changes Dec 7, 2020

neptunian self-requested a review December 7, 2020 18:58

neptunian approved these changes Dec 8, 2020

skh mentioned this pull request Dec 14, 2020

[Fleet] Handle saved-object conflict in /api/fleet/agents/{agent-id}/reassign endpoint #85775

Closed

Don't rollback on saved objects conflict errors.

69d125c

skh force-pushed the 84651-check-for-saved-object-version-conflict branch from 846781c to 69d125c Compare December 14, 2020 14:00

skh merged commit 1b3a1bb into elastic:master Dec 14, 2020

skh mentioned this pull request Dec 14, 2020

[7.x] Don't rollback on saved objects conflict errors. (#85131) #85806

Merged

skh added a commit that referenced this pull request Dec 14, 2020

Don't rollback on saved objects conflict errors. (#85131) (#85806)

e28dd2a

skh deleted the 84651-check-for-saved-object-version-conflict branch December 14, 2020 21:29

skh mentioned this pull request Jan 18, 2021

[BUG] Concurrent installs of Endpoint Package can cause errors in Fleet setup #88249

Open

kpollich mentioned this pull request Dec 11, 2023

[Fleet] Saved objects conflicts shouldn't throw Concurrent Installation error #171986

Closed
