@mergify mergify bot commented Jun 24, 2025

What does this PR do?

This PR fixes a regression introduced by #6907, which updated the RPM/DEB preinstall script to stop the ElasticEndpoint service during agent upgrades to work around tamper protection restrictions. While that change stopped the service effectively, it restarted the endpoint before restarting the agent. With that ordering, the endpoint usually has to retry its connection to elastic-agent, with no guarantee of when the connection will succeed.

To address this, the PR:

  • Restarts the ElasticEndpoint service after the elastic-agent service has been restarted, to guarantee that elastic-endpoint can connect to elastic-agent.
  • Enhances integration tests to:
    • Use locally built artifacts when testing same-version upgrades.
    • Improve error messages and fixture preparation robustness.
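
The restart ordering described above can be sketched as a small shell function. This is an illustrative sketch only, not the actual contents of postinstall.sh.tmpl: the function name `restart_in_order` and the overridable `SYSTEMCTL` variable are hypothetical, and it assumes the agent's systemd unit is named `elastic-agent` and that the endpoint was stopped earlier by the preinstall script.

```shell
#!/bin/sh
# Hypothetical sketch of the ordering; the real template may differ.
# SYSTEMCTL can be overridden (for example with "echo systemctl") for a dry run.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

restart_in_order() {
    # 1. Restart the agent first so that it is ready to accept connections.
    $SYSTEMCTL restart elastic-agent || return 1
    # 2. Only then start the endpoint, so it can connect to the running agent.
    $SYSTEMCTL start ElasticEndpoint
}
```

Setting `SYSTEMCTL="echo systemctl"` before calling `restart_in_order` prints the two commands in sequence instead of running them, which is a cheap way to check the order without touching any services.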

Why is it important?

Improper ordering of service restarts during DEB/RPM upgrades with endpoint tamper protection enabled caused the endpoint to start independently of the agent, resulting in an "always-retrying" state and sporadically degraded operation. This fix ensures the services are brought up in the correct order to maintain endpoint health.

Checklist

  • I have read and understood the pull request guidelines of this project.
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

Disruptive User Impact

How to test this PR locally

mage integration:auth
STACK_PROVISIONER=stateful mage integration:single TestUpgradeAgentWithTamperProtectedEndpoint_RPM

Related issues


This is an automatic backport of pull request #8637 done by [Mergify](https://mergify.com).

…#8637)

* fix: use rpm from local build

(cherry picked from commit 249885f)

# Conflicts:
#	dev-tools/packaging/templates/linux/postinstall.sh.tmpl
#	testing/integration/endpoint_security_test.go
@mergify mergify bot added backport conflicts There is a conflict in the backported pull request labels Jun 24, 2025
@mergify mergify bot requested a review from a team as a code owner June 24, 2025 08:07
@mergify mergify bot added the conflicts There is a conflict in the backported pull request label Jun 24, 2025
@mergify mergify bot requested review from blakerouse and pkoutsovasilis and removed request for a team June 24, 2025 08:07
@mergify mergify bot added the backport label Jun 24, 2025
mergify bot commented Jun 24, 2025

Cherry-pick of 249885f has failed:

On branch mergify/bp/8.19/pr-8637
Your branch is up to date with 'origin/8.19'.

You are currently cherry-picking commit 249885f9f.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   dev-tools/packaging/templates/linux/postinstall.sh.tmpl
	both modified:   testing/integration/endpoint_security_test.go

no changes added to commit (use "git add" and/or "git commit -a")

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@github-actions github-actions bot added Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team skip-changelog labels Jun 24, 2025
@elasticmachine

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

Kaan Yalti and others added 3 commits June 24, 2025 11:21
…tion (#6907)

* Update pkg/testing/tools/tools.go

Co-authored-by: Paolo Chilà <paolo.chila@elastic.co>

* enhancement(6394): updated preinstall script, updated service to use uninstall token

* enhancement(6394): updated the preinstall script

* enhancement(6394): started adding integration tests

* enhancement(6394): updated fixture install, updated endpoint security tests

* enhancement(6394): cleaned up fixture_install, added function that exposes fixture's uninstall tokens, updated tests

* enhancement(6394): refactored test code so that I can use it with rpm

* enhancement(6394): added tests to assert that tamper protection works

* enhancement(6394): updated the endpoint testing tools, fixture install functions and the deb rpm upgrade tests

* enhancement(6394): added test logs, updated rpm installation to set agent socket path

* enhancement(6394): remove commented code

* enhancement(6394): remove print statements

* enhancement(6394): remove unnecessary comments, refactor unused function

* enhancement(6394): revert var name change

* enhancement(6394): added changelog

* enhancement(6394): update test logs, add non integrative config to deb installation

* enhancement(6394): updated the endpoint version comparison and assertion

* enhancement(6394): added log in tests

* enhancement(6394): resorted to using previous major instead of minor in upgrade test

* enhancement(6394): updated endpoint version function in the tests, updated function name in testing tools

* enhancement(6394): use previous minor, fix log

* enhancement(6394): added comment explaining motive behind simple install functions

* enhancement(6394): updated return in tools

* Update changelog/fragments/1740166208-allow-deb-rpm-upgrade-with-tamper-protected-endpoint.yaml

Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>

* enhancement(6394): fixed function call in tests

* enhancement(6394): added systemctl start in postinstall, refactored preinstall and added condition to make same version installations work

* enhancement(6394): updated the preinstall and postinstall scripts to troubleshoot

* enhancement(6394): updated preinstall and postinstall script templates

- Updated preinstall to stop endpoint if it is an available service regardless of the version of endpoint that's installed
- Updated postinstall to start endpoint if the old endpoint version and the new version match.

* enhancement(6394): removed error exit from postinstall

* enhancement(6394): updated postinstall and preinstall templates

- Preinstall now does not use a state file. Recovery from failure starts ElasticEndpoint if it is not running
- Preinstall does not stop endpoint if tamper protection is not enabled
- Postinstall does not print an error if service is still running

* enhancement(6394): removed debug logs

* enhancement(6394): removed unnecessary comment

* enhancement(6394): store uninstall token as local var, uninstall through the agent

* enhancement(6394): added setclient function

* enhancement(6394): added getInstallCommand and replaced SimpleInstall

* enhancement(6394): added test case for error recovery. removed unused fixture functions

* enhancement(6394): refactored tests, consolidated test scenarios into one function

* enhancement(6394): remove unnecessary test functions

* enhancement(6394): remove unused fixture function

* enhancement(6394): revert unwanted installDeb changes

* enhancement(6394): remove unwanted changes in testing tools

* enhancement(6394): remove unused function call

* enhancement(6394): replacing systemctl instead of adding new one to path

* enhancement(6394): update real systemctl path in mock systemctl script

* enhancement(6394): fix linting errors

* Update changelog/fragments/1740166208-allow-deb-rpm-upgrade-with-tamper-protected-endpoint.yaml

Co-authored-by: Paolo Chilà <paolo.chila@elastic.co>

* Update dev-tools/packaging/templates/linux/postinstall.sh.tmpl

Co-authored-by: Paolo Chilà <paolo.chila@elastic.co>

* Update pkg/testing/tools/tools.go

Co-authored-by: Paolo Chilà <paolo.chila@elastic.co>

* Update dev-tools/packaging/templates/linux/postinstall.sh.tmpl

Co-authored-by: Paolo Chilà <paolo.chila@elastic.co>

* Update dev-tools/packaging/templates/linux/postinstall.sh.tmpl

Co-authored-by: Paolo Chilà <paolo.chila@elastic.co>

* Update pkg/testing/tools/tools.go

Co-authored-by: Paolo Chilà <paolo.chila@elastic.co>

* enhancement(6394): updated print statement

* enhancement(6394): remove unnecessary command

* enhancement(6394): use addressFromPath and SetClient

* enhancement(6394): using service name, fixed indentation

* test(debug): add detailed logging to Fixture.SetClient and installDeb for agent client setup debugging

* Revert "test(debug): add detailed logging to Fixture.SetClient and installDeb for agent client setup debugging"

This reverts commit 390c561.

* enhancement(6394): renamed SetClient to SetDebRpmClient. Using hardcoded working dir as fixture working dir does not work for determining socket path

* enhancement(6394): consolidated same version upgrade and regular upgrade test functions

* enhancement(6394): simplify preinstall script and enhance upgrade tests for tamper protection
- Removed unnecessary endpoint handling logic from preinstall script.
- Improved checks for service installation and status before upgrade.
- Updated upgrade test functions to handle stopping the endpoint service before upgrades.

* enhancement(6394): remove mock systemctl script for tamper protection tests

* enhancement(6394): remove unused import

* enhancement(6394): fixed order of execution in preinstall

* enhancement(6394): added tests to make sure deb/rpm upgrades work when endpoint is not tamper protected

---------

Co-authored-by: Paolo Chilà <paolo.chila@elastic.co>
Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
(cherry picked from commit 8a6531f)

# Conflicts:
#	dev-tools/packaging/templates/linux/preinstall.sh.tmpl

# Conflicts:
#	dev-tools/packaging/templates/linux/postinstall.sh.tmpl
#	testing/integration/endpoint_security_test.go
@elasticmachine

💚 Build Succeeded

History

cc @pkoutsovasilis

@pkoutsovasilis pkoutsovasilis merged commit eeef6f1 into 8.19 Jun 24, 2025
19 checks passed
@pkoutsovasilis pkoutsovasilis deleted the mergify/bp/8.19/pr-8637 branch June 24, 2025 12:09
Labels

backport conflicts There is a conflict in the backported pull request skip-changelog Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team
