-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(MODULES-7698) Fix OSX agent upgrades #320
Conversation
I've tested this out on both Solaris 10 and OSX 10.12. Upgrades still work properly on Solaris 10, while on OSX 10.12, the race condition is no longer there and upgrade paths of the form
Upgrades were tested by installing an older agent (e.g. 2018.1.3) and upgrading to a newer one (e.g. 2018.1.4). The |
manifests/install.pp
Outdated
$install_script = "osx_install.sh.erb" | ||
|
||
contain puppet_agent::install::remove_packages | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dunno if it'd be useful to refactor this and the Sol 10. stuff into a defined type? Something like perform_agent_upgrade { <platform> }
? I chose not to do it here since the code seems simple enough, and b/c our Sol 10 agents use ctrun to execute the script (here: https://github.com/puppetlabs/puppetlabs-puppet_agent/blob/master/manifests/install.pp#L99) -- I don't want to introduce a weird param. like $script_wrapper when this code is only used twice.. I don't know why ctrun is used, as from what I understand that code basically runs the script in the background. We don't use any of ctrun's event reporting features. Maybe that could be refactored to just execute the script in the background? Or we could use a Mac-OS' equivalent of ctrun? I couldn't find any examples of the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's worth factoring out yet.
As for ctrun, I have no idea why we use that instead of just backgrounding the script.
CI passed. This should be ready for merge now. |
CLA signed by all contributors. |
@@ -55,6 +55,7 @@ | |||
$_unzipped_package_name = regsubst($package_file_name, '\.gz$', '') | |||
$adminfile = '/opt/puppetlabs/packages/solaris-noask' | |||
$sourcefile = "/opt/puppetlabs/packages/${_unzipped_package_name}" | |||
$install_script = 'solaris_install.sh.erb' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you clarify why solaris stuff is in an OSX PR? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The individual commits should clarify that one -- see the messages in the first two commits. Main reason is b/c some refactoring needed to be done to make the main install script generic for non-upgradeable platforms like Solaris and OSX.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Would be nice to have a more detailed description in the first commit about why we need to use ERB instead of EPP |
We do this to prepare for the work in MODULES-7698. Specifically, MODULES-7698 requires us to upgrade our OSX agents in the same way we upgrade our Solaris 10 agents, which is by running the upgrade in the background after the Puppet run is finished. The OSX upgrade script will have a similar structure to the Solaris 10 upgrade script, with the only difference being how the agent is installed. Thus, we would like to generalize some of this structure in a separate commit. Our generalized script would have a spot for us to insert our platform-specific agent installation code. "Inserting" our platform-specific agent installation code really means rendering the corresponding script template inside our generic installation script. There's not an easy (or clean) way to support rendering another EPP script inside an EPP script. However, it is easy to do this with ERB templates. Thus, this commit converts our solaris_install.sh.epp script to an ERB script so that we are able to generalize the common structure between our OSX and Solaris upgrade scripts in future commits.
This script is a generic agent installation script used as a template for upgrading the agent on non-upgradeable platforms like Solaris 10 and OSX. We refactor our Solaris 10 agent upgrades using this template here. We will refactor our OSX agent upgrades using this template in a later commit as part of MODULES-7698.
@branan Updated the PR with a more detailed description in the first commit. |
templates/osx_install.sh.erb
Outdated
hexdump -n 2 -e '/2 "%u"' /dev/urandom | ||
} | ||
|
||
mountpoint="$(mktemp -d -t $(random_hexdump))" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to rely on mktemp to generate a securely random filename here - $(mktemp -d)
is probably just fine on its own?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. For context, I ported this code over from https://github.com/puppetlabs/puppetlabs-puppet_agent/blob/master/tasks/install_shell.sh#L427
Updated the PR to remove the use of the |
Previously, OSX agent upgrades tried installing the new agent via. a Puppet package resource using the pkgdmg provider. Unfortunately, this does not work for several reasons. * pkgdmg is unversioned. This means that whenever it installs a package, it outputs a /var/db/.puppet_pkgdmg_installed_<package_name> file indicating that the package is installed. For our case, we'd have a /var/db/.puppet_pkgdmg_installed_puppet_agent file. This means that upgrades of the form "V1 => V2 => V3" (e.g. 2018.1.3 => 2018.1.4 => 2018.5) fail at the "V2 => V3" path because that file would still exist from the "V1 => V2" upgrade. Thus, pkgdmg would be tricked into thinking that the V3 agent was installed since it queries that file to see if the package is already installed on the system. * The installer stops the Puppet service in the middle of the installation which interrupts the current Puppet run if that run was triggered by the Puppet service. The installer proceeds to restart the service in the post-installation step. This can trigger a Puppet run while the installer's still running, which leads to weird race conditions like an infinite loop or Puppet installing the upgraded agent several times (typically twice) in successful cases. This commit takes care of these issues by following the same pattern we use for our Solaris 10 agent upgrades. Specifically, our Puppet run performs the upgrade in the background. The background process waits for the current Puppet run to exit before initiating the upgrade.
Squashed the commit. This should be ready for merge now. |
No description provided.