Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(MODULES-7698) Fix OSX agent upgrades #320

Merged
merged 3 commits into from
Aug 24, 2018

Conversation

ekinanp
Copy link
Contributor

@ekinanp ekinanp commented Aug 23, 2018

No description provided.

@ekinanp
Copy link
Contributor Author

ekinanp commented Aug 23, 2018

I've tested this out on both Solaris 10 and OSX 10.12. Upgrades still work properly on Solaris 10, while on OSX 10.12, the race condition is no longer there and upgrade paths of the form V1 => V2 => V3 work. My test set-up was a PE master with three PE versions installed:

  • 2018.1.3
  • 2018.1.4
  • PE 2019.0 dev. build

Upgrades were tested by installing an older agent (e.g. 2018.1.3) and upgrading to a newer one (e.g. 2018.1.4). The V1 => V2 => V3 path was done by upgrading 2018.1.3 => 2018.1.4 => PE 2019.0 dev. build. Note that I used my version of the puppet-agent module and placed it in the /etc/puppetlabs/code/environments/production/modules directory. Nodes were pointed to this manifest in the corresponding site.pp file.

$install_script = "osx_install.sh.erb"

contain puppet_agent::install::remove_packages

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dunno if it'd be useful to refactor this and the Sol 10. stuff into a defined type? Something like perform_agent_upgrade { <platform> }? I chose not to do it here since the code seems simple enough, and b/c our Sol 10 agents use ctrun to execute the script (here: https://github.com/puppetlabs/puppetlabs-puppet_agent/blob/master/manifests/install.pp#L99) -- I don't want to introduce a weird param. like $script_wrapper when this code is only used twice.. I don't know why ctrun is used, as from what I understand that code basically runs the script in the background. We don't use any of ctrun's event reporting features. Maybe that could be refactored to just execute the script in the background? Or we could use a Mac-OS' equivalent of ctrun? I couldn't find any examples of the latter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's worth factoring out yet.

As for ctrun, I have no idea why we use that instead of just backgrounding the script.

@ekinanp
Copy link
Contributor Author

ekinanp commented Aug 24, 2018

CI passed. This should be ready for merge now.

@puppetcla
Copy link

CLA signed by all contributors.

@@ -55,6 +55,7 @@
$_unzipped_package_name = regsubst($package_file_name, '\.gz$', '')
$adminfile = '/opt/puppetlabs/packages/solaris-noask'
$sourcefile = "/opt/puppetlabs/packages/${_unzipped_package_name}"
$install_script = 'solaris_install.sh.erb'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you clarify why solaris stuff is in an OSX PR? Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The individual commits should clarify that one -- see the messages in the first two commits. Main reason is b/c some refactoring needed to be done to make the main install script generic for non-upgradeable platforms like Solaris and OSX.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@branan
Copy link
Contributor

branan commented Aug 24, 2018

Would be nice to have a more detailed description in the first commit about why we need to use ERB instead of EPP

We do this to prepare for the work in MODULES-7698. Specifically,
MODULES-7698 requires us to upgrade our OSX agents in the same way we
upgrade our Solaris 10 agents, which is by running the upgrade in the
background after the Puppet run is finished. The OSX upgrade script will
have a similar structure to the Solaris 10 upgrade script, with the only
difference being how the agent is installed. Thus, we would like to
generalize some of this structure in a separate commit. Our generalized
script would have a spot for us to insert our platform-specific agent
installation code. "Inserting" our platform-specific agent installation
code really means rendering the corresponding script template inside our
generic installation script. There's not an easy (or clean) way to
support rendering another EPP script inside an EPP script. However, it
is easy to do this with ERB templates. Thus, this commit converts our
solaris_install.sh.epp script to an ERB script so that we are able to
generalize the common structure between our OSX and Solaris upgrade
scripts in future commits.
This script is a generic agent installation script used as a template
for upgrading the agent on non-upgradeable platforms like Solaris 10 and OSX.
We refactor our Solaris 10 agent upgrades using this template here. We
will refactor our OSX agent upgrades using this template in a later
commit as part of MODULES-7698.
@ekinanp
Copy link
Contributor Author

ekinanp commented Aug 24, 2018

@branan Updated the PR with a more detailed description in the first commit.

hexdump -n 2 -e '/2 "%u"' /dev/urandom
}

mountpoint="$(mktemp -d -t $(random_hexdump))"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to rely on mktemp to generate a securely random filename here - $(mktemp -d) is probably just fine on its own?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ekinanp
Copy link
Contributor Author

ekinanp commented Aug 24, 2018

Updated the PR to remove the use of the random_hexdump function, instead using only mktemp -d in the OSX install code.

Previously, OSX agent upgrades tried installing the new agent via. a
Puppet package resource using the pkgdmg provider. Unfortunately, this
does not work for several reasons.

* pkgdmg is unversioned. This means that whenever it installs a package,
it outputs a /var/db/.puppet_pkgdmg_installed_<package_name> file
indicating that the package is installed. For our case, we'd have a
/var/db/.puppet_pkgdmg_installed_puppet_agent file. This means that
upgrades of the form "V1 => V2 => V3" (e.g. 2018.1.3 => 2018.1.4 =>
2018.5) fail at the "V2 => V3" path because that file would still exist
from the "V1 => V2" upgrade. Thus, pkgdmg would be tricked into thinking
that the V3 agent was installed since it queries that file to see if the
package is already installed on the system.

* The installer stops the Puppet service in the middle of the
installation which interrupts the current Puppet run if that run was
triggered by the Puppet service. The installer proceeds to restart
the service in the post-installation step. This can trigger a Puppet run
while the installer's still running, which leads to weird race conditions
like an infinite loop or Puppet installing the upgraded agent several times
(typically twice) in successful cases.

This commit takes care of these issues by following the same pattern we
use for our Solaris 10 agent upgrades. Specifically, our Puppet run
performs the upgrade in the background. The background process waits for
the current Puppet run to exit before initiating the upgrade.
@ekinanp
Copy link
Contributor Author

ekinanp commented Aug 24, 2018

Squashed the commit. This should be ready for merge now.

@speedofdark speedofdark merged commit 47269f1 into puppetlabs:1.x Aug 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants