Update Vendor tools on existing installations (e.g. VMWare open-vm.tools) #21

Closed
t-lo opened this issue Feb 19, 2020 · 28 comments

@t-lo
Member

t-lo commented Feb 19, 2020

The Flatcar Container Linux VMWare image ships with VMWare guest tools. However, when updating Flatcar Container Linux, the guest tools are not updated even if a newer version is shipped with the Flatcar update.

Note that VMWare maintains a knowledge base article on this issue that includes a workaround: https://kb.vmware.com/s/article/2142569

@t-lo t-lo added the kind/bug and channel/stable labels Feb 19, 2020
@pothos
Member

pothos commented Mar 11, 2020

I tried to reproduce it but after two updates the /usr/share/oem files were still there and the vmtoolsd.service running. Which files disappeared in which scenario?

@t-lo t-lo assigned t-lo and pothos Mar 11, 2020
@t-lo t-lo changed the title from "VMWare guest OS tools disappear when FCL updates itself" to "VMWare guest OS tools not updated on FCL update" Mar 11, 2020
@t-lo
Member Author

t-lo commented Mar 11, 2020

I've updated the issue title and description as I received more information from the original reporter. Please pardon the confusion:

> My recollection of the open-vm-tools problem is the following… If a customer initially installs a certain version of CoreOS (that includes open-vm-tools version A) and performs one or more upgrades to a later version of CoreOS (that includes open-vm-tools version B), open-vm-tools is never updated and remains at version A. It is desirable for open-vm-tools to get updated through the OS vendor's maintenance process.

@pothos
Member

pothos commented Mar 11, 2020

OEM partitions are not updated as far as I know.

@t-lo
Member Author

t-lo commented Mar 11, 2020

We might want to turn this into a feature request then.

@pothos
Member

pothos commented Mar 12, 2020

I would rather look into moving OEM partition contents to a container image run by a service we ship in /usr so that we can use the current update mechanism to bump container image versions.

@t-lo t-lo added the kind/feature, channel/alpha, channel/beta, and channel/edge labels and removed the kind/bug label Mar 12, 2020
@t-lo t-lo changed the title from "VMWare guest OS tools not updated on FCL update" to "Update Vendor tools on existing installations (e.g. VMWare open-vm.tools)" May 14, 2020
@mritalian

FYI, with auto-updating images (the default) and the recent bump from kernel 4 to 5, this breaks vmtoolsd on 100% of systems updating from kernel 4 to 5.

As an immediate response, it would be cool if you could provide an easy way (easier than VMware's silly solution) to update vmtoolsd, or you will get bombarded with issues on GitHub.

And moving forward, it seems a bad idea to lock the bundled oem tools in time to the point in time when the image was first deployed, while the OS continues to move forward. It 100% guarantees this kind of thing will happen over and over again.

Alas, I can't help code this feature, but I'm available to guinea-pig anything you come up with.

@ahrkrak
Contributor

ahrkrak commented Sep 22, 2020

Yes, indeed it's a bad idea - that's why we have this issue open! We're working on it, but it's not a simple one-line fix and we want to do it right.

@mritalian

Here is the OEM partition from a clean 2605.5.0 install, in case someone is totally hosed and cannot follow VMware's suggestion.

oem-2605.5.0.tgz.zip

Caveat: security-wise, it is an abysmally bad idea to download binaries that run as root from arbitrary sources (i.e. me).

@goochjj

goochjj commented Sep 24, 2020

I didn't have this problem because internally I generate a (private) docker container that contains the new tools and, on execution, rsyncs them into /usr/share/oem. I've been running 11.1.0 for quite some time.

A similar technique could be used with an image on Docker hub... or by adding a separate image to the update servers. (i.e. add oem-2605.5.0.tar.gz files with appropriate .sig files to the channel release sites and provide an updater tool, integrated with update-engine or otherwise)

@goochjj

goochjj commented Sep 24, 2020

Also interesting to note the version I compiled doesn't depend on libssl at all.. :-D

@neilmayhew

neilmayhew commented Nov 9, 2020

The workaround documented by VMWare involves creating a new VM. This isn't a viable option for me, so I wrote a script that can be run on an existing VM to upgrade it. The script downloads the current stable VMWare Flatcar image and extracts the OEM partition from it. It then copies the files across to the running OEM partition and restarts the daemons. If you specify a tar filename on the command line it will reuse an existing tarfile if it exists and leave the tarfile in place afterwards so you can reuse it on other machines.

[Edit: use the updated version as discussed below]

#!/usr/bin/env bash

# Update a Flatcar installation on VMWare to use the latest OEM content

set -ex
shopt -s extglob

OEMCONTENT=oem-vmware.tgz
KEEPCONTENT=

if [ -n "$1" ]
then
  OEMCONTENT=$1
  KEEPCONTENT=yes
fi

# Cache sudo credentials
sudo true

if [ ! -f "$OEMCONTENT" ]
then
  # Fetch the release-signing public key
  KEYID=782B3BC9F10CF638A5DCF5105B2910CBFCBEAB91
  # pool.sks-keyservers.net has been shut down; keyserver.ubuntu.com works (see later in this thread)
  KEYSERVER=keyserver.ubuntu.com
  gpg --keyserver $KEYSERVER --recv-key $KEYID

  # Download the current stable VMWare Flatcar release
  IMGNAME=flatcar_production_vmware_raw_image.bin
  wget -N https://stable.release.flatcar-linux.net/amd64-usr/current/${IMGNAME}.bz2{,.sig}
  gpg --verify ${IMGNAME}.bz2{.sig,}
  bunzip2 -k ${IMGNAME}.bz2

  # Mount the OEM image partition via loopback
  MNT=$(mktemp -d) && trap 'rmdir "$MNT"' 0
  LOOPDEV=$(sudo losetup -f --show -P ${IMGNAME})
  sudo mount -r "${LOOPDEV}p6" "$MNT"

  # Save the content
  tar -cvzf "$OEMCONTENT" --exclude=lost+found -C "$MNT" .

  # Unmount the OEM image partition
  sudo umount "$MNT"
  sudo losetup -d "${LOOPDEV}"

  # Remove the downloaded image files
  rm -f ${IMGNAME}{,.bz2{.sig,}}
fi

# Stop the daemons
(cd /usr/share/oem/units && sudo systemctl stop -- *)

# Remove the existing content
sudo rm -rf /usr/share/oem/!(lost+found)

# Install the new content
sudo tar -xf "$OEMCONTENT" -C /usr/share/oem
[ -n "$KEEPCONTENT" ] || rm -f "$OEMCONTENT"

# Restart the daemons
(cd /usr/share/oem/units && sudo systemctl start -- *)

set +x

# Inform the user
echo "New OEM content was installed and services were restarted"

@pothos
Member

pothos commented Nov 10, 2020

Thanks for sharing! You also have to add a step to copy the new vmtoolsd.service file to /etc/systemd/system/vmtoolsd.service and issue sudo systemctl daemon-reload before restarting.
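
For anyone patching the script by hand, the extra step could look roughly like the sketch below. It assumes the new unit file ends up in /usr/share/oem/units after the tarball is unpacked (that is where the script above looks for units); the function name and the overridable OEM_UNITS, SYSTEMD_DIR, INSTALL and SYSTEMCTL variables are made up for illustration and are not part of the script or the gist.

```shell
#!/usr/bin/env bash
# Sketch only: refresh the vmtoolsd unit override after new OEM content is unpacked.
# Assumption: the new unit lives in $OEM_UNITS (default /usr/share/oem/units).
# INSTALL and SYSTEMCTL are hypothetical knobs so the step can be dry-run.

OEM_UNITS=${OEM_UNITS:-/usr/share/oem/units}
SYSTEMD_DIR=${SYSTEMD_DIR:-/etc/systemd/system}
INSTALL=${INSTALL:-sudo install}
SYSTEMCTL=${SYSTEMCTL:-sudo systemctl}

refresh_vmtoolsd_unit() {
  local src="$OEM_UNITS/vmtoolsd.service"
  local dst="$SYSTEMD_DIR/vmtoolsd.service"

  if [ ! -f "$src" ]; then
    echo "no unit file at $src" >&2
    return 1
  fi

  # Copy the new unit over the old override (word splitting of $INSTALL is intended)
  $INSTALL -m 0644 "$src" "$dst" || return 1

  # Make systemd re-read unit files before the daemons are restarted
  $SYSTEMCTL daemon-reload
}
```

Calling `refresh_vmtoolsd_unit` between the "Install the new content" and "Restart the daemons" steps would cover what is described above; setting `SYSTEMCTL=echo` first prints the systemctl invocation instead of running it.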

@neilmayhew

> You also have to add a step to copy the new vmtoolsd.service file to /etc/systemd/system/vmtoolsd.service and issue sudo systemctl daemon-reload before restarting.

Good point. That did complicate things a little, but I've updated and tested the script, and put it in a gist.

I realise that most people are running ephemeral instances and will just recreate their VMs, but hopefully this still helps some.

@neilmayhew

The recent new alpha release (2697.0.0) introduces another new version of open-vm-tools so the script may be needed again.

@t-lo
Member Author

t-lo commented Dec 4, 2020

I did a spike to better understand the major challenges left; WIP PRs are

Major blockers left:

  • integrate the above script into the vendor tools update mechanism to also update systemd units etc.
  • release process to produce signed vendor update tarballs
  • vendor update distribution mechanism / URLs returned by update server

@nchowinf

nchowinf commented Mar 2, 2021

> You also have to add a step to copy the new vmtoolsd.service file to /etc/systemd/system/vmtoolsd.service and issue sudo systemctl daemon-reload before restarting.
>
> Good point. That did complicate things a little, but I've updated and tested the script, and put it in a gist.
>
> I realise that most people are running ephemeral instances and will just recreate their VMs, but hopefully this still helps some.

@neilmayhew I just wanted to give you a big shout out for this. Thank you very much for saving our asses.

@neilmayhew

I used the script again myself today, to bring the OEM directories in line with the new Flatcar stable release that just arrived (2765.2.0). The system wasn't degraded like it was last time, but I definitely want to keep the open-vm-tools up to date. (It's moved from 11.1.5 to 11.2.5.)
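
To see whether a run of the script is worthwhile, one could compare the installed tools version with the one a release ships. The sketch below is only an illustration: the path in VMTOOLSD is a guess at where the OEM partition keeps the daemon binary, and the helper names are invented here. It relies on `vmtoolsd -v` printing a dotted version number and on GNU `sort -V` for the comparison.

```shell
#!/usr/bin/env bash
# Sketch only: read and compare open-vm-tools versions.
# Assumption: the daemon binary is at $VMTOOLSD (the default path is a guess).

VMTOOLSD=${VMTOOLSD:-/usr/share/oem/bin/vmtoolsd}

# Extract the first dotted version number (major.minor.patch) from text on stdin
extract_version() {
  grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Succeed if version $1 is strictly older than version $2
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Print the version reported by the installed daemon
installed_tools_version() {
  "$VMTOOLSD" -v 2>/dev/null | extract_version
}
```

For example, `version_lt "$(installed_tools_version)" 11.2.5 && echo "update needed"` would flag a machine that is still on 11.1.5.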

@sayanchowdhury sayanchowdhury added the platform/VMWare label and removed the channel/alpha, channel/beta, channel/edge, and channel/stable labels Jul 2, 2021
@bignay2000

bignay2000 commented Jul 16, 2023

> You also have to add a step to copy the new vmtoolsd.service file to /etc/systemd/system/vmtoolsd.service and issue sudo systemctl daemon-reload before restarting.
>
> Good point. That did complicate things a little, but I've updated and tested the script, and put it in a gist.
>
> I realise that most people are running ephemeral instances and will just recreate their VMs, but hopefully this still helps some.

@neilmayhew

Getting an error when I try to update.

./update-oem-vmware.sh
+ shopt -s extglob nullglob
+ OEMCONTENT=oem-vmware.tgz
+ KEEPCONTENT=
+ '[' -n '' ']'
+ sudo true
+ '[' '!' -f oem-vmware.tgz ']'
+ KEYID=782B3BC9F10CF638A5DCF5105B2910CBFCBEAB91
+ KEYSERVER=pool.sks-keyservers.net
+ gpg --keyserver pool.sks-keyservers.net --recv-key 782B3BC9F10CF638A5DCF5105B2910CBFCBEAB91
gpg: keyserver receive failed: Server indicated a failure

My ESXi 8 VM:

cat /etc/os-release
NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=3510.2.4
VERSION_ID=3510.2.4
BUILD_ID=2023-07-04-1508
SYSEXT_LEVEL=1.0
PRETTY_NAME="Flatcar Container Linux by Kinvolk 3510.2.4 (Oklo)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar.org/"
BUG_REPORT_URL="https://issues.flatcar.org"
FLATCAR_BOARD="amd64-usr"
CPE_NAME="cpe:2.3:o:flatcar-linux:flatcar_linux:3510.2.4:*:*:*:*:*:*:*"

@neilmayhew

Unfortunately, things keep changing with the keyservers. I've had to change my personal GPG config several times since I wrote this gist.

It will work with keyserver.ubuntu.com:

$ gpg --keyserver keyserver.ubuntu.com --recv-key 782B3BC9F10CF638A5DCF5105B2910CBFCBEAB91
gpg: key E25D9AED0593B34A: "Flatcar Buildbot (Official Builds) <buildbot@flatcar-linux.org>" not changed
gpg: Total number processed: 1
gpg:              unchanged: 1

I'll update the gist.
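
Since keyservers come and go, one option is to try several in turn rather than hard-coding one. This is just a sketch of that idea, not what the gist does: the function name and the GPG override variable are invented for illustration, and keyserver.ubuntu.com is the only server this thread confirms to work.

```shell
#!/usr/bin/env bash
# Sketch only: fetch a key from the first keyserver that answers.
# GPG is a hypothetical override so the loop can be exercised without network access.

GPG=${GPG:-gpg}

# Usage: recv_key_from_any KEYID SERVER [SERVER...]
recv_key_from_any() {
  local keyid=$1
  shift
  local server
  for server in "$@"; do
    # Stop at the first keyserver that returns the key
    if $GPG --keyserver "$server" --recv-keys "$keyid"; then
      echo "fetched $keyid from $server"
      return 0
    fi
    echo "keyserver $server failed, trying the next one" >&2
  done
  return 1
}
```

In the script this would replace the single `gpg --keyserver ... --recv-key` line, e.g. `recv_key_from_any "$KEYID" keyserver.ubuntu.com`, with any further servers appended at the caller's discretion.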

@neilmayhew

It looks like the image is being signed with a different key from before (E9426D8B67E35DF476BD048185F7C8868837E271 now vs 782B3BC9F10CF638A5DCF5105B2910CBFCBEAB91 previously). Unfortunately, this is different from the official one published on the Flatcar web site, and it also can't be fetched from keyserver.ubuntu.com. I'm not sure what to do at this point.

@neilmayhew

For now, you could just comment out the verification on line 33

@neilmayhew

I downloaded and imported the key using the instructions on the web page, and it has two new subkeys, one of which is the one used to sign the image. Verifying the image now works. However, I don't see the point of using curl to fetch a key from a web site since that's just as likely to be compromised as the image itself. (It's OK if you're going to download the key once and then verify images a lot of times in the future, but that's not the situation here.) Flatcar needs to update the copy of the key on the keyservers, so that it includes the subkeys.

@bignay2000

> For now, you could just comment out the verification on line 33

This worked: it successfully updated open-vm-tools from 11.3.5 to 12.1.5 (the ESXi Host Client dashboard shows the new version).

Thank you

@jepio
Member

jepio commented Jul 17, 2023

> I downloaded and imported the key using the instructions on the web page, and it has two new subkeys, one of which is the one used to sign the image.

Older images are signed with the older key, newer ones with the new one. We now allow for some overlap and issue the new key before the old one fully expires, so that there is less chance of us ending up with images that cannot verify updates.

> However, I don't see the point of using curl to fetch a key from a web site since that's just as likely to be compromised as the image itself. (It's OK if you're going to download the key once and then verify images a lot of times in the future, but that's not the situation here.)

That's exactly what the idea was: that if you need to verify it, you prefetch it (out of band) and use that to verify. We do have the key inside the image in the /usr/bin/flatcar-install script that checks image signatures before installation.
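
As a rough sketch of that prefetch-then-verify idea: once the key has been imported out of band, a script can pin the primary key's fingerprint and accept signatures from any of its subkeys, old or new. The function names and the GPG override below are invented for illustration; the expected fingerprint is whatever you recorded when you prefetched the key. The parsing relies on GnuPG 2.x putting the primary key fingerprint in the last field of the `VALIDSIG` status line.

```shell
#!/usr/bin/env bash
# Sketch only: verify a detached signature against a pinned primary-key fingerprint.
# Assumption: the signing key was imported into the keyring beforehand (out of band).
# GPG is a hypothetical override used for dry runs.

GPG=${GPG:-gpg}

# Read gpg --status-fd output on stdin and print the primary-key fingerprint.
# The last VALIDSIG field is the primary key even when a subkey made the signature.
primary_fpr_from_status() {
  awk '$1 == "[GNUPG:]" && $2 == "VALIDSIG" { print $NF; exit }'
}

# Usage: verify_pinned FILE SIGNATURE EXPECTED_PRIMARY_FINGERPRINT
verify_pinned() {
  local file=$1 sig=$2 expected=$3 actual
  actual=$($GPG --status-fd 1 --verify "$sig" "$file" 2>/dev/null | primary_fpr_from_status)
  # No VALIDSIG line means the signature did not verify at all
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

Because the check is on the primary key rather than on a specific signing subkey, it should keep working across the overlap period described above, as long as the keyring copy includes the newer subkeys.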

> Flatcar needs to update the copy of the key on the keyservers, so that it includes the subkeys.

From what I've heard, keyservers are not a reliable way to distribute GPG keys in this day and age. That being said, I just pushed the key to the Ubuntu keyserver (anyone is able to do that).

@neilmayhew

Thanks for pushing the key. I guess I could have done that myself, using the key from the web page!

I'll respond to your comments about the security aspects in #1114.

@pothos pothos added this to the OEM updates for all images milestone Jul 20, 2023
@pothos
Member

pothos commented Sep 11, 2023

The update mechanism will be available once the VMWare setup is reworked. That rework is tracked in #1144

We can leave this issue here open for visibility of the workaround script until the rework is done.

@pothos
Member

pothos commented Sep 25, 2023

The rework is done and the next Alpha will have A/B updated vmware tools: flatcar/scripts#1146

The migration will only happen after both A/B partitions are on a version that requires the OEM systemd-sysext image.

@pothos pothos closed this as completed Sep 25, 2023
@neilmayhew

That's great news! I look forward to seeing the results.


10 participants