build: Add /.coreos-aleph-version.json to target #768

Merged
merged 1 commit into from
Sep 24, 2019

Conversation

cgwalters
Member

It will be very useful in the future to be able to more rigorously
know the build that a given machine started from. For example,
small tweaks like chattr +i /sysroot are things that won't
happen for in-place updates.

If we decide to introduce a mechanism (e.g. systemd
unit) that performs those changes even for old in-place installs,
it could be useful to know exactly what the starting state was.

I chose /boot since it's a relatively fixed location which
people generally won't wipe and replace without also erasing
everything else.

@cgwalters
Member Author

Though hmm, probably /sysroot is better for this since then it'll be covered by the immutable bit...

@cgwalters
Member Author

OK reworked this to go in /, include the ostree commit just for extra reference, and for that it's now JSON so it's extensible.

@lucab
Contributor

lucab commented Sep 21, 2019

Strong plus on this, overall. Some comments on the approach though:

  • on CL this is historically called aleph-version. I would prefer sticking to that naming as it is both more explicit and not coupled to the build-tool
  • as per "Make aleph-version more accessible" (bugs#2232), this used to live under /var and was handled by the update machinery on first boot.

I'm not sure about all the pros and cons, but I did like the runtime approach. I think it would also play better if/when we support blowing the FS.

@cgwalters
Member Author

> on CL this is historically called aleph-version. I would prefer sticking to that naming as it is both more explicit and not coupled to the build-tool

OK, done.

> I'm not sure about all the pros and cons, but I did like the runtime approach. I think it would also play better if/when we support blowing the FS.

Hmm. This to me is about knowing the true initial state; the original pristine disk image. If the user uses Ignition to repartition on boot, this would be carried over but not changed - we'd know the original filesystem from the version, and we'd know the new filesystem from their Ignition config.
So having it written statically seems right to me. (If we wrote it dynamically we'd need a source for it, and where would that be?)

@cgwalters changed the title from "build: Add /boot/.coreos-assembler-buildid to target" to "build: Add /.coreos-aleph-version.json to target" on Sep 23, 2019
@@ -173,7 +173,7 @@ if [ -z "${use_anaconda}" ]; then
 ref_arg=${commit}
 fi

-runvm -drive "if=virtio,id=target,format=${image_format},file=${path}.tmp" -- /usr/lib/coreos-assembler/create_disk.sh /dev/vda "$ostree_repo" "${ref_arg}" "${ostree_remote}" /usr/lib/coreos-assembler/grub.cfg "$name" "${save_var_subdirs}" "\"$kargs\""
+runvm -drive "if=virtio,id=target,format=${image_format},file=${path}.tmp" -- /usr/lib/coreos-assembler/create_disk.sh /dev/vda "${img}" "$ostree_repo" "${ref_arg}" "${ostree_remote}" /usr/lib/coreos-assembler/grub.cfg "$name" "${save_var_subdirs}" "\"$kargs\""
Member

Shouldn't we be passing ${build} here?

Member Author

I think this is a "why not both?" situation - the imageid carries the platform and OS name too; it's completely unambiguous. But I can see the pure buildid being useful too so people don't have to parse the imageid. Added both.

cat > rootfs/.coreos-aleph-version.json << EOF
{
"build": "${buildid}",
"ostree-commit": "${ostree_commit}"
Member

Let's also add ostree-ref here? Since that also encodes the stream they started on (esp. since build IDs are not unique per stream).
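
For readers following the review, here is a minimal sketch of what writing the file with all the keys discussed could look like. The `buildid` and `ostree_commit` variable names come from the diff fragment above; the `ref`, `imgid`, `rootfs`, and `aleph` variables are assumptions made for illustration, and the values are copied from the example output posted later in this thread. The sketch writes into a temporary directory rather than a real target root, and it is not the code that was merged.

```shell
#!/bin/sh
set -eu

# Placeholder inputs; in a real build these would come from the build metadata.
buildid="30.20190923.dev.2-2"
ref="fedora/x86_64/coreos/testing-devel"
ostree_commit="93244e2568e83f26fe6ab40bb85788dc066d5d18fce2d0c4a773b6ea193b13c5"
imgid="fedora-coreos-30.20190923.dev.2-2-qemu.qcow2"

# Stand-in for the mounted target root (assumption: the real script has its own path).
rootfs=$(mktemp -d)
aleph="${rootfs}/.coreos-aleph-version.json"

# The heredoc delimiter is unquoted so the shell expands the variables.
cat > "${aleph}" << EOF
{
    "build": "${buildid}",
    "ref": "${ref}",
    "ostree-commit": "${ostree_commit}",
    "imgid": "${imgid}"
}
EOF

cat "${aleph}"
```

Because the file is plain JSON, further keys can be added later without breaking readers that only look up the keys they know.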

@jlebon
Member

jlebon commented Sep 23, 2019

Can also probably add a Closes footer for coreos/fedora-coreos-tracker#170.

Member

@jlebon left a comment

LGTM! Will leave open for a bit for folks coming in from coreos/fedora-coreos-tracker#170.

@lucab
Contributor

lucab commented Sep 23, 2019

I think that all these details are part of the commit metadata, so we could add logic somewhere (rpm-ostree / pivot / zincati) to dump them to disk on first-boot (i.e. if they are not on disk already).
That would provide a uniform public interface to answer the question "from which OS version did this node start?", even if the underlying disk/FS is replaced via Ignition.

@cgwalters
Member Author

> I think that all these details are part of the commit metadata,

Mmmm, no, the buildid isn't, because coreos-assembler currently supports generating new images without rebuilding the ostree commit.

Second, the underlying full disk name (e.g. fedora-coreos-30.20190716.1-qemu.qcow2) definitely is not part of the OSTree commit, as that would require per-platform commits.

> That would provide a uniform public interface to answer the question "from which OS version did this node start?", even if the underlying disk/FS is replaced via Ignition.

This is a uniform public interface, right?

> even if the underlying disk/FS is replaced via Ignition.

As I noted above, if this lands we can certainly change the in-progress rootfs redeploy PR to carry forward this information too.

@cgwalters
Member Author

(We could also stick this file in /ostree too I guess which would mean it's naturally lifecycled with that...I am liking that idea and will do soon if no one objects)

@cgwalters
Member Author

Example build:

[core@coreos ~]$ cat /sysroot/.coreos-aleph-version.json 
{
	"build": "30.20190923.dev.2-2",
	"ref": "fedora/x86_64/coreos/testing-devel",
	"ostree-commit": "93244e2568e83f26fe6ab40bb85788dc066d5d18fce2d0c4a773b6ea193b13c5",
	"imgid": "fedora-coreos-30.20190923.dev.2-2-qemu.qcow2"
}
[core@coreos ~]$ 
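
As a rough sketch of how a script on the booted host might consume this file, the helper below pulls a single string value out of it using only `sed`. The function name `aleph_get` is hypothetical, and the approach assumes the flat, one-key-per-line layout shown in the example output above; a real JSON parser such as `jq`, where installed, would be the more robust choice.

```shell
#!/bin/sh
# Hypothetical helper: print the string value of a top-level key from a flat,
# one-key-per-line JSON file such as the aleph version file.
# Usage: aleph_get KEY [FILE]
aleph_get() {
    key=$1
    file=${2:-/sysroot/.coreos-aleph-version.json}
    # Match:  "KEY": "VALUE"   and print only VALUE.
    sed -n "s/^[[:space:]]*\"${key}\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$file"
}

# Only query the real path when it exists, so the sketch is harmless elsewhere.
if [ -r /sysroot/.coreos-aleph-version.json ]; then
    echo "started from build: $(aleph_get build)"
    echo "started from ref:   $(aleph_get ref)"
fi
```

The optional second argument makes it easy to point the helper at a copy of the file for testing.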

@cgwalters
Member Author

cgwalters commented Sep 23, 2019

We had a realtime chat about this; I think the summary is in:

kaeso[m]> @kaeso:matrix.org and, if possible, "let's try to have a single owner for each key"
10:40 that's why I was suggesting to use the ostree-metadata for everything that is platform and build independent
10:41 and, for the remaining missing keys, inject only those in the image, in a dedicated place
10:43 like, we are already injecting the platform ID, but I'm not sure how to answer the "which was platform when this node was first installed"
10:44 (and from that I can directly lookup all the image details in the external metadata)

It's worth elaborating on all of this because it sheds light on the current build process and design.

A core issue here is that we can't add this metadata to the ostree commit itself, because that would cause the commit hash to differ, and further, by default ostree will end up garbage collecting the original commit object.

Now, ostree does have detached metadata which we could use for this, but it'd be a bit weird to do so I think because that metadata wouldn't exist in the upstream repo. (Using detached metadata would also require not GC'ing the original commit)

We also discussed the fact that the imgid includes the platform ID - but note that in 4.2 there's already a case where we have two images (metal-bios and metal-uefi) that have the same platform ID. Further, it seems likely at some point we'll e.g. add a metal-4k-sector image but that would have the same metal platform ID.

Or to summarize: This PR is only adding information which does not exist in the current image. (Or exists but is subject to change/GC)

@cgwalters
Member Author

Updated to add a comment describing the keys.

@cgwalters
Member Author

I guess though a question is whether it's worth trying to plumb through support for an "original commit" into OSTree. It's possible; but crosses several layers in the design. OSTree for example doesn't have support for "tags" or "immutable refs" which we'd clearly want for this. And even if we did that I think it'd be an extension/enhancement to this PR; we can't store the metadata in commit objects, and detached metadata is weird for this.

@lucab
Contributor

lucab commented Sep 23, 2019

@cgwalters thanks for summarizing, these are all interesting points.

From my side, as an OS consumer, I'd need some aleph-version document to answer the following questions:

  • on first boot, what was the OS version?
  • on first boot, what was the updates stream?
  • on first boot, what was the basearch?
  • on first boot, what was the platform?
  • on first boot, what was the artifact this node booted from?

The reason I need this information is to use it as foreign keys in order to walk the *COS metadata, e.g. starting from https://builds.coreos.fedoraproject.org/prod/streams/testing/releases.json.
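
To make the "foreign keys" idea concrete, here is a small sketch of how several of those answers could be derived from the keys in the example output earlier in the thread. The ref layout `<os>/<basearch>/coreos/<stream>` and the image name layout `<name>-<build>-<variant>.<extension>` are assumptions inferred from that single example, not documented guarantees, and the variable names are illustrative.

```shell
#!/bin/sh
set -eu

# Values copied from the example aleph file shown earlier in the thread.
build="30.20190923.dev.2-2"
ref="fedora/x86_64/coreos/testing-devel"
imgid="fedora-coreos-30.20190923.dev.2-2-qemu.qcow2"

# Assumed ref layout: <os>/<basearch>/coreos/<stream>
stream=${ref##*/}        # text after the last slash
rest=${ref#*/}           # drop the leading "<os>/"
basearch=${rest%%/*}     # text up to the next slash

# Assumed imgid layout: <name>-<build>-<variant>.<extension>
suffix=${imgid#*"${build}-"}   # "qemu.qcow2"
variant=${suffix%%.*}          # "qemu"

printf 'version=%s stream=%s basearch=%s variant=%s artifact=%s\n' \
    "$build" "$stream" "$basearch" "$variant" "$imgid"
```

Note that the image variant is not always the platform ID: as mentioned above, metal-bios and metal-uefi images share the same platform ID, so the platform would still need its own source.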

@cgwalters
Member Author

> The reason I need this information is to use it as foreign keys in order to walk the *COS metadata, e.g. starting from https://builds.coreos.fedoraproject.org/prod/streams/testing/releases.json.

Soo...I think such a thing would require having the fedora-coreos-pipeline pass extended metadata down into cosa. Something like cosa build --metadata=stream=testing-devel; and actually gets into another interesting issue around "official" vs "unofficial" builds. Today running cosa build locally produces images named fedora-coreos-* for example.

What do we expect to happen with our services like pinger when doing a cosa run from a local build?

We can try to work through those things; I'm a bit uncertain whether we should try to solve all of that before landing this PR. Not opposed, but given we're just adding JSON and it's easily extensible, I'd lean towards doing it as a second round after we've thought about some of the above issues.

@cgwalters
Member Author

(Very quickly here on the above topic; I've been thinking about having cosa build automatically inject a -dev suffix, so it'd be fedora-coreos-dev-* and only omit it with cosa build --official or something)

@jlebon
Member

jlebon commented Sep 23, 2019

> Soo...I think such a thing would require having the fedora-coreos-pipeline pass extended metadata down into cosa. Something like cosa build --metadata=stream=testing-devel; and actually gets into another interesting issue around "official" vs "unofficial" builds. Today running cosa build locally produces images named fedora-coreos-* for example.
>
> What do we expect to happen with our services like pinger when doing a cosa run from a local build?
>
> We can try to work through those things

The commit messages and linkbacks in this PR should mostly explain the current strategy on how FCOS deals with this today. The TL;DR is:

  • local dev builds have a .dev suffix to their versions, so we know they're not official (the fact that the testing-devel and bodhi-updates streams built in CentOS CI have .dev suffixes right now is due to the fact that we're not yet driving versioning)
  • Pinger and Zincati are disabled based on this
  • Pinger and Zincati are also disabled on all the streams except testing (compare the manifest.yaml across all three branches)

Note also we do inject the stream name through the commit metadata (see coreos/fedora-coreos-config#110). We can do this because we don't promote OSTree commits, we promote fedora-coreos-config content + lockfiles (which reminded me that I forgot to implement one piece of feedback you had around that: coreos/fedora-coreos-releng-automation#42).
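
As a purely illustrative sketch of the `.dev` convention described above, a script could classify a version string like this. The function name `is_dev_build` is hypothetical, and this is not how FCOS itself disables Pinger or Zincati; the two sample versions are taken from image names quoted elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a version string carries a ".dev" marker.
is_dev_build() {
    case $1 in
        *.dev*) return 0 ;;
        *)      return 1 ;;
    esac
}

for v in 30.20190923.dev.2-2 30.20190716.1; do
    if is_dev_build "$v"; then
        echo "$v: dev build"
    else
        echo "$v: no .dev marker"
    fi
done
```

The pattern matches `.dev` anywhere in the string because the example build ID places it in the middle of the version rather than at the very end.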

It will be very useful in the future to be able to more rigorously
know the state that a given machine *started* from.  For example,
small tweaks like `chattr +i /sysroot` are things that won't
happen for in-place updates.

The term "aleph" here means "start".

If we decide to introduce a mechanism (e.g. systemd
unit) that performs those changes even for old in-place installs,
it could be useful to know exactly what the starting state was.

Note this ends up in the *physical* storage root `/` which
appears as `/sysroot` when booted.

Closes: coreos/fedora-coreos-tracker#170
@cgwalters
Member Author

> The commit messages and linkbacks in this PR should mostly explain the current strategy on how FCOS deals with this today

OK right, thanks. Having the dev in the default git branch I guess is OK, but it also weds one to the idea of "stream git branches" and lockfiles etc.

Anyways, rebased 🏄‍♂️ and comments addressed!

@jlebon jlebon merged commit 9c0558f into coreos:master Sep 24, 2019
# like "exactly what mkfs.xfs version was used" we can do
# that via looking at the upstream build and finding the
# build logs for it, getting the coreos-assembler version,
# and getting the `rpm -qa` from that.
Member

> finding the build logs for it, getting the coreos-assembler version, and getting the rpm -qa from that.

Is that true though? Most likely, except for recent builds, the cosa image would have been GC'ed long ago. We should probably include a buildroot.txt file or something similar in the build dir, listing all the packages in the container.

Another approach is building cosa using lockfiles (and storing them in this repo), which has nice properties though I don't think it's worth the overhead right now.
