Move OSTree commit into build directory #515

Merged
dustymabe merged 3 commits into coreos:master from pr/ostree-in-builddir
May 24, 2019

Conversation

jlebon
Member

@jlebon jlebon commented May 8, 2019

Rather than keeping OSTree data separately in the toplevel repo/, make
it part of the build directory. This solves a bunch of issues and makes
things conceptually clearer.

See discussions in:
#159

Still testing this, there might be some more backcompat polish needed.

@jlebon jlebon added the WIP PR still being worked on label May 8, 2019
@jlebon
Member Author

jlebon commented May 8, 2019

Requires: coreos/rpm-ostree#1829.

@jlebon jlebon force-pushed the pr/ostree-in-builddir branch 2 times, most recently from 4a3bdfb to 9e7f2a0 Compare May 8, 2019 21:07
@jlebon jlebon added WIP PR still being worked on and removed WIP PR still being worked on labels May 8, 2019
@jlebon
Member Author

jlebon commented May 8, 2019

OK, this is working pretty well! Though I still need to adapt all the other commands that read in things from the OSTree repo.

if [ "${commit}" == "${previous_commit}" ]; then
    cp -a --reflink=auto "${previous_builddir}/ostree-commit.tar" .
else
    ostree init --repo=repo --mode=archive
Member

I think we should keep around cache/repo-archive. This way we don't pay the recompression penalty every build.

Member

Or hmm... if we teach Anaconda to pull from the bare-user repo directly, then we don't even need an archive repo at all, though I think that may run into issues with xattrs and 9p. We could also do a custom webserver that does dynamic gzip -1 for objects and fakes being an archive repo.

Member Author

I think we should be good with this now. pull-local between archive repos just hardlinks (given that tmp/ is on the same filesystem as builds/, which I think we unofficially assume right now).

Member Author

OK, let me sketch out the flow now with this latest revision:

  • rpm-ostree compose tree commits into tmp/repo, so it performs compression there during the internal "pull-local" when importing from the temporary bare-user repo.
  • by default, we keep the last 3 builds in tmp/repo, so during new composes we only pay for compression of new objects (and if the repo is wiped out, we reseed it from the last build).
  • prepping the final ostree-commit.tar is a pull-local between two archive repos, so those are just hardlinks.
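
The hardlink step in the flow above only works if `tmp/` and `builds/` share a filesystem. A minimal sketch of how that assumption could be checked, using GNU `stat` and a scratch directory standing in for the workdir (the `same_filesystem` helper and the demo layout are illustrative, not part of cosa):

```shell
# Return success if both paths live on the same filesystem (same device ID).
# Relies on GNU stat's -c format option.
same_filesystem() {
    [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

# Scratch directory standing in for the workdir (hypothetical layout).
demo_workdir=$(mktemp -d)
mkdir -p "${demo_workdir}/tmp" "${demo_workdir}/builds"

if same_filesystem "${demo_workdir}/tmp" "${demo_workdir}/builds"; then
    echo "tmp/ and builds/ share a filesystem: hardlinks are possible"
else
    echo "tmp/ and builds/ are on different filesystems: objects would be copied"
fi

# A hardlinked "object" shares its inode with the original, so it costs
# no extra data blocks.
echo "payload" > "${demo_workdir}/tmp/object"
ln "${demo_workdir}/tmp/object" "${demo_workdir}/builds/object"
src_inode=$(stat -c %i "${demo_workdir}/tmp/object")
dst_inode=$(stat -c %i "${demo_workdir}/builds/object")
echo "inodes: ${src_inode} ${dst_inode}"
```

If the two directories ever ended up on different filesystems, the same pull would presumably fall back to copying objects, which is where the extra cost would show up.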

# Don't compress; archive repos are already compressed individually and we'd
# gain ~20M at best. We could probably have better gains if we compress the
# whole repo in bare/bare-user mode, but that's a different story...
tar -cf ostree-commit.tar -C repo .
Member

Simple enough. But one downside of this is that today libostree doesn't know how to pull "tarball of ostree repo" and I can see wanting that. We could teach it...wouldn't be too hard. But maybe worth considering using a static delta?

Actually in general I think we should make this a config option as I'm not sure we necessarily want it for RHCOS since we have the oscontainers. (Maybe. Data is cheap assuming we have GC)

(And making it a config option would raise the question of a config file)

Member Author

> Actually in general I think we should make this a config option as I'm not sure we necessarily want it for RHCOS since we have the oscontainers.

Hmm, how about a --delete-ostree-repo flag to oscontainer to delete the repo and just leave the ostree-commit-object behind once the container is successfully built?

Member Author

We could also just leave the repo directory, but make the commit partial so we just keep the same logic when searching for the commit object.

Member Author

So now that we do keep the ostree-commit-object again, let me reword and amend my previous proposal: since oscontainer produces something very similar already (a tarball containing the repo), how about (1) we make it ${workdir} aware so rather than passing a repo path, we pass it a build id (or default to latest) like other commands (see also this comment about this), and (2) give it a --consume flag which deletes the tarball once it's done?
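
To make the `--consume` idea concrete, here is a rough sketch of the behavior it describes: unpack the tarball, run something against the unpacked repo, and delete the tarball only on success. The `consume_tarball` helper is hypothetical and stands in for whatever oscontainer would do internally; nothing here is existing cosa code.

```shell
# Hypothetical helper: unpack a repo tarball, run a command against the
# unpacked repo, and delete the tarball only if that command succeeded.
# usage: consume_tarball TARBALL COMMAND [ARGS...]
consume_tarball() {
    local tarball="$1"; shift
    local scratch rc=0
    scratch=$(mktemp -d)
    if ! tar -xf "${tarball}" -C "${scratch}"; then
        rm -rf "${scratch}"
        return 1
    fi
    # The command receives the unpacked repo path as its last argument.
    "$@" "${scratch}" || rc=$?
    rm -rf "${scratch}"
    if [ "${rc}" -eq 0 ]; then
        rm -f "${tarball}"
    fi
    return "${rc}"
}

# Demo with a placeholder "repo" containing a single file.
demo=$(mktemp -d)
mkdir "${demo}/repo"
echo "placeholder" > "${demo}/repo/config"
tar -cf "${demo}/ostree-commit.tar" -C "${demo}/repo" .
consume_tarball "${demo}/ostree-commit.tar" ls
```

Keeping the tarball when the command fails means a failed container build can simply be retried without redoing the compose.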

@jlebon jlebon force-pushed the pr/ostree-in-builddir branch from 9e7f2a0 to 0f75ec2 Compare May 13, 2019 20:43
@jlebon
Member Author

jlebon commented May 13, 2019

OK, updated this now to just keep the OSTree repo as a directory in the build dir. One result of this is that we drop the ostree-commit-object completely. Adapting commands like buildextend-installer becomes trivial as well since we just need to point it to the build dir repo (rather than previously having to potentially untar first).

It also makes it clear that we're not recompressing objects between builds. I.e. we only ever pay compression costs for new objects for each build when we pull-local between the bare-user repo and the archive repo.

Member

@cgwalters cgwalters left a comment

OK, so this flips around the tradeoffs; the archive repo can be fetched directly, but having a repository-per-build is going to take a much bigger hit storage-wise. The small files aspect of OSTree repos is going to be multiplied.

Uploading in a naïve way to S3 won't take advantage of hardlinks for example...and uploads are going to be slow.

And everyone who is using OSTree repositories to deliver content is going to want to sync content to a centralized repo. How that's done on top of this is possible but feels nontrivial; feels like one would be fighting the system.
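
The hardlink concern above can be illustrated with a small sketch: locally, two per-build repos that hardlink a shared object store it once, but a tool that uploads path by path sees two separate files. The directory layout and helper names below are made up for the demo, and `find -printf` is a GNU extension.

```shell
# Count file paths versus distinct inodes under a directory tree.
# Uses GNU find's -printf.
count_paths()  { find "$1" -type f | wc -l | tr -d ' '; }
count_inodes() { find "$1" -type f -printf '%i\n' | sort -u | wc -l | tr -d ' '; }

demo_builds=$(mktemp -d)
mkdir -p "${demo_builds}/1/repo" "${demo_builds}/2/repo"
echo "object data" > "${demo_builds}/1/repo/object"
# The second build's repo hardlinks the unchanged object from the first.
ln "${demo_builds}/1/repo/object" "${demo_builds}/2/repo/object"

echo "paths:  $(count_paths "${demo_builds}")"
echo "inodes: $(count_inodes "${demo_builds}")"
```

Here there are two paths but only one inode, so local disk usage stays flat while a per-path upload would transfer the object twice.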

@jlebon jlebon mentioned this pull request May 15, 2019
@jlebon
Member Author

jlebon commented May 15, 2019

> OK, so this flips around the tradeoffs; the archive repo can be fetched directly, but having a repository-per-build is going to take a much bigger hit storage-wise. The small files aspect of OSTree repos is going to be multiplied.
>
> Uploading in a naïve way to S3 won't take advantage of hardlinks for example...and uploads are going to be slow.

See #159 (comment) for some numbers around these concerns. Disk size and upload speeds themselves aren't actually that much worse. But managing the repos themselves in S3 does become much more painful.

Going to convert this back to use tarballs now... sorry for the flip-flopping around this. 🌀 At least we have a better understanding now of the tradeoffs!

@jlebon jlebon force-pushed the pr/ostree-in-builddir branch from 0f75ec2 to 4ecdfd0 Compare May 16, 2019 18:58
@jlebon jlebon marked this pull request as ready for review May 16, 2019 18:58
@jlebon jlebon removed the WIP PR still being worked on label May 16, 2019
@jlebon jlebon changed the title WIP: Move OSTree commit into build directory Move OSTree commit into build directory May 16, 2019
@jlebon
Member Author

jlebon commented May 16, 2019

OK, this is ready for review now! Really, of the buildextend commands, only buildextend-installer was directly digging into the repo. Adapted that.

The other big user of course is oscontainer, though the script actually takes in the repo path itself, so we'll want to change the pipeline code calling it to point to ./tmp/repo instead of ./repo. Or we can change it here so it just assumes $PWD is ${workdir} and make it use the same function as buildextend-installer, which has fallback logic in case it's not called directly after a cosa build.
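
A rough sketch of what such fallback logic could look like, assuming the layout discussed in this PR (`tmp/repo` as the working repo, `builds/<buildid>/ostree-commit.tar` as the stored tarball). The `ensure_repo` name is hypothetical, and it only checks that a repo directory exists; a real implementation would presumably ask ostree whether the specific commit is present.

```shell
# Hypothetical helper: print the path of a usable repo for a build.
# Prefer the working repo left behind by a build; otherwise unpack the
# build's tarball into it.
# usage: ensure_repo WORKDIR BUILDID
ensure_repo() {
    local workdir="$1" buildid="$2"
    local repo="${workdir}/tmp/repo"
    local tarball="${workdir}/builds/${buildid}/ostree-commit.tar"
    if [ ! -d "${repo}/objects" ]; then
        [ -f "${tarball}" ] || { echo "no tarball for build ${buildid}" >&2; return 1; }
        mkdir -p "${repo}"
        tar -xf "${tarball}" -C "${repo}"
    fi
    echo "${repo}"
}

# Demo: a workdir with no tmp/repo yet, only a stored tarball.
demo_workdir=$(mktemp -d)
mkdir -p "${demo_workdir}/src/objects" "${demo_workdir}/builds/demo-build"
tar -cf "${demo_workdir}/builds/demo-build/ostree-commit.tar" -C "${demo_workdir}/src" .
repo=$(ensure_repo "${demo_workdir}" demo-build)
echo "repo is at ${repo}"
```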

@jlebon jlebon force-pushed the pr/ostree-in-builddir branch 4 times, most recently from a201d1f to 0189537 Compare May 19, 2019 17:34
@dustymabe dustymabe self-requested a review May 20, 2019 16:11
@dustymabe
Member

error: Unknown option --no-parent

Interested in cutting a new release of the rpm-ostree RPM to get --no-parent in? We can fast-track it using the continuous tag.

@jlebon
Member Author

jlebon commented May 21, 2019

New release pending in coreos/rpm-ostree#1841.

@jlebon jlebon force-pushed the pr/ostree-in-builddir branch from 0189537 to df3bcf5 Compare May 21, 2019 21:42
@jlebon
Member Author

jlebon commented May 21, 2019

I haven't drafted release notes nor tagged it upstream yet, though I did build it with the intent of tagging it into the continuous repo ahead of time. But it looks like the ppc64le build failed with an interesting error. Anyway, will look into this tomorrow, but at least for x86_64 you can use https://koji.fedoraproject.org/koji/taskinfo?taskID=34984167 to test out this patch.

@dustymabe
Member

ostree://fedora-coreos:809177f33a136c549777e770b821df71aa7092540661ecc49f857a1845c519b3

that's obviously not what we want though :)

@cgwalters
Member

cgwalters commented May 22, 2019

Yeah, hence the Anaconda issue.

That said this would be somewhat easy to patch via libguestfs afterwards (just edit the .origin file), even if not very elegant.

@dustymabe
Member

> Yeah, hence the Anaconda issue.

Yeah. I was confused before. I thought the issue was that it didn't work at all (i.e. anaconda error) and not that it just didn't set up the ref right.

> That said this would be somewhat easy to patch via libguestfs afterwards (just edit the .origin file), even if not very elegant.

Yeah we can even do this in the kickstart %post. We've done it in the past for FAH: https://pagure.io/fedora-kickstarts/blob/f29/f/fedora-atomic-updates-base.ks#_12

@dustymabe
Member

another thing to note is that without a parent we lose our pkgdiff entry from the output build meta.json.

jlebon added 2 commits May 23, 2019 11:13
Until now, we were passing `${commit}` to Anaconda in the
buildextend-metal case, but that meant that even if the ref wasn't
temporary, we would end up with a checksum refspec.

We were also passing `${ref:-${commit}}` for the qcow2, which would
always evaluate to `${ref}` since even if the manifest defines no ref,
we create the temporary one.

Instead, always give `${ref}` to `run_virtinstall` and add logic there
to "dereference" the ref only if it's temporary. This is the case for
RHCOS, though it doesn't actually matter much for that one since it
"prepivots" images anyway.

Though this does fix the case of FCOS bare metal images having a
checksum refspec.

This does regress the `buildextend-metal` case when targeting a build
that's not the latest by feeding it the latest ref instead of a specific
commit. This will be fixed in an upcoming patch.
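
As an illustration of the commit message above: `${ref:-${commit}}` only falls back to the commit when `ref` is unset or empty, so once a temporary ref is always created the fallback never triggers. The sketch below shows that, plus one possible shape for the "dereference only if temporary" logic. The ref name, the shortened checksum, and the `refspec_for_install` helper with its flag argument are all hypothetical.

```shell
ref="fedora/x86_64/coreos/demo"   # hypothetical ref name
commit="0123abcd"                 # hypothetical, shortened checksum

# ${ref:-${commit}} yields ${commit} only when ref is unset or empty,
# so with a ref always defined it always yields the ref.
echo "${ref:-${commit}}"

# Hypothetical helper: hand back the commit only for temporary refs.
# usage: refspec_for_install REF COMMIT REF_IS_TEMP
refspec_for_install() {
    local ref="$1" commit="$2" ref_is_temp="$3"
    if [ "${ref_is_temp}" = 1 ]; then
        echo "${commit}"
    else
        echo "${ref}"
    fi
}

refspec_for_install "${ref}" "${commit}" 0   # real ref: keep the ref
refspec_for_install "${ref}" "${commit}" 1   # temporary ref: use the commit
```
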
@jlebon jlebon force-pushed the pr/ostree-in-builddir branch from 28ceb74 to b909dc7 Compare May 23, 2019 15:30
@jlebon
Member Author

jlebon commented May 23, 2019

OK, Anaconda/buildextend-metal should be fixed now! ⬆️
Looking at restoring pkgdiff.

@jlebon jlebon force-pushed the pr/ostree-in-builddir branch from b909dc7 to 5bf4921 Compare May 23, 2019 18:09
@jlebon
Member Author

jlebon commented May 23, 2019

OK, this now requires coreos/rpm-ostree#1844.

I initially went down the road of teaching rpm-ostree compose tree to still print a package diff even if --no-parent is given. But (1) it didn't quite feel right to do this, and (2) it doesn't actually do the correct thing: imagine doing a build, then a new build which creates a new OSTree, but fails the Anaconda run; the next build would compare against the "intermediate" commit rather than the pkglist from the last build.

Hmm, actually, this is an issue that exists today too with a central ostree/ repo. Teaching db diff to output JSON is just more explicit.
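
To illustrate why diffing against the last build's package list is the more robust choice, here is a small stand-in using plain text lists and `comm`. This is not rpm-ostree's JSON format; the file names, package identifiers, and the `added_packages` helper are invented for the example.

```shell
# Print packages present in the new list but not in the old one.
# Both files hold one package identifier per line; comm needs sorted input.
added_packages() {
    local old="$1" new="$2"
    local tmp
    tmp=$(mktemp -d)
    sort "${old}" > "${tmp}/old"
    sort "${new}" > "${tmp}/new"
    comm -13 "${tmp}/old" "${tmp}/new"
    rm -rf "${tmp}"
}

# Demo: compare against the list saved by the last successful build,
# not against whatever commit happens to be the parent.
lists=$(mktemp -d)
printf '%s\n' bash-5.0-1 kernel-5.1.0-1 > "${lists}/last-build.txt"
printf '%s\n' bash-5.0-1 kernel-5.1.1-1 > "${lists}/new-build.txt"
added_packages "${lists}/last-build.txt" "${lists}/new-build.txt"
```

Because the comparison input comes from the previous build's stored metadata, an intermediate commit from a failed build never enters the picture.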

@jlebon jlebon force-pushed the pr/ostree-in-builddir branch from 5bf4921 to 036197f Compare May 23, 2019 21:08
@dustymabe
Member

this LGTM - one suggested change to add a comment (I found this useful when testing this change). I'm about to send the rpm to the continuous repo so the new COSA will be built soon and will have --format=json. Let's merge tomorrow when we know that has happened.

Rather than keeping OSTree data separately in the toplevel `repo/`, make
it part of the build directory. This solves a bunch of issues and makes
things conceptually clearer.

See discussions in:
coreos#159
@jlebon jlebon force-pushed the pr/ostree-in-builddir branch from 036197f to 64e89d7 Compare May 24, 2019 16:26
@dustymabe
Member

most recent changes LGTM

@dustymabe
Member

Note for those of you following along at home: you'll need to update your COSA container (pull from quay.io) after this change.

@dustymabe dustymabe merged commit de8f556 into coreos:master May 24, 2019
jlebon added a commit to jlebon/coreos-assembler that referenced this pull request May 27, 2019
jlebon added a commit that referenced this pull request May 27, 2019
jlebon added a commit to jlebon/fedora-coreos-pipeline that referenced this pull request May 27, 2019
Don't rsync `repo/` back and forth anymore since it's no longer used in
the latest cosa:

coreos/coreos-assembler#515
jlebon added a commit to jlebon/fedora-coreos-pipeline that referenced this pull request May 28, 2019
Don't rsync `repo/` back and forth anymore since it's no longer used in
the latest cosa:

coreos/coreos-assembler#515
@jlebon jlebon deleted the pr/ostree-in-builddir branch July 6, 2020 20:32
3 participants