-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move OSTree commit into build directory #515
Conversation
Requires: coreos/rpm-ostree#1829. |
4a3bdfb
to
9e7f2a0
Compare
OK, this is working pretty well! Though I still need to adapt all the other commands that read in things from the OSTree repo. |
if [ "${commit}" == "${previous_commit}" ]; then | ||
cp -a --reflink=auto "${previous_builddir}/ostree-commit.tar" . | ||
else | ||
ostree init --repo=repo --mode=archive |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should keep around cache/repo-archive
. This way we don't pay the recompression penalty every build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or hmm...if we teach anaconda to pull from the bare-user
repo directly then we don't even need an archive repo at all, but I think that may run into issues with xattrs and 9p; we could also do a custom webserver that did dynamic gzip -1
for objects and faked up being an archive repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should be good with this now. pull-local
between archive repos just hardlinks (given that tmp/
is on the same filesystem as builds/
, which I think we unofficially assume right now).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let me sketch out the flow now with this latest revision:
rpm-ostree compose tree
commits intotmp/repo
, so it performs compression there during the internal "pull-local
" when importing from the temporary bare-user repo.- by default, we keep the last 3 builds in
tmp/repo
, so during new composes we only pay for compression of new objects (and if the repo is wiped out, we reseed it from the last build). - prepping the final
ostree-commit.tar
is apull-local
between two archive repos, so those are just hardlinks.
# Don't compress; archive repos are already compressed individually and we'd | ||
# gain ~20M at best. We could probably have better gains if we compress the | ||
# whole repo in bare/bare-user mode, but that's a different story... | ||
tar -cf ostree-commit.tar -C repo . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simple enough. But one downside of this is that today libostree doesn't know how to pull "tarball of ostree repo" and I can see wanting that. We could teach it...wouldn't be too hard. But maybe worth considering using a static delta?
Actually in general I think we should make this a config option as I'm not sure we necessarily want it for RHCOS since we have the oscontainers. (Maybe. Data is cheap assuming we have GC)
(And making it a config option would raise the question of a config file)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually in general I think we should make this a config option as I'm not sure we necessarily want it for RHCOS since we have the oscontainers.
Hmm, how about a --delete-ostree-repo
flag to oscontainer
to delete the repo and just leave the ostree-commit-object
behind once the container is successfully built?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also just leave the repo directory, but make the commit partial so we just keep the same logic when searching for the commit object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now that we do keep the ostree-commit-object
again, let me reword and amend my previous proposal: since oscontainer
produces something very similar already (a tarball containing the repo), how about (1) we make it ${workdir}
aware so rather than passing a repo path, we pass it a build id (or default to latest) like other commands (see also this comment about this), and (2) give it a --consume
flag which deletes the tarball once it's done?
9e7f2a0
to
0f75ec2
Compare
OK, updated this now to just keep the OSTree repo as a directory in the build dir. One result of this is that we drop the It also makes it clear that we're not recompressing objects between builds. I.e. we only ever pay compression costs for new objects for each build when we |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so this flips around the tradeoffs; the archive repo can be fetched directly, but having a repository-per-build is going to take a much bigger hit storage-wise. The small files aspect of OSTree repos is going to be multiplied.
Uploading in a naïve way to S3 won't take advantage of hardlinks for example...and uploads are going to be slow.
And everyone who is using OSTree repositories to deliver content is going to want to sync content to a centralized repo. How that's done on top of this is possible but feels nontrivial; feels like one would be fighting the system.
See #159 (comment) for some numbers around these concerns. Disk size and upload speeds themselves aren't actually that much worse. But managing the repos themselves in S3 does become much more painful. Going to convert this back to use tarballs now... sorry for the flip-flopping around this. 🌀 At least we have a better understanding now of the tradeoffs! |
0f75ec2
to
4ecdfd0
Compare
OK, this is ready for review now! Really, of the The other big user of course is |
a201d1f
to
0189537
Compare
interested in cutting a new release of rpm-ostree rpm to get |
New release pending in coreos/rpm-ostree#1841. |
0189537
to
df3bcf5
Compare
I haven't drafted release notes nor tagged it upstream yet, though I did build it with the intent of tagging it into the continuous repo ahead of time. But it looks like the ppc64le build failed with an interesting error. Anyway, will look into this tomorrow, but at least for x86_64 you can use https://koji.fedoraproject.org/koji/taskinfo?taskID=34984167 to test out this patch. |
that's obviously not what we want though :) |
Yeah, hence the Anaconda issue. That said this would be somewhat easy to patch via libguestfs afterwards (just edit the |
Yeah. I was confused before. I thought the issue was that it didn't work at all (i.e. anaconda error) and not that it just didn't set up the ref right.
Yeah we can even do this in the kickstart %post. We've done it in the past for FAH: https://pagure.io/fedora-kickstarts/blob/f29/f/fedora-atomic-updates-base.ks#_12 |
another thing to note is that without a parent we lose our |
Right now, we were passing `${commit}` to Anaconda in the buildextend-metal case, but that meant that even if the ref wasn't temporary, we would end up with a checksum refspec. We were also passing `${ref:-${commit}}` for the qcow2, which would always evaluate to `${ref}` since even if the manifest defines no ref, we create the temporary one. Instead, always give `${ref}` to `run_virtinstall` and add logic there to "dereference" the ref only if it's temporary. This is the case for RHCOS, though it doesn't actually matter much for that one since it "prepivots" images anyway. Though this does fix the case of FCOS bare metal images having a checksum refspec. This does regress the `buildextend-metal` case when targeting a build that's not the latest by feeding it the latest ref instead of a specific commit. This will be fixed in an upcoming patch.
Prep for next patch.
28ceb74
to
b909dc7
Compare
OK, Anaconda/ |
b909dc7
to
5bf4921
Compare
OK, this now requires coreos/rpm-ostree#1844. I initially went down the road of teaching Hmm, actually, this is an issue that exists today too with a central |
5bf4921
to
036197f
Compare
this LGTM - one suggested change to add a comment (I found this useful when testing this change). I'm about to send the rpm to the continuous repo so the new COSA will be built soon and will have |
Rather than keeping OSTree data separately in the toplevel `repo/`, make it part of the build directory. This solves a bunch of issues and makes things conceptually clearer. See discussions in: coreos#159
036197f
to
64e89d7
Compare
most recent changes LGTM |
Note for those of you following along at home you'll need to update your COSA container (pull from quay.io) after this change. |
Minor regression from coreos#515.
Don't rsync `repo/` back and forth anymore since it's no longer used in the latest cosa: coreos/coreos-assembler#515
Don't rsync `repo/` back and forth anymore since it's no longer used in the latest cosa: coreos/coreos-assembler#515
Don't rsync `repo/` back and forth anymore since it's no longer used in the latest cosa: coreos/coreos-assembler#515
Rather than keeping OSTree data separately in the toplevel
repo/
, makeit part of the build directory. This solves a bunch of issues and makes
things conceptually clearer.
See discussions in:
#159
Still testing this, there might be some more backcompat polish needed.