-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Thin repositories #798
base: master
Are you sure you want to change the base?
Thin repositories #798
Conversation
@@ -2595,7 +2595,12 @@ jail_start() { | |||
|
|||
# May already be set for pkgclean | |||
: ${PACKAGES:=${POUDRIERE_DATA}/packages/${MASTERNAME}} | |||
: ${THIN_PACKAGES:=${POUDRIERE_DATA}/packages/${MASTERNAME}-thin} | |||
if [ ${SMALL_REPO} -eq 1 ]; then | |||
THIN_EXT=-small |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's confusing that the options are minimal and medium, but the extensions are -thin and -small. At a minimum the documentation should mention this. Ideally things should lineup throughout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes but minimal and medium are bad names, think and small are nice, but we can't use proper switches for them, I am open to any suggestion for renaming that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should spell out what these are rather than "thin" or "small". Too bad -f/-F is taken as -f[ilter]
makes sense to me.
-m all # default
-m listed
-m recursive / runtime # This only excludes build dependencies right?
listed/recursive/runtime don't make me(user) think about what "small" means.
There's also terms like 'leaf' and 'automatic' that describe these.
p.s. I cringe every time we add any new flag as it complicates the interface and makes it harder to clean up the interface later.
162328c
to
3513c3a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this but wonder what the use case is and what it buys us? Is this so a user can provide just like 1 custom package and use FreeBSD repo otherwise?
If so I don't think we have the proper infrastructure to make that work seamlessly. From a user perspective, If FreeBSD upgrades/deletes a dep I overrode I'm in trouble. If FreeBSD upgrades a package I overrode, but haven't upgraded, I am maybe in trouble if I forgot to lock or repo priorities go wrong. If a dep is upgraded by FreeBSD and is no longer compat then my installed overrode package breaks. Then there's the same assumption that the package was built from the same dependencies and versions that FreeBSD provides. It's an easy out-of-sync situation.
What's the situation without this feature? Isn't it just about the same as above?
I think what we really need is strong shlib compat saving/management and/or an alternatives-like system that supports multiple versions of packages being installed. In my fantasy world with unlimited space/bandwidth I'd just have fat packages that store their files in hashed dirs to reduce duplication, symlinked from readable named paths.
@@ -200,6 +200,10 @@ when using | |||
Do not skip dependent ports on findings. | |||
This flag is automatically set when using | |||
.Fl at . | |||
.It Fl m | |||
Build a minimal repository or thin repository which only contains the packages | |||
that has been listed to be built, note that this option is incompatible with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/has/have/
s/built, note/built.\nNote/
@@ -7701,15 +7707,69 @@ sign_pkg() { | |||
fi | |||
} | |||
|
|||
add_pkg_to_repo() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs quotes around pathed vars
@@ -2595,7 +2595,12 @@ jail_start() { | |||
|
|||
# May already be set for pkgclean | |||
: ${PACKAGES:=${POUDRIERE_DATA}/packages/${MASTERNAME}} | |||
: ${THIN_PACKAGES:=${POUDRIERE_DATA}/packages/${MASTERNAME}-thin} | |||
if [ ${SMALL_REPO} -eq 1 ]; then | |||
THIN_EXT=-small |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should spell out what these are rather than "thin" or "small". Too bad -f/-F is taken as -f[ilter]
makes sense to me.
-m all # default
-m listed
-m recursive / runtime # This only excludes build dependencies right?
listed/recursive/runtime don't make me(user) think about what "small" means.
There's also terms like 'leaf' and 'automatic' that describe these.
p.s. I cringe every time we add any new flag as it complicates the interface and makes it harder to clean up the interface later.
} | ||
|
||
build_thin_repo() { | ||
# Try to be as atomic as possible in recreating the new thin repo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've aggregated some comments for future lines here.
I don't think it's important to be atomic here as there is already a mechanism for it on by default. ATOMIC_PACKAGE_REPOSITORY
. The directory being acted on here shouldn't be the real repository in that case.
# ls -al /poudriere/data/packages/exp-11amd64-commit-test
total 88
drwxr-xr-x 7 root wheel 16 May 13 15:01 ./
drwxr-x--x 38 root wheel 43 Jun 4 14:58 ../
lrwxr-xr-x 1 root wheel 18 Oct 13 2017 .buildname@ -> .latest/.buildname
lrwxr-xr-x 1 root wheel 20 Oct 13 2017 .jailversion@ -> .latest/.jailversion
lrwxr-xr-x 1 root wheel 16 May 13 15:01 .latest@ -> .real_1589407304
drwxr-xr-x 4 root wheel 10 Apr 22 2020 .real_1587540631/
drwxr-xr-x 4 root wheel 10 Apr 25 2020 .real_1587845826/
drwxr-xr-x 4 root wheel 10 May 4 2020 .real_1588616190/
drwxr-xr-x 4 root wheel 10 May 13 14:58 .real_1589407113/
drwxr-xr-x 4 root wheel 10 May 13 15:01 .real_1589407304/
lrwxr-xr-x 1 root wheel 11 Mar 7 2017 All@ -> .latest/All
lrwxr-xr-x 1 root wheel 14 Mar 7 2017 Latest@ -> .latest/Latest
lrwxr-xr-x 1 root wheel 19 Mar 7 2017 digests.txz@ -> .latest/digests.txz
lrwxr-xr-x 1 root wheel 17 Mar 23 2020 meta.conf@ -> .latest/meta.conf
lrwxr-xr-x 1 root wheel 16 Mar 7 2017 meta.txz@ -> .latest/meta.txz
lrwxr-xr-x 1 root wheel 23 Mar 7 2017 packagesite.txz@ -> .latest/packagesite.txz
There's not much reason to disable that feature. We disabled on the cluster systems because pkgsync expects a certain hardlink count. That could be reworked.
The way it works is, for example, bulk
currently copies (cp -al
hardlinks) the .latest
into a .building
. Then runs build_repo
against that directory. Then later in bulk.sh commit_packages
is ran which flips everything around so .building
becomes the new real/latest directory.
Lots of ways to go about this but say for this feature we keep the All -> .latest
symlinks all alone. Add a /thin or /small dir that symlinks into a .real_TS_filtered directory. Then on client systems you add /thin to the pkg repo name. Generating the directory would just be copying (hardlinks) the .building directory to .filter and removing unexpected packages (which sounds like pkgclean
which is why I think most of this should be there).
git grep -F .building
has some of the relevant places.
if [ ${THIN_REPO} -eq 1 ]; then | ||
msg "Creating thin pkg repository" | ||
packages=${THIN_PACKAGES} | ||
else | ||
msg "Creating pkg repository" | ||
packages=${PACKAGES} | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This confuses me as a user. If I do a --thin
build it will update packages in both directories but only the 'thin' one has its pkg files updated. There's no clear path to resolve that or to filter an existing repository down to the small/thing (hint: pkgclean
).
local target=$2 | ||
[ -f ${target}/${pkgname}.${PKG_EXT} ] && return | ||
if [ ${SMALL_REPO} -eq 1 ]; then | ||
for dep in $(injail ${PKG_BIN} info -qd -F /packages/All/${pkgname}.${PKG_EXT}); do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have this information in some metadata files that could avoid the need to use pkg
here. Related to that I think this functionality makes sense in pkgclean
which generates the same metadata. Moving towards that method could wait until later though as I think it brings some risk/complexity to the build itself and I have some pending stuff to push around it all.
while mapfile_read_loop "all_pkgs" \ | ||
_pkgname _originspec _rdep _ignore; do | ||
if [ "${_rdep}" = "listed" ] ; then | ||
add_pkg_to_repo ${_pkgname} \ | ||
${THIN_PACKAGES}/All.new | ||
fi | ||
done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for pkgname in $(listed_pkgnames); do
add_pkg_to_repo "${pkgname}" "${THIN_PACKAGES}/All.new"
if [ -d "${THIN_PACKAGES}/All" ]; then | ||
mv ${THIN_PACKAGES}/All ${THIN_PACKAGES}/All.old | ||
fi | ||
mv ${THIN_PACKAGES}/All.new ${THIN_PACKAGES}/All | ||
if [ -d "${THIN_PACKAGES}/All" ]; then | ||
rm -rf ${THIN_PACKAGES}/All.old | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted earlier I don't think this careful swap is needed, or if it is we should use a symlink anyway.
|
||
build_thin_repo() { | ||
# Try to be as atomic as possible in recreating the new thin repo | ||
mkdir -p ${THIN_PACKAGES}/All.new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if this directory is leftover from a previous build? Should rm -rf
here probably. But as noted earlier I don't think we need to do this .new and .old thing anyhow.
-m -- minimal repository, only create a repository with the listed | ||
packages, incompatible with -a. | ||
-M -- medium repository, only create a repository with the listed | ||
packages and their runtime dependencies, incompatible with -a. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My default reaction to new flags in PRs is to suggest or ask if this could be a hook instead. For this particular feature it seems like it could be entirely a hook in pkgclean
. We could even provide this as an example hook.
bapt was mentioning its equivalency to a PPA repo, but the thin repos could also immediately solve some issues without some of the problems described later in your message. e.g., the drm kmod packages have no irrelevant runtime dependencies that could cause those kinds of problems, so we could build them in a thin repo on and for 12.2 users without much hassle. The medium repos address your concerns in the sense that they encapsulate all of the dependencies that could cause issues at runtime, and they solve brooks' problem that they just want to build and publish exactly what they want + run/lib depends for those to minimize the I/O work that pkg needs to do on some of their slower (emulators?) when searching the database or whatnot. |
As for the usage all what @kevans91 said, about could we make it a hook, yes we could, but imho there will be too many users of those feature that it deserves a proper support. Think all maintainers could provide their own packages more easily. kde-devel repo, libreoffice-devel repo etc. |
Our (CHERI) use case is very slow emulators (on par with FGPA implementations in the 10-50MHz range) with awful disk I/O. Reading a trivial pkg database can take minutes in some cases (I did an install of a clang/llvm/lld pkg and it took a couple hours). We've got access to some full-SoC FPGA emulations that are going to be even worse (though we probably won't be installing packages on them). |
3513c3a
to
6e5be7b
Compare
if the new -m argument is passed to the bulk command then a new repository with the same name as the regular one with -thin appended is created, it only contains the packages that has been listed to be built and nothing more. The repo creation is done on that new repo along with siging Note that this option is incompatible with -a
Unlike thin repo the small repos also includes the runtime dependencies and the pkg itself.
6e5be7b
to
68b3911
Compare
The release of 12.3 is coming up in about a month. I think this feature was envisioned as the solution to the incompatibility between kmod's built on 12.2, and people running 12.3. Is there a plan to finish this work? |
Stumbled upon this from #1025. This would be perfect fit for So I am highly in favor of this. |
This, unfortunately, didn't make it into 3.4.0 :-( |
it does not depends actuallt on #797 but I am too lazy to split that in to so will go in after #797 :D