
bulk: allow prefetching binary packages from official repos #797

Merged
merged 22 commits into master from fetch-packages on May 6, 2021

Conversation

@bapt (Member) commented Nov 3, 2020

As soon as we have the entire list of packages to build, but before
the sanity check, prefetch all the binary packages we can
from the official repositories.

The sanity check that runs afterwards may still delete some of these
packages, for example because their configuration differs or for any
other reason the sanity check considers grounds for deletion.

When listing the packages to fetch, only list the ones that do not
already exist locally, to avoid pkg fetch redownloading and
overwriting packages we have already customized.
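The filtering described above can be sketched in plain sh; the paths and package names below are illustrative examples, not poudriere's actual layout:

```shell
# Sketch: list only packages whose files are not already present
# locally, so pkg fetch will not overwrite customized packages.
# PACKAGES, PKG_EXT, and the package names are made-up examples.
PACKAGES="./demo-packages"
PKG_EXT="pkg"
wanted="pkg-1.17.5 curl-7.79.1"
to_fetch=""
for pkgname in $wanted; do
    if [ ! -f "${PACKAGES}/All/${pkgname}.${PKG_EXT}" ]; then
        to_fetch="${to_fetch} ${pkgname}"
    fi
done
echo "would fetch:${to_fetch}"
```

The actual change pipes such a list into pkg fetch inside the build jail, as shown in the review excerpts later in this conversation.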

Review thread on src/bin/poudriere-bulk.8 (outdated, resolved)
@brooksdavis (Contributor) left a comment:

This looks super helpful for incremental builds on cloud-based poudriere servers.

Review thread on src/bin/poudriere-bulk.8 (outdated, resolved)
@bapt bapt mentioned this pull request Nov 4, 2020
@brooksdavis (Contributor) commented:
It would be nice if there was a way to specify a full URL rather than hardcoding pkg.freebsd.org. Supporting latest|quarterly|<URL> seems viable.
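A hypothetical sketch of such an interface, accepting latest, quarterly, or a full URL (the helper name and URL layout are assumptions, not poudriere code):

```shell
# Map a -b style argument to a repository URL. Everything here is
# illustrative; poudriere does not ship this helper.
resolve_repo_url() {
    case "$1" in
    http://*|https://*|pkg+http://*|pkg+https://*)
        printf '%s\n' "$1" ;;
    latest|quarterly)
        # ${ABI} is left for pkg to expand, as in pkg.conf repo URLs.
        printf '%s\n' "pkg+http://pkg.FreeBSD.org/\${ABI}/$1" ;;
    *)
        echo "unsupported branch: $1" >&2
        return 1 ;;
    esac
}
url=$(resolve_repo_url quarterly)
echo "$url"
```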

Review thread on src/bin/poudriere-bulk.8 (resolved)
Comment on lines +117 to +127
.It Fl b Ar name
Specify the
.Ar name
of the binary package branch to use to prefetch packages.
@bdrewery (Member) commented Nov 6, 2020:

-b doesn't make me think fetch at all. In bulk.sh, -b branch makes me think 'branch' rather than 'binary'; at a quick glance of the manpage I might think this selects which branch to build from rather than anything to do with fetching. I don't really have a better idea.
We're getting to the point of needing long options like --fetch. I had been thinking -s[eed], but -f, -F, and -S are taken already, and IIRC I had a plan to flip -S into -s.

Like Brooks' comment, I think this inevitably ends with someone asking for REPO_URL to be configurable, so this flag interface may need to think ahead about that.

Review threads on src/share/poudriere/common.sh (outdated, resolved)
Comment on lines 7397 to 7511
if [ -n "${PACKAGE_BRANCH}" ]; then
download_from_repo
fi
Comment from a Member:

  1. Need this to not run for non-build commands (like pkgclean), to not run when resuming, and to still respect -c and -C. So I think it makes sense higher up (GitHub review won't let me comment on untouched lines):

     if [ -n "${PACKAGE_BRANCH}" -a "${resuming_build}" -ne 1 ]; then
             download_from_repo
     fi
     # existing code
     if [ ${JAIL_NEEDS_CLEAN} -eq 1 ]; then
             msg_n "Cleaning all packages due to newer version of the jail..."
             rm -rf ${PACKAGES}/* ${cache_dir}

  2. Then there is the question of what -n (dry run) should do. Should it fetch so that a proper accounting of what will build is done, or should it just say "will try fetching pkgA pkgB ..."? I have no opinion here.
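The "just report" half of the dry-run question could look like the following sketch; download_from_repo is stubbed here, and DRY_RUN plus the other values are assumed for illustration:

```shell
# Stub standing in for the real fetch routine.
download_from_repo() { echo "fetching binary packages"; }

PACKAGE_BRANCH="latest"   # assumed values for illustration
resuming_build=0
DRY_RUN=1
to_fetch="pkgA pkgB"

result=""
if [ -n "${PACKAGE_BRANCH}" ] && [ "${resuming_build}" -ne 1 ]; then
    if [ "${DRY_RUN}" -eq 1 ]; then
        # dry run: only report what would be fetched
        result="will try fetching: ${to_fetch}"
    else
        result=$(download_from_repo)
    fi
fi
echo "$result"
```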

Comment on lines 2957 to 2959
[ -f ${MASTERMNT}/packages/All/${pkgname}.${PKG_EXT} ] || echo ${pkgname}
done
)| JNETNAME="n" injail xargs env -i ASSUME_ALWAYS_YES=yes pkg fetch -o /packages
Comment from a Member:

I think this is fine for now, but ultimately we may want to inspect more closely what will be downloaded, to avoid delete_old_pkg nuking the same packages every time for things like different options or flavors.

@bdrewery (Member) commented May 6, 2021:

Remaining items I am fixing before merge:

  • pkg bootstrapping: pkg is fetched, but then the build fails because the symlink packages/Latest/pkg.txz is missing.
  • bulk -n (dry run mode) still fetches.
  • bulk -C needs to fetch only dependencies, not the listed packages themselves, or do a second pass to delete the listed ports.
  • bulk -c: it is unclear what this should do. Should it just mirror the remote, or still build the listed ports?
  • testport support
  • poudriere.conf always-on support (documented)

Items for after:

  • Incremental build changes #822 is likely to be a problem here too, due to the remote and local trees being out of sync. I've seen llvm* building when it probably could have been skipped.
  • Need to compare options before fetching, or every build will likely fetch and then delete the same packages, wasting time and bandwidth. One differing option could cause thousands of unneeded packages to be fetched.
  • Ignored ports are fetched.
  • We fetch packages that have different dependencies. For example, when we use ports openssl locally and the remote does not, we still fetch the packages and then delete them.
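The "compare options before fetching" item could be as simple as comparing recorded option sets before deciding to fetch; the variables below are hand-made examples, not poudriere's actual data model:

```shell
# Skip fetching when the remote package's options differ from the
# local port's configured options. Values are illustrative only.
remote_opts="SSL=on DOCS=off"
local_opts="SSL=on DOCS=on"
if [ "${remote_opts}" = "${local_opts}" ]; then
    decision="fetch remote package"
else
    decision="skip fetch, build locally"
fi
echo "$decision"
```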

bapt and others added 12 commits May 6, 2021 12:55
This is to allow things like HTTP_PROXY to pass through.
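Since the jail invocation uses env -i (which strips the environment), a sketch of selectively forwarding proxy variables; the exact variable list is an assumption:

```shell
# Build an env -i command line that forwards only the proxy-related
# variables that are actually set. Illustrative sketch.
unset HTTPS_PROXY NO_PROXY
HTTP_PROXY="http://proxy.example.com:3128"
cmdline="env -i ASSUME_ALWAYS_YES=yes"
for var in HTTP_PROXY HTTPS_PROXY NO_PROXY; do
    eval "val=\${$var:-}"
    if [ -n "$val" ]; then
        cmdline="$cmdline $var=$val"
    fi
done
echo "$cmdline pkg fetch -o /packages"
```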
@bdrewery bdrewery merged commit 3f3db55 into master May 6, 2021
@bdrewery bdrewery deleted the fetch-packages branch May 6, 2021 23:27
@bdrewery (Member) commented:

A major pitfall is that local patches, overlays, EXTRA_PATCH_TREE, etc., are not considered. If it finds a remote package with the same options, it will use it and never build with the custom patches. IMHO this is enough to remove the feature entirely, as it is prone to cause security and POLA issues for users.

I could auto-detect EXTRA_PATCH_TREE dirs and overlay dirs, but not directly dropped files. A blacklist could be used too, but there is no concise way to tell a user the feature is dangerous with custom patches, and implementing all of these detections is more work.
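One conservative form of the auto-detection idea is to disable prefetching whenever any local-customization knob is set at all; a sketch (EXTRA_PATCH_TREE follows the name used above, but the guard itself is hypothetical):

```shell
# Disable prefetching when local patches or overlays are configured,
# since remote packages would silently lack those customizations.
EXTRA_PATCH_TREE="/usr/local/etc/poudriere.d/patches"  # example value
OVERLAYS=""
prefetch_allowed=1
if [ -n "${EXTRA_PATCH_TREE}" ] || [ -n "${OVERLAYS}" ]; then
    prefetch_allowed=0
fi
echo "prefetch_allowed=${prefetch_allowed}"
```

This trades some lost prefetching (overlays that don't touch the queued ports) for never shipping a remote package where a patched local build was expected.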

bdrewery added a commit that referenced this pull request May 13, 2021
bdrewery added a commit that referenced this pull request May 13, 2021
bdrewery added a commit that referenced this pull request May 13, 2021
For example this avoids downloading a package that uses base ssl
while we want ports ssl.

Issue #797
bdrewery added a commit that referenced this pull request May 13, 2021
evadot pushed a commit to evadot/poudriere that referenced this pull request Jun 12, 2021
evadot pushed a commit to evadot/poudriere that referenced this pull request Jun 12, 2021
evadot pushed a commit to evadot/poudriere that referenced this pull request Jun 12, 2021
For example this avoids downloading a package that uses base ssl
while we want ports ssl.

Issue freebsd#797
evadot pushed a commit to evadot/poudriere that referenced this pull request Jun 12, 2021
bdrewery added a commit that referenced this pull request Jul 13, 2021
Specifically, the TRIM_ORPHANED_BUILD_DEPS feature can come into play after
trimming IGNORED packages: some build deps are missing, but we don't
actually need them because the packages requiring them are not queued and are
already present.  Clean the queue after trimming ignored ports to handle this
case, which reduces unneeded fetches.

This also means download_from_repo() should never come across a package
that is NOT missing a package file; otherwise, why is it queued?

Issue #797
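The queue cleanup can be illustrated with a toy model where each queue entry records which port wanted it; the "pkg:needer" encoding is invented purely for this sketch:

```shell
# Toy queue: "pkg:needer", empty needer means the port was listed
# explicitly. foo is IGNORED, so its build deps become orphans.
queue="cmake:foo ninja:foo foo: bar:"
ignored="foo"
newqueue=""
for entry in $queue; do
    p=${entry%%:*}
    needer=${entry#*:}
    # drop ignored ports themselves
    case " $ignored " in *" $p "*) continue ;; esac
    # drop build deps whose only needer was ignored
    if [ -n "$needer" ]; then
        case " $ignored " in *" $needer "*) continue ;; esac
    fi
    newqueue="${newqueue:+$newqueue }$p"
done
echo "queue after trim: $newqueue"
```

After trimming, only bar remains queued, so cmake and ninja are never fetched.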
5 participants