Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

main: add new --extra-repo flag #113

Merged
merged 3 commits into from
Feb 12, 2025
Merged

Conversation

mvo5
Copy link
Collaborator

@mvo5 mvo5 commented Jan 31, 2025

This commit adds a new flag --extra-repo that can be used
to point to a repository url that is added to the base
repositories when depsolving. Note that no gpg checking
will be performed for such repos as there is no way to
add gpg-keys (yet) via this mechanism.

This means that with a repo created with e.g. createrepo_c like

$ mkdir repo
$ (cd repo && dnf download hello)
$ createrepo_c ./repo

and a blueprint like:

[[packages]]
name = "hello"

a manifest is generated that gets hello from this local repo:

$ image-builder  --extra-repo file:$(pwd)/repo manifest qcow2 --distro centos-9 --blueprint ./bp.toml |jq|grep hello
          "path": "hello-2.12.1-5.fc41.x86_64.rpm",

Note that this is part of the base repositories so anything with a
higher version number will get pulled from the extra-repo, even
system libraries or kernels. Note also that this repository does
not become part of the image so after the image build all rpms
from there are not updated (unless of course the normal repos
have higher versions of them).

Note as well that there is no safeguard right now against adding
extra repos for the wrong version of the distro, i.e. one could
add an extra repo build against/for fedora-42 on a fedora-40 image
which most likely will break with bad depsolve errors. But that
is okay, this option is meant for advanced users and testing.

@mvo5 mvo5 force-pushed the add-extra-repo-path branch from e7fd6a6 to e3f4b23 Compare January 31, 2025 12:09
@ondrejbudai
Copy link
Member

Interesting! :) I have to admit I understood the idea we discussed on Friday differently. I thought that --extra-repo-path would point directly at a directory with a valid RPM repository, so no json file would be required. Something like this:

$ ls repo/
kernel-6.12.12-200.fc41.x86_64.rpm
$ createrepo_c repo
Directory walk started
Directory walk done - 1 packages
Temporary output repo path: repo/.repodata/
Pool started (with 5 workers)
Pool finished
$ tree repo
repo
├── kernel-6.12.12-200.fc41.x86_64.rpm
└── repodata
    ├── 50e28b423ec7735de890a5595e78e00f518c4526a43c558c869b75ce435455e7-filelists.xml.zst
    ├── 83d11f4db296527882fa299c248b7c040169fab21b1721b15febf78a59c3b6e5-other.xml.zst
    ├── dce892a822489ef6b782fc65178d630ddbe8f7338fe63a20ab982e507d0a27fa-primary.xml.zst
    └── repomd.xml

2 directories, 5 files
$ sudo image-builder build --extra-repo-path ./repo ...

This surely has some limitations: e.g. you cannot specify a gpg key. However, for quick and dirty local builds this is imo not important.

We can even support more than just local paths, e.g.:

  • --extra-repo-path https://example.com/repo
  • --extra-repo-path copr://osbuild/image-builder

@mvo5 mvo5 force-pushed the add-extra-repo-path branch from e3f4b23 to 6e38e78 Compare February 5, 2025 16:32
@mvo5
Copy link
Collaborator Author

mvo5 commented Feb 5, 2025

@ondrejbudai Thanks, love your comments here. I reworked this to support local dirs initially like you suggested. I dropped the json support and added comments how this could extended and a test (slightly annoying that I had to include a binary rpm, I would love to just build it as part of the test but it seems a bit cumbersome).

@mvo5 mvo5 marked this pull request as ready for review February 5, 2025 16:35
@mvo5 mvo5 changed the title main: add new --extra-repo-path flag main: add new --extra-repo flag Feb 5, 2025
@supakeen
Copy link
Member

supakeen commented Feb 5, 2025

I like the current approach for this PR, I'll take a look tomorrow at the code of it. I'm a tiny bit worried about the variety of names for additional repositories.

Copy link
Member

@supakeen supakeen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I generally like this approach, I'm not particularly super happy about it having to go through all the list images machinery but understand the why.

To make it more clear that this is a repository that is only used during build time it might be a good plan to name this --extra-build-repo instead; it's more specific and allows adding other repository usage on the commandline if we ever desire so.

I won't approve mostly due to that; but I can probably be easily convinced, just want to start the discussion on it as we've had previous chats about API design.


As an aside and per @ondrejbudai's comment, we probably also want to cover the usecase where someone points directly at an RPM instead of at a repository in which case we'd copy the RPM somewhere temporary and call createrepo_c ourselves.

Do we want to do this as an --extra-build-package or do we want to do this with a prefix such as file+rpm:// vs file+repo://?

cmd/image-builder/repos.go Outdated Show resolved Hide resolved
@mvo5
Copy link
Collaborator Author

mvo5 commented Feb 6, 2025

Thanks for the thoughts! I like --extra-build-repo (but also do not have a super strong opinion, I think the naming is tricky here and we need to explain this in the docs/README (and --help).

I also wonder if we should set the repo priority higher compared to the regular ones (but I already heard dnf is not super great with that and maybe it's not what users expect, leaving it to the user to get the version number is is probably what most expect anyway(?) + we can always add it later (--extra-repo-priority or something)

@mvo5 mvo5 force-pushed the add-extra-repo-path branch from 6e38e78 to 06e19fc Compare February 6, 2025 09:26
@mvo5
Copy link
Collaborator Author

mvo5 commented Feb 6, 2025

So I generally like this approach, I'm not particularly super happy about it having to go through all the list images machinery but understand the why.
[..]

I tweaked this now a little bit to make it (hoefully) less awkward but it still needs to go through some machinery.

As an aside and per @ondrejbudai's comment, we probably also want to cover the usecase where someone points directly at an RPM instead of at a repository in which case we'd copy the RPM somewhere temporary and call createrepo_c ourselves.

Do we want to do this as an --extra-build-package or do we want to do this with a prefix such as file+rpm:// vs file+repo://?

Yeah, this needs some thinking and may become a bit more involved because we will need to make sure that the rpm with exactly the right version gets installed, so (probably) a combination of creating a local repo+adding the package with the exact version to the blueprint and (maybe?) increasing the priority of the repo so that in case of the same version for a local and remote package we pick the local one. But lets cross this bridge when we reach it (an issue might be nice so that we don't forget).

@supakeen
Copy link
Member

supakeen commented Feb 6, 2025

So I generally like this approach, I'm not particularly super happy about it having to go through all the list images machinery but understand the why.
[..]

I tweaked this now a little bit to make it (hoefully) less awkward but it still needs to go through some machinery.

As an aside and per @ondrejbudai's comment, we probably also want to cover the usecase where someone points directly at an RPM instead of at a repository in which case we'd copy the RPM somewhere temporary and call createrepo_c ourselves.
Do we want to do this as an --extra-build-package or do we want to do this with a prefix such as file+rpm:// vs file+repo://?

Yeah, this needs some thinking and may become a bit more involved because we will need to make sure that the rpm with exactly the right version gets installed, so (probably) a combination of creating a local repo+adding the package with the exact version to the blueprint and (maybe?) increasing the priority of the repo so that in case of the same version for a local and remote package we pick the local one. But lets cross this bridge when we reach it (an issue might be nice so that we don't forget).

There's also the option that we do a direct RPM installation instead of going through the rest of the machinery.

@supakeen
Copy link
Member

supakeen commented Feb 6, 2025

So I generally like this approach, I'm not particularly super happy about it having to go through all the list images machinery but understand the why.
[..]

I tweaked this now a little bit to make it (hoefully) less awkward but it still needs to go through some machinery.

As an aside and per @ondrejbudai's comment, we probably also want to cover the usecase where someone points directly at an RPM instead of at a repository in which case we'd copy the RPM somewhere temporary and call createrepo_c ourselves.
Do we want to do this as an --extra-build-package or do we want to do this with a prefix such as file+rpm:// vs file+repo://?

Yeah, this needs some thinking and may become a bit more involved because we will need to make sure that the rpm with exactly the right version gets installed, so (probably) a combination of creating a local repo+adding the package with the exact version to the blueprint and (maybe?) increasing the priority of the repo so that in case of the same version for a local and remote package we pick the local one. But lets cross this bridge when we reach it (an issue might be nice so that we don't forget).

There's also the option that we do a direct RPM installation instead of going through the rest of the machinery though that likely also has problems :)

@thozza
Copy link
Member

thozza commented Feb 6, 2025

@ondrejbudai Thanks, love your comments here. I reworked this to support local dirs initially like you suggested. I dropped the json support and added comments how this could extended and a test (slightly annoying that I had to include a binary rpm, I would love to just build it as part of the test but it seems a bit cumbersome).

Take a look at https://github.com/osbuild/osbuild-composer/blob/bdd2014c4467e9325b29624439c98584addc681f/test/cases/api.sh#L230-L262

@thozza
Copy link
Member

thozza commented Feb 6, 2025

There's also the option that we do a direct RPM installation instead of going through the rest of the machinery though that likely also has problems :)

Installation would fail unless all of the package dependencies are already installed on the system. That's why it makes a lot of sense to depsolve the transaction with the custom repository.

@thozza
Copy link
Member

thozza commented Feb 6, 2025

Note that this is part of the base repositories so anything with a
higher version number will get pulled from the extra-repo, even
system libraries or kernels. Note also that this repository does
not become part of the image so after the image build all rpms
from there are not updated (unless of course the normal repos
have higher versions of them).

My understanding was that the plan is to add an option to specify extra repos that can be used only for the 2nd transaction - the user requested packages, not to add extra repo to the base repo list. The reason is that this is possible by overriding the base repositories, no? This PR makes the option to taint the base repos quite prominent and easy to use, which I thought was not desirable.

Copy link
Member

@thozza thozza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit summary needs to reflect the rename of the option. I've added a few comments, but more importantly, there's a mismatch between what I expected the plan is and what this PR does. So let's discuss it...

cmd/image-builder/repos.go Outdated Show resolved Hide resolved
cmd/image-builder/repos.go Show resolved Hide resolved
@mvo5
Copy link
Collaborator Author

mvo5 commented Feb 6, 2025

Note that this is part of the base repositories so anything with a
higher version number will get pulled from the extra-repo, even
system libraries or kernels. Note also that this repository does
not become part of the image so after the image build all rpms
from there are not updated (unless of course the normal repos
have higher versions of them).

My understanding was that the plan is to add an option to specify extra repos that can be used only for the 2nd transaction - the user requested packages, not to add extra repo to the base repo list. The reason is that this is possible by overriding the base repositories, no? This PR makes the option to taint the base repos quite prominent and easy to use, which I thought was not desirable.

I may have misunderstood, maybe worth having a sync again with the involved parties to make sure we are all on the same page. One goal of this feature is to allow replacing e.g. the kernel or other system packages which (AIUI) requires that the extra-repo is added to the base repositories. This option here is (again AIUI) analogous to creating a /etc/image-builder/repositories/my-distro-version.json with the extra local repo added, i.e. when overriding the base repos via the /etc directory the behavior is the same(?).

@achilleas-k
Copy link
Member

achilleas-k commented Feb 6, 2025

Finally getting around to reading this PR and the original PR comment is confusing, probably from the edits (mixing hello and osbuild as package name examples, setting up a local repo but mentioning copr).

Unless we're expecting more changes, can you update the top comment to reflect what the PR is doing?

EDIT: It's also the commit message, so definitely update the description before finalising this PR.

Copy link
Member

@achilleas-k achilleas-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good generally, for what it does, but I think we don't all have a clear idea of which way we want to go with this. Some of the ideas might require more work, so maybe let's figure out what we want (now and in the future). I think it's worth going through all ideas and figuring out if we want to (eventually) do all of them, then maybe we can start from the simplest and have a plan for the rest.

From what I'm seeing, we have at least the following to consider:

  • copr repos
  • local directory with repo metadata
  • local rpm file or directory of rpm files

I'll also throw in an extra "regular base URL or metalink without config" option (not my idea, I think it was mentioned, but I'm not sure).

I think we can figure out a nice way of representing all of these under the same option with different prefixes, that can also be specified multiple times:

ib build \
     --extra-repo=copr:@osbuild/osbuild \
     --extra-repo=baseurl:http://download.example/pub/fedora/linux/releases/42/Everything/aarch64/os/ \
     --extra-repo=metalink:https://mirrors.fedoraproject.org/metalink?repo=fedora-42&arch=aarch64 \
     --extra-repo=localrepo:/path/to/repo/created/using/createrepo_c/ \
     --extra-repo=localpath:/path/to/directory/with/rpms/ \
     ...

@mvo5 mvo5 force-pushed the add-extra-repo-path branch from 06e19fc to f591f99 Compare February 6, 2025 20:45
@thozza
Copy link
Member

thozza commented Feb 7, 2025

From what I'm seeing, we have at least the following to consider:

  • copr repos
  • local directory with repo metadata
  • local rpm file or directory of rpm files

I'll also throw in an extra "regular base URL or metalink without config" option (not my idea, I think it was mentioned, but I'm not sure).

I think we can figure out a nice way of representing all of these under the same option with different prefixes, that can also be specified multiple times:

ib build \
     --extra-repo=copr:@osbuild/osbuild \
     --extra-repo=baseurl:http://download.example/pub/fedora/linux/releases/42/Everything/aarch64/os/ \
     --extra-repo=metalink:https://mirrors.fedoraproject.org/metalink?repo=fedora-42&arch=aarch64 \
     --extra-repo=localrepo:/path/to/repo/created/using/createrepo_c/ \
     --extra-repo=localpath:/path/to/directory/with/rpms/ \
     ...

It depends on how easy we want to make certain use cases. Because a COPR repository is at the end of the day just a regular YUM/DNF repository with a baseurl (https://copr.fedorainfracloud.org/coprs/g/osbuild/image-builder-cli/repo/centos-stream-10/group_osbuild-image-builder-cli-centos-stream-10.repo).

Local directory with a repodata created using createrepo_c can be also pointed to using baseurl:file///<path>.

I'm not saying that we shouldn't make these use cases easier if it make sense, just pointing out that not doing so does not mean that one can't use them.

Pointing to an existing YUM / DNF .repo file would also be worth considering...

@supakeen
Copy link
Member

supakeen commented Feb 7, 2025

Let's remember the initial use case for which we started thinking about this: installing custom kernels, u-boot, etc (from COPR, or from local source) to build an image to test new patches.

As far as I know this must happen in the first transaction @thozza?

@thozza
Copy link
Member

thozza commented Feb 7, 2025

Let's remember the initial use case for which we started thinking about this: installing custom kernels, u-boot, etc (from COPR, or from local source) to build an image to test new patches.

As far as I know this must happen in the first transaction @thozza?

Yes, but my impression was that the ability to override repos with --datadir was sufficient. If --extra-repo is to be used for the first transaction, I'd suggest naming it differently. And also how we want to name the currently prominent use case of defining the extra repository for the user packages.

@supakeen
Copy link
Member

supakeen commented Feb 7, 2025

Let's remember the initial use case for which we started thinking about this: installing custom kernels, u-boot, etc (from COPR, or from local source) to build an image to test new patches.
As far as I know this must happen in the first transaction @thozza?

Yes, but my impression was that the ability to override repos with --datadir was sufficient. If --extra-repo is to be used for the first transaction, I'd suggest naming it differently. And also how we want to name the currently prominent use case of defining the extra repository for the user packages.

How do you feel about the --extra-build-repo name?

@thozza
Copy link
Member

thozza commented Feb 7, 2025

How do you feel about the --extra-build-repo name?

Not great 😇 as it sounds like an extra repo for the build pipeline...

Maybe let's have a short brainstorming session to determine what kind of repo use cases we have and deduce the options and their names from them?

supakeen
supakeen previously approved these changes Feb 11, 2025
Copy link
Member

@supakeen supakeen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with the current approach in this PR. Some tidbits can be followups and mostly documentation, thank you :)

mvo5 added a commit to mvo5/images that referenced this pull request Feb 11, 2025
This commit re-export the `LoadAllRepositories` helper. It was
unexported in osbuild#1179 but
that was a bit premature as it is still required for advanced
use-cases like osbuild/image-builder-cli#113
thozza
thozza previously approved these changes Feb 11, 2025
supakeen
supakeen previously approved these changes Feb 11, 2025
cmd/image-builder/repos.go Outdated Show resolved Hide resolved
@thozza
Copy link
Member

thozza commented Feb 12, 2025

I know we had a lengthy discussion about this PR yesterday, but I failed to realize (or forgot) that the option's argument will be just a single URL or path. This is quite inflexible WRT other repo settings, such as GPG check, GPG key, any authentication certs, etc.

Would it make sense to accept the repo.json config as the argument containing all these settings? Alternatively, name the current implementation, e.g., as --extra-repo-url? 😇

I am asking now since we probably don't want to change the meaning of the option afterward...

@mvo5
Copy link
Collaborator Author

mvo5 commented Feb 12, 2025

I know we had a lengthy discussion about this PR yesterday, but I failed to realize (or forgot) that the option's argument will be just a single URL or path. This is quite inflexible WRT other repo settings, such as GPG check, GPG key, any authentication certs, etc.

Would it make sense to accept the repo.json config as the argument containing all these settings? Alternatively, name the current implementation, e.g., as --extra-repo-url? 😇

Heh, this is actually how this all started, the initial implementation required a json file. Fwiw, I'm happy with --extra-repo-uri (or -url idc`

I am asking now since we probably don't want to change the meaning of the option afterward...

+100 - lets do this right.

@thozza
Copy link
Member

thozza commented Feb 12, 2025

Heh, this is actually how this all started, the initial implementation required a json file. Fwiw, I'm happy with --extra-repo-uri (or -url idc`

--extra-repo-uri would probably be a better name since this would make it natural to support other identifiers than just URLs.

@mvo5
Copy link
Collaborator Author

mvo5 commented Feb 12, 2025

Heh, this is actually how this all started, the initial implementation required a json file. Fwiw, I'm happy with --extra-repo-uri (or -url idc`

--extra-repo-uri would probably be a better name since this would make it natural to support other identifiers than just URLs.

I'm making this change now (and also did the change to build a test rpm instead of doing this binary copy from the first commit).

However (playing devils advocate) we /could/ also support --extra-repo ./path/to/repo and --extra-repo /path/to/rich-repos.json on the same option - it might be a bit confusing. OTOH if we add later --extra-repo in addition to --extra-repo-uri people will also (at least initially) wonder why we have two very similar options.

[edit: diff but I'm a bit on the fence if one (overloaded) option or two (very similar) options is better]

@thozza
Copy link
Member

thozza commented Feb 12, 2025

I'm making this change now (and also did the change to build a test rpm instead of doing this binary copy from the first commit).

However (playing devils advocate) we /could/ also support --extra-repo ./path/to/repo and --extra-repo /path/to/rich-repos.json on the same option - it might be a bit confusing. OTOH if we add later --extra-repo in addition to --extra-repo-uri people will also (at least initially) wonder why we have two very similar options.

[edit: diff but I'm a bit on the fence if one (overloaded) option or two (very similar) options is better]

Yeah, I'm also not 100% decided. The question I'd ask if we then want to allow specifying other properties (e.g. GPG check and GPG key) for repos specified only by the URI and how that would look like? Would it be an additional option?

But I can see that we could use something like with the filter specifies and defaulting to URI used as baseurl if no specifier is provided:

--extra-repo <URI-as-baseurl>
--extra-repo config:/path/to/config.json
...

basically building on top of #113 (review)

ib build \
     --extra-repo=copr:@osbuild/osbuild \
     --extra-repo=baseurl:http://download.example/pub/fedora/linux/releases/42/Everything/aarch64/os/ \
     --extra-repo=metalink:https://mirrors.fedoraproject.org/metalink?repo=fedora-42&arch=aarch64 \
     --extra-repo=localrepo:/path/to/repo/created/using/createrepo_c/ \
     --extra-repo=localpath:/path/to/directory/with/rpms/ \
     ...

And if the user needs to specify GPG keys and additional properties, they need to provide the config JSON... 🤷

@mvo5 mvo5 force-pushed the add-extra-repo-path branch 2 times, most recently from aa0efb6 to e44ea86 Compare February 12, 2025 11:25
thozza
thozza previously approved these changes Feb 12, 2025
Copy link
Member

@thozza thozza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks 🚀

mvo5 added 3 commits February 12, 2025 12:57
This commit adds a  new flag `--extra-repo` that can be used
to point to a repository url that is added to the base
repositories when depsolving. Note that *no* gpg checking
will be performed for such repos as there is no way to
add gpg-keys (yet) via this mechanism.

This means that with a repo created with e.g. `createrepo_c` like
```console
$ mkdir repo
$ (cd repo && dnf download hello)
$ createrepo_c ./repo
```
and a blueprint like:
```toml
[[packages]]
name = "hello"
```
a manifest is generated that gets hello from this local repo:
```console
$ image-builder  --extra-repo file:$(pwd)/repo manifest qcow2 --distro centos-9 --blueprint ./bp.toml |jq|grep hello
          "path": "hello-2.12.1-5.fc41.x86_64.rpm",
```
Note that this is part of the base repositories so anything with a
higher version number will get pulled from the extra-repo, even
system libraries or kernels. Note also that this repository does
not become part of the image so after the image build all rpms
from there are not updated (unless of course the normal repos
have higher versions of them).

Note as well that there is no safeguard right now against adding
extra repos for the wrong version of the distro, i.e. one could
add an extra repo build against/for fedora-42 on a fedora-40 image
which most likely will break with bad depsolve errors. But that
is okay, this option is meant for advanced users and testing.
This commit reworks the `newRepoRegistry` func so that its easier
to see that it is a variable so that it can be overriden by the
tests. In the tests we want to use the `testrepos` we get from
images and in the real implementation we want to use the full
repo loader with search-paths and extra repos.
@supakeen supakeen added this pull request to the merge queue Feb 12, 2025
Merged via the queue into osbuild:main with commit e2aeece Feb 12, 2025
30 of 32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants