-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce the concept of a meta-package to PurlDB #186
Comments
NOTE: this is tracked now in #308This is actually something that recently dawned upon me as well and I have been thinking about this for quite some time. I already warned @pombredanne that I would be leaving a very long description of my thoughts, so here it is. When you look at the purlspec ( https://github.com/package-url/purl-spec ) you can see that a purl has (at least) 7 components (or actually, at least 6, as the first one is always While I think that when talking about a specific instance of a package purl is the right way to describe it, it is not how people think about packages. Let's look at an example from the purlspec:
This describes the binary RPM package from a version of Fedora for a particular architecture. This package was built in a certain way, with a certain configuration, in a certain environment, and possibly with some patches applied to the source code tree before it was built. There could also be a similar package for a version of Debian. This would NOT describe the exact same package (as it was built in a different environment, with a different configuration and possibly with different patches) but it is a related package. What relates the two packages is that they derive from the same basis, namely the So all these purls (the Fedora package, Debian package and original source code archive) are related to each other, but they are not identical. But this is not how people conceptually think about "a package". They will refer to the Fedora RPM as "curl", to the Debian deb as "curl" and to the original source code archive as "curl". This is not necessarily wrong, but also not necessarily right (as explained above). If instead there would be a meta package for "curl" then all of the purls (Fedora RPM, Debian deb, source code archive) can be seen as instances of the meta package "curl". These instances could have associated facts (for the lack of a better word) describing certain aspects of the fact which might or might not be correct ("facts" that could be extracted from the RPM metadata: location of the VCS, location of the webpage, package name, and so on). The above example is a bit simple and straightforward, so let's throw in a few more complex examples, starting with renaming packages. There are distributions that rename packages. The most straightforward example is Debian that uses lower case names for all of its packages by convention (along with some other things, like replacing hyphens with other characters). A renamed package would still be an instance of a "meta package". A slightly more radical example: in Debian the Another more difficult example would be GCC: from the GCC code base many different packages are created, which the GCC 13 page on Launchpad shows: https://launchpad.net/ubuntu/+source/gcc-13 Wrapping up: I think that the idea of a "meta package" is great, as this is how people are used to talk about code. A meta package could have several instances which are described by purls that point to specific binary packages/source code archives, which in turn have facts (metadata) associated with them. The meta package could try to consolidate these facts (along with other facts from for example Wikidata) and/or present these to the user in a certain way. |
@armijnhemel Thanks for all the very useful and detailed comments, which are much appreciated! |
I opened #308 to track the notions of similar packages derived from the same upstream. The notion of meta-package here is more about tracking a package without a specific version with its defaults and the watches. |
This may be also part of a follow up to #373 |
Suppose that you want to "watch" a package to be sure that new versions of that package get populated in your PurlDB. In order to drive such a process (and others to be determined) you need the concept of a
meta-package
, a non-instantiated package without any version, that is more like a template than an actual package, so that you can indicate that you want to watch it, and configure the watch frequency and similar things. You could also keep the meta-package updated so that it always shows the very latest attribute values of that package, although that is probably a "stretch goal".One simple way to identify a meta-package would be to use the reserved keyword
meta
in the Version of the Package URL. So you would not be able to derive a Download URL for such an entry, but you might be able to find its "home" URL, which would support any process that looks for new versions.Just a few ideas. More details needed of course! Note that any client systems of the PurlDB would need to be aware that a meta-package is a special concept; perhaps this can be supported by enhancements to the APIs.
The text was updated successfully, but these errors were encountered: