Track relationships between packages derived from the same upstream #308

pombredanne · 2024-02-26T17:52:37Z

This breaks out part of the discussion from #186 (comment)
@armijnhemel wrote:

This is actually something that recently dawned upon me as well and I have been thinking about this for quite some time. I already warned @pombredanne that I would be leaving a very long description of my thoughts, so here it is.

When you look at the purl spec ( https://github.com/package-url/purl-spec ) you can see that a purl has up to 7 components: scheme, type, namespace, name, version, qualifiers and subpath (or effectively 6, as the first one, the scheme, is always pkg). The second component, the type, hints at the format of the package, such as rpm, deb, and so on.

While I think that when talking about a specific instance of a package purl is the right way to describe it, it is not how people think about packages. Let's look at an example from the purlspec:
pkg:rpm/fedora/curl@7.50.3-1.fc25?arch=i386&distro=fedora-25
This describes the binary RPM package from a version of Fedora for a particular architecture. This package was built in a certain way, with a certain configuration, in a certain environment, and possibly with some patches applied to the source code tree before it was built. There could also be a similar package for a version of Debian. This would NOT describe the exact same package (as it was built in a different environment, with a different configuration and possibly with different patches) but it is a related package. What relates the two packages is that they derive from the same basis, namely the curl source code archive, which can also be described using a purl.
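To make the components of that example concrete, here is a minimal sketch that splits a purl into its parts using only the Python standard library. It is a simplification and not the full parsing algorithm from the purl spec (it skips type-specific normalization, for instance); the function name `parse_purl` is made up for this illustration. In practice a maintained parser such as the packageurl-python library is probably the better choice.

```python
from urllib.parse import parse_qsl, unquote


def parse_purl(purl: str) -> dict:
    """Split a purl string into its seven components (simplified sketch)."""
    scheme, sep, rest = purl.partition(":")
    if not sep or scheme != "pkg":
        raise ValueError(f"not a purl: {purl!r}")

    # Peel off the optional trailing parts first: '#subpath', then '?qualifiers'.
    rest, _, subpath = rest.partition("#")
    rest, _, query = rest.partition("?")

    # The version follows the last '@'; a literal '@' elsewhere is percent-encoded.
    path, at, version = rest.rpartition("@")
    if not at:
        path, version = rest, ""

    segments = [unquote(s) for s in path.strip("/").split("/") if s]
    if len(segments) < 2:
        raise ValueError(f"purl needs at least a type and a name: {purl!r}")

    return {
        "scheme": scheme,
        "type": segments[0].lower(),
        "namespace": "/".join(segments[1:-1]),
        "name": segments[-1],
        "version": unquote(version),
        "qualifiers": dict(parse_qsl(query)),
        "subpath": unquote(subpath),
    }


parts = parse_purl("pkg:rpm/fedora/curl@7.50.3-1.fc25?arch=i386&distro=fedora-25")
print(parts["type"], parts["namespace"], parts["name"], parts["version"])
```

Everything except the name ("curl") is specific to this one build: the type, the namespace, the version string and the qualifiers all differ for the Debian package or the upstream archive.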

So all these purls (the Fedora package, Debian package and original source code archive) are related to each other, but they are not identical. But this is not how people conceptually think about "a package". They will refer to the Fedora RPM as "curl", to the Debian deb as "curl" and to the original source code archive as "curl". This is not necessarily wrong, but also not necessarily right (as explained above).

If instead there were a meta package for "curl", then all of the purls (Fedora RPM, Debian deb, source code archive) could be seen as instances of the meta package "curl". These instances could have associated facts (for lack of a better word) describing certain aspects of the instance, which might or might not be correct (for example "facts" that could be extracted from the RPM metadata: location of the VCS, location of the webpage, package name, and so on).
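As a rough sketch of what such a structure could look like: a meta package holds a list of instances, each identified by a purl and carrying its own facts. The class names (`MetaPackage`, `PackageInstance`), the `pkg:generic` purl for the source archive and the fact values are all assumptions made for illustration; nothing here reflects an actual PurlDB model.

```python
from dataclasses import dataclass, field


@dataclass
class PackageInstance:
    """One concrete package (binary or source), identified by its purl."""
    purl: str
    # Facts are unverified claims taken from this instance's own metadata.
    facts: dict[str, str] = field(default_factory=dict)


@dataclass
class MetaPackage:
    """The thing people mean when they say "curl"."""
    name: str
    instances: list[PackageInstance] = field(default_factory=list)

    def add(self, purl: str, **facts: str) -> PackageInstance:
        instance = PackageInstance(purl, dict(facts))
        self.instances.append(instance)
        return instance

    def purls(self) -> list[str]:
        return [instance.purl for instance in self.instances]


curl = MetaPackage("curl")
# Illustrative fact values, not extracted from real package metadata.
curl.add(
    "pkg:rpm/fedora/curl@7.50.3-1.fc25?arch=i386&distro=fedora-25",
    homepage="https://curl.haxx.se/",
)
curl.add("pkg:deb/debian/curl@7.50.3-1?arch=i386&distro=jessie")
# Hypothetical purl for the upstream source archive.
curl.add("pkg:generic/curl@7.50.3")
```

Keeping the facts on the instance rather than on the meta package preserves where each claim came from, which matters because the claims may disagree or be wrong.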

The above example is fairly simple and straightforward, so let's throw in a few more complex examples, starting with renamed packages. There are distributions that rename packages. The most straightforward example is Debian, which by convention uses lower case names for all of its packages (along with some other things, like replacing hyphens with other characters). A renamed package would still be an instance of a "meta package".

A slightly more radical example: in Debian the httpd package was renamed to apache2, while Fedora uses httpd. Both are packages derived from the Apache httpd source code and thus are related and should not be seen as completely different packages. Instead, there could be an "Apache httpd" meta package that has both the Fedora and Debian packages as instances.
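One simple way to handle renames like this is an explicit alias table that maps a distribution-specific (type, namespace, name) triple to the meta package it belongs to. This is only a sketch: the table `ALIASES` and the function `meta_package_for` are hypothetical, and a real system would have to populate such a table from curated or mined data.

```python
from typing import Optional

# Hypothetical alias table: (purl type, namespace, name) -> meta package name.
ALIASES = {
    ("deb", "debian", "apache2"): "Apache httpd",
    ("rpm", "fedora", "httpd"): "Apache httpd",
}


def meta_package_for(purl_type: str, namespace: str, name: str) -> Optional[str]:
    """Return the meta package for a distro package, or None if unknown."""
    key = (purl_type.lower(), namespace.lower(), name.lower())
    return ALIASES.get(key)


debian = meta_package_for("deb", "debian", "apache2")
fedora = meta_package_for("rpm", "fedora", "httpd")
print(debian == fedora)
```

The lookup is keyed on the full triple rather than the bare name because the same name can mean different things in different distributions.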

Another more difficult example would be GCC: from the GCC code base many different packages are created, which the GCC 13 page on Launchpad shows: https://launchpad.net/ubuntu/+source/gcc-13
These are very obviously not the same packages, but they were generated from the same source code, or subsets of the same source code, so they are related. Add to that all the different versions of GCC, and the different configurations they were built in (cross compilers, etc.) and you can see that it can get quite complex. Yet they are all still related.
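The one-source-to-many-binaries case can be sketched as a list of (binary, source) pairs that is grouped by source. The pairs below are a small illustrative sample written as versionless purls, not a complete or verified list of what the gcc-13 source package builds; `BUILT_FROM`, `group_by_source` and `related` are names invented for this example.

```python
from collections import defaultdict

# Illustrative (binary purl, source purl) pairs; a real list would be much longer.
BUILT_FROM = [
    ("pkg:deb/ubuntu/gcc-13", "pkg:deb/ubuntu/gcc-13?arch=source"),
    ("pkg:deb/ubuntu/g%2B%2B-13", "pkg:deb/ubuntu/gcc-13?arch=source"),
    ("pkg:deb/ubuntu/libstdc%2B%2B6", "pkg:deb/ubuntu/gcc-13?arch=source"),
    ("pkg:deb/ubuntu/curl", "pkg:deb/ubuntu/curl?arch=source"),
]


def group_by_source(pairs):
    """Map each source purl to the set of binary purls built from it."""
    groups = defaultdict(set)
    for binary, source in pairs:
        groups[source].add(binary)
    return dict(groups)


def related(a, b, pairs):
    """Two binaries are related if they were built from the same source."""
    source_of = dict(pairs)
    return a in source_of and source_of[a] == source_of.get(b)


groups = group_by_source(BUILT_FROM)
print(len(groups["pkg:deb/ubuntu/gcc-13?arch=source"]))
```

Different versions and build configurations would add further dimensions to this grouping; the sketch only covers the binary-to-source link.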

Wrapping up: I think that the idea of a "meta package" is great, as this is how people are used to talking about code. A meta package could have several instances, described by purls that point to specific binary packages or source code archives, which in turn have facts (metadata) associated with them. The meta package could try to consolidate these facts (along with other facts from, for example, Wikidata) and/or present these to the user in a certain way.
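One possible reading of "consolidate these facts" is to count, per fact key, how many instances report each value, so that agreement and conflicts both stay visible. The sketch below does only that; the function name `consolidate` and the sample fact values are assumptions, and weighting sources or pulling in Wikidata is left out.

```python
from collections import Counter, defaultdict


def consolidate(instances):
    """Tally fact values across instances.

    `instances` maps a purl to that instance's facts. The result maps each
    fact key to a list of (value, count) pairs, most common value first.
    """
    votes = defaultdict(Counter)
    for facts in instances.values():
        for key, value in facts.items():
            votes[key][value] += 1
    return {key: counter.most_common() for key, counter in votes.items()}


# Illustrative fact values, not extracted from real package metadata.
summary = consolidate({
    "pkg:rpm/fedora/curl@7.50.3-1.fc25?arch=i386&distro=fedora-25": {
        "homepage": "https://curl.se/",
    },
    "pkg:deb/debian/curl@7.50.3-1?arch=i386&distro=jessie": {
        "homepage": "https://curl.se/",
        "vcs": "https://github.com/curl/curl",
    },
    "pkg:generic/curl@7.50.3": {
        "homepage": "https://curl.haxx.se/",
    },
})
print(summary["homepage"])
```

Returning every value with its count, instead of picking a single winner, leaves the decision about which fact to trust to the caller or the user.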


pombredanne · 2024-02-26T17:56:57Z

This is quite related to Repology's metapackage ... for instance https://repology.org/project/firefox/versions

mjherzog · 2024-02-26T19:18:15Z

Ubuntu has an implementation of MetaPackages - https://help.ubuntu.com/community/MetaPackages

armijnhemel · 2024-05-08T11:40:26Z

I guess that #373 is related.

pombredanne · 2024-05-08T14:07:30Z

@armijnhemel re:

I guess that #373 is related.

It was related... but that part has been moved to:

Enable calling d2d when collecting/indexing a package or on demand via the API #419

And the package set feature is closely related:

Package set #141 and Introduce notion of Package set #95

pombredanne mentioned this issue Feb 26, 2024

Introduce the concept of a meta-package to PurlDB #186

Open

armijnhemel mentioned this issue Feb 28, 2024

Get metadata for and scan debian packages from Purls #300

Merged

AyanSinhaMahapatra mentioned this issue Mar 15, 2024

Add debian ".udeb" support #345

Open
