Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change the curation provider API to take packages instead of ids #6387

Merged
merged 1 commit into from
Feb 8, 2023

Conversation

sschuberth
Copy link
Member

@sschuberth sschuberth commented Jan 24, 2023

Please have a look at the individual commit messages for the details.

Note
This is a breaking change in the sense that the PackageCurationProvider interface now takes a collection of Packages instead of Identifiers.

@sschuberth sschuberth changed the title Curation pkg api Change the curation provider API to take packags instead of ids Jan 24, 2023
@sschuberth sschuberth force-pushed the curation-pkg-api branch 2 times, most recently from 1f24b32 to 6171770 Compare January 24, 2023 12:00
@sschuberth sschuberth marked this pull request as ready for review January 24, 2023 13:39
@sschuberth sschuberth requested review from a team and oheger-bosch as code owners January 24, 2023 13:39
@sschuberth sschuberth force-pushed the curation-pkg-api branch 2 times, most recently from 2f3fe47 to f764b59 Compare January 26, 2023 07:04
@fviernau
Copy link
Member

fviernau commented Jan 26, 2023

Hm, prior to this change I believe in ORT (code) we made the assumption that package IDs are unique and refer to a specific package. Most importantly, if a single ID refers to multiple different packages (e.g. forks), the concept was that this had to be solved outside of ORT. That mentioned assumption is not just reflected in the package curation provider API, but almost all over the place, ORT's curation matching, curation data, package configurations, all kind of functions to do "things by ID".

Whether we move away from the previous concept IMO is a decision with large impact in terms of complexity and besides that it is not obvious at all what implications this in terms of alignment / implementation we have to do for implementing that consistently. One thing I don't like is that curations (by id) obtained from a provider would not be applicable globally anymore, but would be valid only locally in the given ORT result (in which we, as by this PR, still assume IDs to be unique). This alone IMO has huge implications on automation potential and complexity.

...therefore, I'd propose we discuss the larger topic with @oss-review-toolkit/core-devs, before getting back into the details of this PR.

@sschuberth
Copy link
Member Author

I'd propose we discuss the larger topic with @oss-review-toolkit/core-devs, before getting back into the details of this PR.

Sure, I'll put it onto the agenda. But I'd like to highlight that I believe you're over-interpreting the impact of this PR a bit: It's not about a general paradigm-shift that we must not identify packages solely by Identifier anymore. Instead, it's simply about giving curation providers access to more metadata, so that they can decide for themselves whether a curation is applicable or not. The fact whether they actually make use of more metadata than just the Identifer is an implementation detail of the curation provider.

@sschuberth
Copy link
Member Author

Meanwhile, I've split out the first two commits.

@fviernau
Copy link
Member

fviernau commented Jan 26, 2023

But I'd like to highlight that I believe you're over-interpreting

I don't think so, because I believe it introduces a mis-match in semantics / design, but sure I might miss something. Let's just discuss it.

@sschuberth
Copy link
Member Author

I don't think so, because I believe it introduces a mis-match in semantics / design, but sure I might miss something. Let's just discuss it.

But advisors operate on packages instead of ids now, too. And there you didn't complain about a "mis-match in semantics".

@sschuberth sschuberth changed the title Change the curation provider API to take packags instead of ids Change the curation provider API to take packages instead of ids Feb 6, 2023
@sschuberth sschuberth requested a review from a team February 6, 2023 19:55
mnonnenmacher
mnonnenmacher previously approved these changes Feb 7, 2023
Strictly speaking, a package id is not enough to query curations as
different servers might host different artifacts under the same id. A
typical example is an internal fork of an upstream artifact that is
internally hosted under the same id.

Currently, ClearlyDefined is the only curation provider that allows to
take this into account via its "provider" concept, see [1] and [2] for
related discussions. So, as a preparation for ORT to replace the
hard-coded providers [3] with real ones determined based on URLs, change
the curation provider API to take whole packages instead of only their
ids, so that implementations have access to artifacts and VCS URLs.

[1]: #155
[2]: package-url/purl-spec#33
[3]: https://github.com/oss-review-toolkit/ort/blob/33531c7/model/src/main/kotlin/utils/Extensions.kt#L38-L57

Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants