@alilleybrinker alilleybrinker commented May 7, 2025

This adds an RFD describing a design for adding support for Package URLs to the CVE Record Format, including a deep dive on the thinking behind the design.

This RFD previously covered both Package URLs and OmniBOR Artifact IDs. After feedback from the CVE Board, that has been reduced in scope to only cover Package URLs. Another PR will be opened to cover the OmniBOR proposal.


This adds an RFD describing a design for adding support for more
software identifier types to the CVE Record Format, including
a deep dive on the thinking behind the design.

Signed-off-by: Andrew Lilley Brinker <abrinker@mitre.org>
@alilleybrinker alilleybrinker force-pushed the alilleybrinker/software-id-rfd branch from d61f397 to c1b0d8e Compare May 7, 2025 20:28
@alilleybrinker
Author

alilleybrinker commented May 7, 2025

First comment reserved for open questions (will be edited)

  • Whether to vendor the purl specification.
    • Current sense of the group: yes, vendor the spec
  • Whether to vendor the OmniBOR specification.
    • Current sense of the group: yes, vendor the spec
  • Whether to permit the "generic" and "swid" types for purl.
    • Current sense of the group: disallow the "generic" and "swid" types
  • Whether to limit the number of software identifiers in a single CVE record.
    • Current sense of the group: do not limit the number, rely on overall CVE record size limit
  • Whether to permit versions directly in a purl. They are currently allowed in a CPE, but by convention if provided in a CPE then version constraint fields should not be used.
    • Current sense of the group: match existing CPE behavior
  • Whether to use the vers spec for specifying version constraints instead of reusing the cpeMatch version fields.
    • Current sense of the group: do not use the vers spec
  • Whether to add more software identifiers to the affected array rather than modifying the cpeApplicability object.
    • Current sense of the group: use the affected array

Issues considered resolved are marked with a checkmark. If you believe an issue is not resolved, please raise it in a comment below.

@alilleybrinker alilleybrinker changed the title feat: Add RFD for Software Identifiers RFD for Software Identifiers May 7, 2025
alilleybrinker added a commit to alilleybrinker/cve-schema that referenced this pull request May 9, 2025
The `affected` array is an array containing `product` objects, which
must at minimum include an "identifier" (which may be a composite
identifier composed of multiple fields) along with a set of version
bounds or a default status. Products may also specify an assortment
of additional fields which further constrain the applicability of the
CVE to its intended target hardware or software.

Previously, the set of identifiers available were:

- A `vendor` and `product`
- A `collectionURL` and `packageName`

This commit adds support for a new identifier, called `packageURL`,
which uses the purl (Package URL) specification. The contents of the
commit add this as a new field on the `product` type, with a description
and examples, and also update the data constraints on the `product`
type, both to make `packageURL` an option to fulfill the identifier
requirement already in place on the type, and to ensure that the new
`packageURL` field is not mixed with the existing `collectionURL` or
`packageName` fields, as they are redundant with `packageURL` and
including both increases the possibility of data inconsistency within
a single CVE record.
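
As a rough illustration of the constraints described above (this is not the JSON Schema text from the commit), the rule can be sketched in Python. The function name `check_identifier_fields` and the message strings are hypothetical; the field names are the ones listed in this commit message.

```python
def check_identifier_fields(product: dict) -> list[str]:
    """Return a list of problems with a product object's identifier fields.

    Sketch only: mirrors the rule described in the commit message, namely
    that a product needs at least one identifier and that `packageURL`
    must not be mixed with `collectionURL` or `packageName`.
    """
    problems = []

    has_vendor_product = "vendor" in product and "product" in product
    has_collection_pair = "collectionURL" in product and "packageName" in product
    has_purl = "packageURL" in product

    # At least one of the three identifier forms must be present.
    if not (has_vendor_product or has_collection_pair or has_purl):
        problems.append("no identifier present")

    # packageURL is redundant with collectionURL/packageName, so reject mixes.
    if has_purl and ("collectionURL" in product or "packageName" in product):
        problems.append("packageURL mixed with collectionURL or packageName")

    return problems
```

An empty list means the product object passes this particular check; version bounds and default status are deliberately left out of the sketch.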

This inclusion of a new `packageURL` type which can be used instead of
the existing pair of `collectionURL` and `packageName` would require
consumers of CVE records to update their logic both to accept the new
field, and to use it in places where they may today use the pair of
`collectionURL` and `packageName`.

This commit does not include a regular expression to parse Package URLs
specifically. Rather, it reuses the existing `uriType` schema. So we
can be sure after validating CVE records against this updated record
format that the `packageURL` field is a URL, but not that it is a valid
Package URL per the Package URL specification. It would be the
responsibility of CVE Services to further validate the field to ensure
values match the Package URL specification. We do not perform this
validation in-schema due to the complexity of expressing the validation
in the form of a regular expression.
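
To make the gap between "is a URL" and "is a Package URL" concrete, here is a minimal sketch of the kind of extra check a service might run after schema validation. It is an assumption-laden approximation, not the Package URL specification's parsing algorithm: the function name `looks_like_purl` is hypothetical, the type pattern only approximates the spec's rule, and the rejection of the `generic` and `swid` types reflects the "current sense of the group" in the open-questions comment rather than anything in this commit.

```python
import re

# Approximation of the purl "type" rule (assumption; consult the spec).
_TYPE_RE = re.compile(r"^[A-Za-z][A-Za-z0-9.+-]*$")

# Types the group's open-questions list leans toward disallowing.
DISALLOWED_TYPES = {"generic", "swid"}


def looks_like_purl(value: str) -> bool:
    """Cheap structural check: pkg:<type>/[<namespace>/]<name>[@version]..."""
    scheme, sep, rest = value.partition(":")
    if not sep or scheme.lower() != "pkg":
        return False

    # Tolerate "pkg://" and drop the optional subpath and qualifiers.
    rest = rest.lstrip("/")
    rest = rest.split("#", 1)[0].split("?", 1)[0]

    purl_type, sep, remainder = rest.partition("/")
    if not sep or not _TYPE_RE.match(purl_type):
        return False
    if purl_type.lower() in DISALLOWED_TYPES:
        return False

    # Drop the optional version, then require a non-empty final name segment.
    path = remainder.rsplit("@", 1)[0] if "@" in remainder else remainder
    name = path.rstrip("/").rsplit("/", 1)[-1]
    return bool(name)
```

A value such as `https://example.com/lodash` would pass a plain URI check but fail here, which is the case the commit message leaves to CVE Services.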

This work is submitted as an alternative formulation of the design
proposed in the draft RFD on software identifiers [1], and as an
alternative to the existing proposals for making the `cpeApplicability`
structure generic [2] (instead of it being CPE-specific) and enhancing
this new generic applicability structure with support for Package
URLs [3].

If this change is accepted, then [2] and [3] should not be accepted.

[1]: CVEProject#407
[2]: CVEProject#391
[3]: CVEProject#397

Signed-off-by: Andrew Lilley Brinker <abrinker@mitre.org>
alilleybrinker added a commit to alilleybrinker/cve-schema that referenced this pull request May 9, 2025
The `affected` array is an array containing `product` objects, which
must at minimum include an "identifier" (which may be a composite
identifier composed of multiple fields) along with a set of version
bounds or a default status. Products may also specify an assortment
of additional fields which further constrain the applicability of the
CVE to its intended target hardware or software.

Previously, the set of identifiers available were:

- A `vendor` and `product`
- A `collectionURL` and `packageName`

This commit adds support for a new pair of fields to support
using OmniBOR Artifact IDs as identifiers in the `affected` array:

- `artifactID`: The OmniBOR Artifact ID for an artifact.
- `artifactType`: An enum indicating whether the `artifactID` is for
  an artifact to search in a file system for, or whether it's a
  build input to search against OmniBOR Input Manifests.

The commit also adds data constraints to ensure this new identifier
pair is not used alongside fields that don't make sense to use with
OmniBOR, including the other identifier schemes, further decomposition
information like `programFiles` or `programRoutines`, and version
information.

This work is submitted as an alternative formulation of the design
proposed in the draft RFD on software identifiers [1], and as an
alternative to the existing proposals for making the `cpeApplicability`
structure generic [2] (instead of it being CPE-specific) and enhancing
this new generic applicability structure with support for OmniBOR
Artifact IDs [3].

If this change is accepted, then [2] and [3] should not be accepted.

[1]: CVEProject#407
[2]: CVEProject#391
[3]: CVEProject#396

Signed-off-by: Andrew Lilley Brinker <abrinker@mitre.org>
@alilleybrinker
Author

I've now opened PRs reflecting an alternate design to the one proposed in this RFD, #409 and #410. If the QWG's consensus is to advance with those designs, I will close this RFD PR and open a new PR with an alternate RFD describing those designs in detail.

@darakian

darakian commented May 29, 2025

Having both of these approaches open may not have been the best idea (for me anyway 😄 ). I've added my comment about how to encode version ranges in the other PR though I suspect that would also be relevant here as well.

I will also be totally honest that I have not dug into OmniBOR yet and I've been stuck in a loop where a QWG meeting happens and OmniBOR is mentioned, I think "Oh shoot, I forgot to read up on that," I take a quick look at it and get intimidated by the spec, put it down and swear that I'll get back to it tomorrow, the next day, the day after that, etc., and then there's another QWG. I won't be read up on OmniBOR by tomorrow's QWG, but please yell at me if I'm not by next week's meeting 👍

So, with that personal failing out of the way I'll continue to be self centered and pose the question; would it make sense to drop both purl and omnibor from this RFD and discuss the idea of how we should consider software identifiers more generally? I've been waffling on the idea of proposing a set of what I'll call the identifiers that github uses which are similar to a subset of purls or not. I lean more toward not at this point, but I do think we should consider the generic question of how do we assess a new identifier should Omnibor2 or cpeButBetterThisTime, or whatever comes along. If nothing else this vetting process could be used as a way to provide feedback upstream. Success metric could also be easier to consider if the RFD is broken up as well. That said, I'm happy to hear pushback on that too as there really are not many software identifiers worth considering today and it may just be easier to have the full conversation all at once.


With respect to this RFD as is; I am onboard with the general direction of it. I think I understand the synonym problem to be one that is more operational than schema design and I would suggest that allowing for more methods to capture affected product information may enable/embolden more CNAs to publish affected product information period. In my opinion an indicator of success could be a rise in the proportion of CVEs published with valid affected product information populated. This could also be broken out per id (and for the structure itself) to indicate success for each.
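
A rough sketch of how that proportion could be computed, broken out per identifier type. The function name `identifier_uptake` and the labels are hypothetical, and the check only tests for the presence of fields in each record's CNA `affected` array; it does not judge whether the information is valid.

```python
from collections import Counter

# Identifier forms to look for in each product object (labels are illustrative).
IDENTIFIER_FIELDS = {
    "vendor/product": ("vendor", "product"),
    "collectionURL/packageName": ("collectionURL", "packageName"),
    "packageURL": ("packageURL",),
}


def identifier_uptake(records: list[dict]) -> dict[str, float]:
    """Share of records carrying each identifier form, plus an 'any' share."""
    if not records:
        return {}

    counts = Counter()
    for record in records:
        affected = record.get("containers", {}).get("cna", {}).get("affected", [])
        seen = set()
        for product in affected:
            for label, fields in IDENTIFIER_FIELDS.items():
                if all(field in product for field in fields):
                    seen.add(label)
        if seen:
            counts["any"] += 1
        # Count each form at most once per record.
        counts.update(seen)

    total = len(records)
    return {label: counts[label] / total for label in ["any", *IDENTIFIER_FIELDS]}
```

Running this over snapshots taken before and after adoption would give the trend described above; a stricter notion of "valid" would need its own checks.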

On the Related Issues or Proposals section; this could be a case of yes and. CPE has its issues today and is currently bottlenecked with NIST as a centralized naming authority. I see no reason why CPE supporters couldn't continue development of CPE and federate the namespace based on a per-vendor basis or whatever and that be compatible with the adoption of purls/omnibors. I suspect that orgs will pick whichever namespace fits them and their needs best and IMO the spec should be equipped to accept whatever high quality identifiers can be produced.

@alilleybrinker
Author

@darakian, for reading up on OmniBOR, the project website has a more accessible introduction to how the identifiers work: https://omnibor.io/docs/artifact-ids/

As for the question of whether to split this RFD into parts: one advancing a general set of provisions for how new software IDs should be incorporated, and then others advancing specific software IDs to incorporate. My team is open to doing that, but didn't as the initial ask because we were concerned it would be too granular for the QWG. In particular, we felt that offering concrete examples with real-world identifiers helps crystallize understanding of the trade-offs of any particular design in a way that a purely abstract proposal can't.

For the success metric question, I agree there's value in assessing adoption via a statistical analysis of uptake of the new fields or general enrichment of identity information in CVE records, though we didn't choose that as the go-to metric because we wanted to leave room to also consider adoption by CVE consumers, which is fuzzier to measure and thus easier to ignore when assessing success. That said, I'd love to see the kind of analysis you mention done after maybe six months post-adoption.

Finally, on the related issues, these are more listed to identify problems we are explicitly not solving in the RFD but which the QWG could take up and pursue outside of the RFD. I agree that there's interest and a need to advance some form of improvement to CPE, likely via federation, to let it scale beyond the limits of NIST's resources; we just don't solve that question here.

@alilleybrinker
Author

@darakian we could amend the RFD text to include a clear subsection which describes that the design proposed here for adding Package URLs and OmniBOR Artifact Identifiers is intended as a template for addition of any future identifier types, which may help satisfy the desire for a clear reference-point on how to add those types in future proposals, without needing to split the RFD into multiple separate documents.

@darakian

darakian commented Jun 2, 2025

> @darakian, for reading up on OmniBOR, the project website has a more accessible introduction to how the identifiers work: https://omnibor.io/docs/artifact-ids/

Thank you, thank you!

> As for the question of whether to split this RFD into parts: one advancing a general set of provisions for how new software IDs should be incorporated, and then others advancing specific software IDs to incorporate. My team is open to doing that, but didn't as the initial ask because we were concerned it would be too granular for the QWG.

Ya, that's totally fair. I was very much waffling back and forth on whether to raise the issue or not.

> @darakian we could amend the RFD text to include a clear subsection which describes that the design proposed here for adding Package URLs and OmniBOR Artifact Identifiers is intended as a template for addition of any future identifier types, which may help satisfy the desire for a clear reference-point on how to add those types in future proposals, without needing to split the RFD into multiple separate documents.

That could work for sure. 👍

This rewrites the core content of the RFD to base the
proposed new fields on the `affected` array instead of basing
them on the `cpeApplicability` object as the prior version of
the RFD did. The motivation and outcomes are generally unchanged,
but the specifics of the proposed edits are now different.

Signed-off-by: Andrew Lilley Brinker <abrinker@mitre.org>
@alilleybrinker
Author

@darakian, thanks! I've amended the RFD to be based on the affected array and to include more explicit commentary on how it is intended to function as a template for the inclusion of future identifier types.

@Chris-Turner-NIST

Finally getting a moment to read through this and realized that this may not have been covered in discussions yet...

If the intent is to create more generic places for various identifiers, it would make sense that part of this proposal should include deprecating the existing cpes array and include a new property (cpeMatchString?) that aligns with the approach proposed for PURL and OmniBOR.

I recognize that this would create two locations for CPE-related data due to the current support for hasCPEApplicability; however, it would be a step in the right direction of normalizing the current structures and methodologies available within the affected array.

@alilleybrinker
Author

@Chris-Turner-NIST I agree that it would be good both to eventually deprecate the cpes array and to introduce a field for CPEs similar to the support added for OmniBOR and Package URLs in this RFD. However, we purposefully omitted that issue in this RFD for a couple of reasons:

  1. Deprecations are harder to justify, and it would likely take longer to reach consensus on how to handle a deprecation proposal.
  2. The cpeApplicability block, added last year as an NVD-compatible mechanism for adding CPEs which is also semantically clearer than the cpes field in the affected array's product objects, complicates the story around "where CPEs go" in a CVE record. If the cpes field were deprecated and a new cpeMatch (or some other name) field were added in the same object, CNAs would still be presented with 3 places to put CPEs, with one deprecated.

All this to say, I fully endorse improving and simplifying handling of CPEs in the record format, and my personal preference is to do exactly what you propose. I just think it would make the most sense in a follow-up RFD.

@alilleybrinker
Author

@Chris-Turner-NIST, I've opened an Issue recommending the creation of an RFD for improving CPE handling, based on your comment here. Happy for any additional input you may have on that! #421

Co-authored-by: Andrew Pollock <andrewpollock@users.noreply.github.com>
@alilleybrinker
Author

Note

Final Comment Period

A Final Comment Period (FCP) has been called for this proposal. This is a final opportunity to raise new concerns with the proposal.

The FCP will close at 2pm PDT / 5pm EDT July 3rd, at the end of the Quality Working Group Meeting.

@alilleybrinker
Author

Note

Final Comment Period Has Closed

The Final Comment Period (FCP) for this proposal has closed, and the proposal has been accepted by the QWG.

Per the RFD process rules, it will now advance to the CVE Board for consideration. The Board will make the final determination as to whether to adopt or reject the proposal.

@ccoffin ccoffin moved this to In Progress in QWG Work Board Jul 15, 2025
@ccoffin ccoffin moved this from In Progress to Review in QWG Work Board Jul 15, 2025
@alilleybrinker
Author

After deliberation with the CVE Board, we've decided to split this RFD into two parts: one for Package URLs and one for OmniBOR Artifact IDs.

The plan is likely to proceed with the Package URL RFD, while continuing to work on improvements to the OmniBOR Artifact ID RFD.

For the OmniBOR Artifact ID RFD, the design will be revisited, with an eye toward handling fine-grained identifiers like OmniBOR Artifact IDs in a distinct manner from the way coarse-grained identifiers are used in CVE Records today.

Per discussion in the QWG, this amends the RFD to clarify that the new
identifier fields being proposed are not able to fulfill the "identifier-like"
requirement in the `product` object inside the `affected` array. While this
may be changed in the future, for today it is the easiest path forward for
CVE data consumers, who could adopt the new fields if _desirable_ but would
not be obligated to do so.

Signed-off-by: Andrew Lilley Brinker <alilleybrinker@gmail.com>
@alilleybrinker alilleybrinker force-pushed the alilleybrinker/software-id-rfd branch from 09511d5 to 625cef0 Compare August 4, 2025 22:06
@alilleybrinker
Author

Gah, seems I errored in my commit-making and rebased the last commit before the new one I just added. To be clear: that commit has not been edited, this only adds a new commit which splits out the OmniBOR parts of the original RFD. This RFD is now solely about Package URLs; a separate PR will be opened for OmniBOR.

@alilleybrinker alilleybrinker changed the title RFD for Software Identifiers RFD for Package URLs Aug 4, 2025
Since this RFD is now Package URL specific, this renames the RFD
file to reflect the new title.

Signed-off-by: Andrew Lilley Brinker <abrinker@mitre.org>
@alilleybrinker alilleybrinker changed the title RFD for Package URLs RFD: Support for Package URLs Aug 7, 2025
@alilleybrinker
Author

From today's QWG discussion: there was consensus on the value of pursuing a validation library for CVE records, given that with the introduction of Package URLs it will no longer be sufficient to check that CVE Records pass JSON schema validation. This would need to be a discussion with the AWG, which @ccoffin has taken an action to start.

@pombredanne

@alilleybrinker re:

> From today's QWG discussion: there was consensus on the value of pursuing a validation library for CVE records, given that with the introduction of Package URLs it will no longer be sufficient to check that CVE Records pass JSON schema validation. This would need to be a discussion with the AWG, which @ccoffin has taken an action to start.

See the work we are doing to bring open source validation at package-url/purl-spec#296 repasted here for reference:

I am working on some extensive tooling and data for PURL validation!

See these many issues to track the work specifically on PURL validation:

In particular:

Also, help major FOSS foundations and registries adopt/use the tools:

(separately crates.io now displays PURL on each crate's page)

@andrewpollock so as you can see there are a lot of things planned. A shit load of work to do! Help is mucho wanted 👼

Originally posted by @pombredanne in package-url/purl-spec#296 (reply in thread)

@pombredanne

@alilleybrinker tell me if I can join a future QWG call to present the gist of the work

@alilleybrinker
Author

> @alilleybrinker tell me if I can join a future QWG call to present the gist of the work

I'd certainly be interested to hear about it. CCing @ccoffin, one of the QWG Co-Chairs, who maintains the agendas.

@alilleybrinker
Author

My team has drafted some initial guidance for CNAs and CVE Consumers, respectively, trying to answer key questions for how to use the new field: https://gist.github.com/alilleybrinker/5c38a0e176482f475b809b17156d5a5f

As the QWG/Board get into prepping for this to be released in (presumably) 5.2.0 of the Record Format, this could be a useful starting point for materials to share with stakeholders about the change.

@ccoffin
Collaborator

ccoffin commented Aug 15, 2025

> @alilleybrinker tell me if I can join a future QWG call to present the gist of the work

Hi Philippe!

It would be great to hear from you regarding the new work! Our normal meeting time is 3-4 PM EST, but I think last time we set up a 10-11 AM EST slot that was more compatible with your time zone. Let me know what you prefer.

Chris

@pombredanne

@ccoffin either time works for me. Next week is busy as I will be at https://osseu2025.sched.com but your regular time can work even then.

@ccoffin
Collaborator

ccoffin commented Aug 21, 2025

@pombredanne OK sounds good! Shall we go ahead and plan for you to present in the regularly scheduled QWG meeting on Aug 28?

Signed-off-by: Andrew Lilley Brinker <abrinker@mitre.org>
@ccoffin ccoffin merged commit f9b3097 into CVEProject:develop Aug 21, 2025
1 check passed
@github-project-automation github-project-automation bot moved this from Review to Done in QWG Work Board Aug 21, 2025
@alilleybrinker alilleybrinker deleted the alilleybrinker/software-id-rfd branch August 21, 2025 18:29
@pombredanne

@ccoffin Looking forward to the discussion today!
