RFC: audit assertions #422
This seems awesome.
What would be most ideal, combined with this, is a way for maintainers to get disclosures that these advisories impact their packages before muggles do, so that there’s a window in which the assertion can be made before any noise has been unnecessarily spread.
A couple of initial thoughts, but I will want to think on this approach a bit more from a holistic sense as well.
I think number 5 is my largest concern; all the rest are really implementation details. My first thought on 5 is that these opinions need to be published as structured packages on the registry. Maybe the default one is provided by npm and is what these claims are registered against, but I know for a fact we will need to override this trust. I think this will be really important to iron out before this lands.
- Do nothing: The current state of the world that has caused [pain](https://overreacted.io/npm-audit-broken-by-design/).
- Allow reporters to mark which modules are impacted: not really scalable, especially since reporters won't be experts in the same way that maintainers are.
- Do something in a neutral space, run by an independent organization: There's no reason this couldn't eventually be implemented under this API. It is contingent on that work being done and being successfully adopted by others. That shouldn't necessarily prevent us from implementing this and mapping it over to that later.
I think what this gets at is that there may be other tools that will want to use the "asserted" data. The npm registry seems like a reasonable option for hosting the data, as it is already the source of truth for the package itself. A few ideas off the top of my head:
- provide a way to get the asserted data for a module without using the npm client (existing UI? REST endpoint?)
- include the data when a module is installed/updated, so the local copy can be used by other tools
- allow other tools to leverage the data
I would assume this data would be included with existing package metadata and therefore accessible how people already access package metadata.
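To make that concrete, here is a minimal sketch of what "read assertions along with existing package metadata" could look like. The `assertions` field, its shape, and the helper below are assumptions for illustration only; no such field exists in registry packuments today.

```ts
// Hypothetical sketch: reading maintainer assertions alongside existing
// package metadata. The `assertions` field and its shape are invented here.

interface AuditAssertion {
  advisoryId: string; // e.g. a CVE or GHSA identifier
  module: string;     // the (transitive) dependency the advisory is filed against
  impactful: boolean; // the maintainer's counterclaim
  assertedBy: string; // npm username of the asserting maintainer
  assertedAt: string; // ISO timestamp
}

// Fetch the packument the same way other tooling already does, and read the
// hypothetical assertions field if the registry were to include one.
async function getAssertions(pkg: string): Promise<AuditAssertion[]> {
  const res = await fetch(`https://registry.npmjs.org/${pkg}`);
  if (!res.ok) throw new Error(`registry returned ${res.status}`);
  const packument = (await res.json()) as { assertions?: AuditAssertion[] };
  return packument.assertions ?? [];
}

getAssertions("fastify").then((a) => console.log(a));
```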
Agree here mostly. To get the broader industry support that I think this needs in order to be successful, we will need a well-documented schema and set of APIs. This feedback is something I have heard from multiple people.
I think the schema and API to load these counterclaims need to be platform agnostic. I think that even "publish this to the npm registry" should be an implementation detail for npm. Preferably npm would just build a client for this contract, or maybe we need "extensions" to the API so that npm could say "instead of a GET request to your provider server, install package `foo` and read file `claims.json`". Again, these are just rough ideas.
## Prior Art

I've not seen other tools doing this, but I would be surprised if there's not prior art. After some searching, I can't find anything.
@naugtur, links to your work as prior art would be good here.
I'm working on a clean schema to PR to the pkg collab space. Maybe I'll finish on Friday.
I thought this RFC was being kept as a historical record, so I should create a new one or rewrite this entirely.
The closest that I'm aware of is Snyk's "Runtime prioritization":

> Prioritize your fixes based on an analysis of the vulnerabilities that are called at runtime of the application and bear a higher risk
One way of defining "impactful" could be "is an affected function called by the dependency: if it isn't, then it's presumably not an 'impactful' vulnerability".
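A toy illustration of that definition follows. It assumes (hypothetically) that we already know which functions an advisory affects and which functions the dependent package actually calls; real reachability analysis, like Snyk's, is far more involved.

```ts
// Toy sketch of the "is an affected function actually called?" definition of
// impactful: simply intersect two hand-supplied lists.

interface ReachabilityAdvisory {
  id: string;
  module: string;
  affectedFunctions: string[]; // functions the advisory says are vulnerable
}

function isImpactful(advisory: ReachabilityAdvisory, calledFunctions: string[]): boolean {
  // Impactful only if at least one affected function is actually called.
  return advisory.affectedFunctions.some((fn) => calledFunctions.includes(fn));
}

const advisory: ReachabilityAdvisory = {
  id: "CVE-XXXX-YYYY",
  module: "some-dep",
  affectedFunctions: ["parseUntrustedInput"],
};

console.log(isImpactful(advisory, ["format", "stringify"]));  // false: presumably not impactful
console.log(isImpactful(advisory, ["parseUntrustedInput"]));  // true
```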
Oh wait, it's a different RFC 😅
I'll share links soon then. Thx
This explains plans and links to most of the stuff
https://dev.to/naugtur/do-you-need-help-with-your-npm-audit-3olf
And a direct link to the tool
https://www.npmjs.com/package/npm-audit-resolver
Format for storing and exchanging assertions/decisions/resolutions/thingies
- The potential values for the `--assertions` filter on `npm audit` could probably be bikeshedded a bit. I'm unattached to the names. (A rough sketch of one possible filtering behavior follows this list.)
- UI in `npm audit` output could be bikeshedded as well.
- It would be cool if there was the ability to have multiple maintainers run `npm audit assert` for a given `id` and surface that _multiple_ people signed off on the assertion, as a way to begin combating some forms of social engineering attacks we've seen in the past, but that might be too complex for now.
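Here is a small conceptual sketch of how such a filter might partition advisories. The state names and the "all asserters must agree" rule are invented for illustration; the actual filter values are explicitly up for bikeshedding above.

```ts
// Hypothetical filtering of audit results by assertion state.
type AssertionState = "impactful" | "not-impactful" | "unassessed";

interface AuditedAdvisory {
  id: string;
  module: string;
  assertions: { impactful: boolean; assertedBy: string }[];
}

function stateOf(advisory: AuditedAdvisory): AssertionState {
  if (advisory.assertions.length === 0) return "unassessed";
  // Multiple maintainers signing off (as suggested above) strengthens the
  // signal; here we only treat an advisory as benign if every assertion agrees.
  return advisory.assertions.every((a) => !a.impactful) ? "not-impactful" : "impactful";
}

function filterAdvisories(all: AuditedAdvisory[], show: AssertionState[]): AuditedAdvisory[] {
  return all.filter((adv) => show.includes(stateOf(adv)));
}
```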
It would be good if there was a way for third parties to make assertions, and then end users decide which third parties they would trust.
I disagree with this, but I could see a potential argument for it.
It seems like a product space that could be explored by companies (for example, Sonatype Nexus or JFrog Artifactory), but adding the additional complexity of "trust external authorities" for users who are already inundated with options is complexity that I'm not sure I want to make users have to care about. What happens when someone trusts `giithub` rather than `github`, `synk` rather than `snyk`, or `IḄM` rather than `IBM`? There are solutions here, but it's just unnecessary complexity in a solution that's intended to help solve complexity.
This is what I was meaning in my number 5 point above. Having multiple sources of trust here is also important. I even think we might need to go so far as to provide a way to trust opinions within a "scope", for example to say "I trust the `express` team on modules provided by them, but not on the ones provided by the `react` team, which might be shared".
And to go even one step further, I think you would want to only trust the `express` maintainers for transitives imported as part of their dep tree exclusively. So if there was a shared dependency `express > path-to-regexp` and `react-router > path-to-regexp`, then you would still need to report on `path-to-regexp` even if the `express` counterclaim said it could be ignored. (See the sketch below.)
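A minimal sketch of that path-scoped trust idea: a counterclaim from the `express` maintainers about `path-to-regexp` is only honored on dependency paths that actually pass through `express`. All names and structures here are assumptions for illustration.

```ts
// Hypothetical scope-aware application of a counterclaim.
interface ScopedAssertion {
  assertedForPackage: string; // package whose maintainers made the claim
  vulnerableModule: string;   // the shared transitive dependency
  impactful: boolean;
}

// A dependency path from the project root down to the vulnerable module,
// e.g. ["my-app", "express", "path-to-regexp"].
type DependencyPath = string[];

function assertionAppliesToPath(a: ScopedAssertion, path: DependencyPath): boolean {
  const leaf = path[path.length - 1];
  // The claim must be about this vulnerable module AND the path must pass
  // through the package whose maintainers we trust for that claim.
  return leaf === a.vulnerableModule && path.includes(a.assertedForPackage);
}

const expressClaim: ScopedAssertion = {
  assertedForPackage: "express",
  vulnerableModule: "path-to-regexp",
  impactful: false,
};

console.log(assertionAppliesToPath(expressClaim, ["my-app", "express", "path-to-regexp"]));      // true
console.log(assertionAppliesToPath(expressClaim, ["my-app", "react-router", "path-to-regexp"])); // false: still reported
```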
Who you trust is a crucial point here.
In general, maintainers don't want to include vulnerabilities in their packages, but they do, by accident. That's why it's great that there are neutral people looking at these packages and creating CVEs.
In the same fashion, most maintainers don't want to sign off on dependencies as safe when they are not, but again: mistakes happen. We are trusting the same person not to make a mistake, but this time there is no third party to verify the claim (of course, we could have CVEs listing false claims, but that would recreate the situation we have today, or make it even worse).
Not to mention that maintainers have strong incentives to sign off on packages with minimal time spent looking into all the possible ramifications: if your package produces lots of `npm audit` hits, you receive lots of issues. If you claim that all your dependencies are safe, you get fewer issues...
Apologies if I'm speaking out of turn here; this is about the proposal at a high level, but I suppose specifically around this flag in particular (I think this is what @wesleytodd was getting at here, maybe? #422 (comment)).
Isn't the nuance in Dan's post that it really all comes down to the runtime context of the vulnerability in question? For example, suppose there is a critical vulnerability in lodash that is based on malicious user input; if Gatsby is using it for its dev server, then user input likely never enters the equation. But if I am using it on my client side and, say, supplying user input because I'm using it for a filter tab in my UI from user data, then the vulnerability would apply. So I guess: who is the intended user / audience of this command? Is it even limited to a certain type of user?
I would argue it's ... I guess I'm not sure if this RFC aims to reconcile that at all, or if it is literally just for exposing a way to counter a claim, with the scope of who should run it not intended to be addressed here. Or is the intent that this would be more of a crowdsourcing mechanism, so that based on real user input the severity of a vulnerability could be challenged? Or that the CVE itself would be updated with more context based on the incoming reports? Apologies for the questions, or if this isn't the right forum; happy to post this somewhere else. 👍
Package authors absolutely can know - just not always.
@ljharb I am extremely against a system like this. It creates a weird caste system that's incredibly easy to game. There's a company that's done a similar thing, and it could not have been more end-user hostile and easy to get around to get early access to information. If security researchers disagree with me, I'm more than happy to concede to their recommendation.
IMO the point of this command is to filter out the noise of "not-actionable" vulnerabilities for package consumers, so I view it as: 1 is the group best able to set the ...
But how would they know all that and/or communicate that without knowing the usage context? (That is what I'm not understanding here at the moment, it appears.) Unless there is more nuance in this reporting structure / mechanism that I am missing?
I totally agree that we don't want to create a "these people are special and get to see vulnerabilities first", but I definitely want to see if we can get maintainers (ideally first of direct dependencies) to signal if the dependency impacts them ASAP. I view the current noise problem as: $DEEP_DEPENDENCY_IN_POPULAR_PACKAGE is potentially affected by some vulnerability in $DEEPER_DEPENDENCY and today we notify everyone in $POPULAR_PACKAGE about things that often don't affect them. IMO, it's better to get the maintainer of $DEEP_DEPENDENCY to say "this does/doesn't affect my consumers" than to immediately send it to $POPULAR_PACKAGE's maintainers to try and sort out.
Ah, I think I may be thinking of maintainer/direct/indirect differently, so I'll provide an example. Let's imagine that I maintain $PACKAGE (I think this is "direct"?) and you maintain $DEPENDENCY (I think this is "maintainer"), which I depend on. Let's also imagine that $POPULAR_PACKAGE (which is "indirect") depends on my $PACKAGE. If someone reports a vulnerability on $DEPENDENCY, you are definitionally impacted by this, so ...
Just for clarification: The goal is to get maintainers (in a previous comment, group 1) to run this command to provide a signal to their users about whether or not a (transitive) dependency is impactful to a given module.
Sorry, yeah, those labels weren't super explicit, but I think we're mostly still talking about the same thing: some package (1) <- some meta framework (2) <- user of said meta framework (3). But I think @bnb's comment helps some.
Would be curious to know how this command gets scoped so that only that particular maintainer (user #2 in my fictional scenario) can run it, and how we verify they are indeed who they say they are and in the "right" to signal this claim.
I think this already exists and is made up of the module authors already. This is also the same group @ljharb proposes to get the "early access", so I am not sure this holds up. I totally understand where you are coming from and agree that a caste system is not what we want. Maybe a middle ground would be a grace period after the CVE is registered in the database, during which it is only reported on if a flag is added? Even a one-day grace period here could do a lot to avoid the storm that can happen when a CVE report goes live.
After thinking about this more and reading all of these comments, I think that the npm RFC portion of this work should be a client built on top of some other systems. That said, if we were to choose a shorter-term approach of proving out a method we plan to up-level to a more industry-applicable solution, that would be fine too. I still think that we would want to design the short-term solution in a way that actually proves out the larger concerns though, and in that regard this RFC is not yet all we need.

So, to satisfy that, I think we need a proposal for an agnostic API for a "source of trust" provider. This API should be simple to implement and scalable (think like how the plain documents work to form the registry APIs we use). Additionally, it should specify at least a starter schema for what a counterclaim looks like. Once we have that, I think that an npm-specific implementation where the source of trust could be a published package would be awesome. Additionally, a command in the CLI would act as a client for this API. For the filing of a counterclaim, we would want the interface to provide all the necessary schema parts, but other than that I imagine we could leave the handling of that up to the implementers. In this way, we would get npm-native control over how it manages its own default trust, but also enable third parties and enterprise use cases to have a clear path to implement their own counterclaim ingestion model. (A rough sketch of such a provider contract follows.)

A deep dive would be awesome to sort this all out; unfortunately, I think I will have a difficult time making next week. I am on vacation and will be in Yellowstone, where from what I hear any connectivity is hard to come by. If that timeline is what folks want, I can write up my thoughts as best I can, but I was hoping we could have a few discussions in the collab space before going too far on solutions.
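One possible shape for that agnostic "source of trust" provider contract is sketched below. The interface, method names, and the HTTP-backed example are all assumptions offered for discussion, not a proposed standard.

```ts
// Hypothetical "source of trust" provider contract: npm (or any other tool)
// acts as a client of this interface, and how claims are stored behind it
// (HTTP service, published package, internal database) stays an
// implementation detail of the provider.

interface ProviderCounterClaim {
  advisoryId: string;
  module: string;
  impactful: boolean;
  justification: string;
  claimant: string;
}

interface TrustProvider {
  // Fetch any counterclaims the provider knows about for a package.
  getClaims(module: string): Promise<ProviderCounterClaim[]>;
  // File a new counterclaim; review and publication are up to the provider.
  submitClaim(claim: ProviderCounterClaim): Promise<void>;
}

// One possible provider backed by a plain HTTP document store.
class HttpTrustProvider implements TrustProvider {
  constructor(private baseUrl: string) {}

  async getClaims(module: string): Promise<ProviderCounterClaim[]> {
    const res = await fetch(`${this.baseUrl}/claims/${encodeURIComponent(module)}`);
    return res.ok ? ((await res.json()) as ProviderCounterClaim[]) : [];
  }

  async submitClaim(claim: ProviderCounterClaim): Promise<void> {
    await fetch(`${this.baseUrl}/claims`, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(claim),
    });
  }
}
```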
presumably if you can ...
Gotcha, that makes sense. So if ... Also, I think I understand the mechanism you are proposing a bit better now after your replies. So effectively, from your example, running:

```
$ npm audit assert --module=fastify --id=CVE-xxxx --impactful=<boolean>
```

The CVE in question would then be against one of fastify's dependencies (e.g. something from its package.json), not fastify itself. If so, I wonder if there are any additional context / heuristics that could be gleaned from looking at package.json or package-lock.json / yarn.lock to provide additional context at the time ...
Some of that info could be useful for helping others become more aware of the specific way fastify (in this case) is using said dependency, and thus whether it even falls into the scope of the CVE (or at least help paint a more accurate picture of usage). Maybe with some sort of confidence score, much like Dependabot does now when submitting a PR to a repo.
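As one cheap example of that kind of heuristic, the tooling could note whether the flagged dependency is a production or dev dependency of the asserting package. The helper below is hypothetical and not an existing npm CLI feature.

```ts
// Hypothetical helper: derive a bit of usage context from package.json at the
// time an assertion is filed (prod vs. dev dependency, declared range).
import { readFile } from "node:fs/promises";

interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

async function dependencyContext(pkgJsonPath: string, dep: string) {
  const pkg = JSON.parse(await readFile(pkgJsonPath, "utf8")) as PackageJson;
  return {
    isProdDependency: dep in (pkg.dependencies ?? {}),
    isDevDependency: dep in (pkg.devDependencies ?? {}),
    declaredRange: pkg.dependencies?.[dep] ?? pkg.devDependencies?.[dep] ?? null,
  };
}

// e.g. dependencyContext("./package.json", "path-to-regexp")
```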
I've been doing some work with the National Telecommunications and Information Administration (NTIA) around Software Bill of Materials, and they've also been working on some requirements for sharing vulnerability exploitability, which aligns with some of the goals of this RFC. They've been working with the CSAF 2.0 project in OASIS to contribute a VEX profile that allows you to communicate whether a particular vulnerability is exploitable in a particular software product. Looking at the information they capture, the only thing missing from this RFC would be the assertion specifying why the vulnerability can't be exploited: "SHALL contain a description why the vulnerability cannot be exploited". So, as @wesleytodd suggested above, I recommend that this information be captured as a required field.
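The sketch below illustrates what a required "why is this not exploitable" field could look like. It is loosely inspired by the VEX idea described above; it is not the actual CSAF 2.0 / VEX schema, and the status and justification labels are paraphrased for illustration and may differ from the published profile.

```ts
// Hypothetical VEX-inspired assertion with a mandatory explanation.
type ExploitabilityStatus = "not_affected" | "affected" | "under_investigation";

interface VexLikeAssertion {
  advisoryId: string;
  module: string;
  status: ExploitabilityStatus;
  // Required free-text explanation, per the suggestion to make this mandatory.
  impactStatement: string;
  // Optional machine-readable reason category (names approximated here).
  justification?:
    | "vulnerable_code_not_present"
    | "vulnerable_code_not_in_execute_path"
    | "component_not_present";
}

function validateAssertion(a: VexLikeAssertion): string[] {
  const errors: string[] = [];
  if (a.status === "not_affected" && a.impactStatement.trim().length === 0) {
    errors.push("a 'not_affected' assertion must explain why it cannot be exploited");
  }
  return errors;
}
```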
@bnb I'm not sure what you mean; there's already such a system: dependency maintainers versus end users. This is both correct and necessary. FWIW, CVE disclosure already works this way for the vulnerable package. I was only suggesting that this exclusivity be slightly loosened to include authors that depend on the vulnerable package.
Yes, the vulnerable package's maintainers need to know their package is vulnerable. Not everyone can randomly sign up to be a maintainer of that package and get early access to the information before it's published, presumably?
How do you implement this in a way that isn't effectively the exact same as publishing publicly?
@iamwillbar I've added ...
IIRC this came up in the Open RFC meeting discussion last week w.r.t. https://github.com/naugtur/npm-audit-resolver and #18. I have been thinking about this in the context of "I trust GitHub's security lab", "I trust my organization's infosec team", or even "I trust the maintainers of sufficiently large packages", but "I don't trust any arbitrary maintainer." IMO, the tricky thing is that the average npm user probably hasn't thought deeply about who they trust, so having sane defaults here is going to be key to actually reducing the noise. Potentially something like "maintainers of direct dependencies" (or even "trust maintainers by default", since you're already trusting them by default by using their packages, as brought up earlier in this discussion), top N consumers of a package, certain companies, etc. Individuals could override it with something like ...
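As one illustration of that "sane defaults plus overrides" idea, here is a hypothetical trust policy. None of these option names or config shapes exist in npm today, and this is not a reconstruction of the truncated override example above.

```ts
// Hypothetical trust configuration with a default plus explicit overrides.
interface TrustPolicy {
  // Default policy applied when nothing more specific matches.
  default: "trust-maintainers" | "trust-none";
  // Parties whose assertions are always accepted.
  trustedAsserters: string[]; // e.g. ["github-security-lab", "@my-org/infosec"]
  // Parties whose assertions are always ignored.
  untrustedAsserters: string[];
}

const examplePolicy: TrustPolicy = {
  default: "trust-maintainers",
  trustedAsserters: ["@my-org/infosec"],
  untrustedAsserters: [],
};

function isTrusted(asserter: string, policy: TrustPolicy): boolean {
  if (policy.untrustedAsserters.includes(asserter)) return false;
  if (policy.trustedAsserters.includes(asserter)) return true;
  return policy.default === "trust-maintainers";
}
```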
What about adding a categorization of vulnerabilities? Have categories for vulnerabilities affecting Node.js servers, affecting the browser, affecting build tooling, etc. Then users specify in their local package what exactly the app does, and `npm audit` should only show the corresponding vulnerabilities, e.g.:
- build a bundle for the browser locally => only show vulnerabilities affecting the browser
- run a Node.js server => show those vulnerabilities
- run a Node.js server where users can submit and build code dynamically => show all the vulnerabilities

There isn't a ton of use cases for npm, so I think it could easily fit in fewer than 5 categories to be defined. (A rough sketch follows.)
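A tiny sketch of what that category filter could look like; the category names and the idea of declaring the app's environment in local config are purely hypothetical and only illustrate the suggestion above.

```ts
// Hypothetical environment-based filtering of advisories.
type VulnCategory = "server" | "browser" | "build-tool" | "dynamic-code-execution";

interface CategorizedAdvisory {
  id: string;
  categories: VulnCategory[];
}

// Which categories matter for a given declared environment.
const relevant: Record<string, VulnCategory[]> = {
  "browser-bundle": ["browser", "build-tool"],
  "node-server": ["server", "build-tool"],
  "node-server-dynamic": ["server", "browser", "build-tool", "dynamic-code-execution"],
};

function visibleAdvisories(advisories: CategorizedAdvisory[], env: string): CategorizedAdvisory[] {
  const wanted = relevant[env] ?? [];
  return advisories.filter((a) => a.categories.some((c) => wanted.includes(c)));
}
```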
There are innumerable use cases for npm; I can think of dozens of categories off the top of my head.
@mhdawson I get concerned about reducing the viability of this feature by extending it to be configurable in this way. For example, if the ecosystem starts going down the path of recommending that everyone ignore all maintainers' assertions, then we're back in the situation we're in now. Further, thinking about this in the context of scalability, I'm not sure this is actually helpful: you and I may know that someone like Sindre is probably trustworthy/knowledgeable and that coolGuyMaintainer420, who happens to maintain one module with 400k downloads, maybe isn't. I don't think it's reasonable to expect a random infrastructure team at JetBlue to be able to reasonably make that trust judgement. This is explicitly why the ... My personal preference would be to ship with just ...
@asciimike what would you see the model for identity being here? How do I register ...? I think this could be good, but I'd want to have an airtight definition of the model we use for identification. One thing I have liked for a long time is the idea of publishing meta information that can be consumed by npm, so for example ...
This also has the benefit of routing through the same registry mechanisms that npm already uses, so you'd be able to privately publish assertions and have them be usable internally with the same registry setup you already have. I do, however, think that shipping without this and then adding it in a later RFC would be the most ideal path, both for the sake of complexity/velocity and for the sake of validating that we're going down the right path.
This isn't really true. It is by far the largest programming ecosystem in the world, serving many different use cases. Static analysis would be the most useful thing here, but unfortunately that is Very Hard in JavaScript and IMO out of scope for the npm CLI.
I don't quite follow how making it configurable causes this problem rather than making it better. If you only have the choice between accepting all or ignoring all, then there is a higher chance that the recommendation could end up being to ignore all. In terms of scalability, I would not expect every individual company to build up their own list, but instead to rely on shared experience. For example, maybe company X publishes what they use and other, smaller companies rely on that. I will agree that it makes things more complicated. The question in my mind is whether security organizations will adopt "trust all"; if the answer is yes, then that is the easiest/lowest-friction option. I would agree that it should be easy/low friction to say "trust all" if that is what you want.
Domains may be a bad option due to the difficulty of proving ownership :/ On some further thought, tying it to existing user accounts may be feasible. If someone asked me to build a system for GitHub, I would start with GitHub GPG keys (e.g. ...). I assume that npm could provide a similar mechanism (users upload their own keys, or keys are generated like how GitHub does it), and you could provide per-user, per-package, and per-organization attestation filtering.
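A conceptual sketch of that key-bound idea follows: an assertion carries a signature, and the consumer decides which signer identities to honor per user, per package, or per organization. Signature verification is stubbed out here; real tooling would verify against keys published via GitHub, the npm registry, or similar, and all types below are assumptions.

```ts
// Hypothetical signed assertion plus per-user / per-organization filtering.
interface SignedAssertion {
  advisoryId: string;
  module: string;
  impactful: boolean;
  signer: string;    // identity the key is bound to, e.g. a GitHub user or "org/team"
  signature: string; // detached signature over the assertion body
}

interface AttestationFilter {
  trustedUsers: string[];
  trustedOrgs: string[];
}

// Placeholder: a real implementation would check `signature` against the
// signer's published public key (GPG, or registry-managed keys).
function signatureIsValid(_a: SignedAssertion): boolean {
  return true;
}

function acceptAssertion(a: SignedAssertion, filter: AttestationFilter): boolean {
  if (!signatureIsValid(a)) return false;
  const org = a.signer.includes("/") ? a.signer.split("/")[0] : null;
  return (
    filter.trustedUsers.includes(a.signer) ||
    (org !== null && filter.trustedOrgs.includes(org))
  );
}
```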
It makes the problem significantly less clear for end-users. Trust vs. no trust is an easy concept. Trust vs. find random sources to trust that are influenced by social capital and personalities vs. no trust is a weird space that massively complicates things. For example:
Compared to:
I think that for an initial feature, the latter is far more beneficial. It accomplishes multiple things:
As a last comment, I will agree that capturing and getting information from the maintainers is a good first step, and I don't want to complicate things enough to keep those initial steps from being taken :)
One way of framing the problem that has been discussed so far in this thread is to think of each npm user as being in one of the following categories:
For users in categories 1 and 2, the proposed changes have no adverse effect either way, since I already made a conscious choice to opt in or opt out of the npm registry (i.e. consent). If I imagine that I am in category 3, the proposed changes cause a subtle shift for me: previously I did not consciously realise that I was giving implicit consent to package maintainers to make code changes on my behalf, and now the scope of that consent is being increased to also encompass making good decisions about security on my behalf (?), and this change in scope may happen without my knowledge when I start using the latest version of npm. If I am in category 4, then it's possible I am happy to trust maintainers under the terms of how npm currently works, but I would not be happy to trust maintainers under the terms of how npm would work if the proposed changes were to go into effect. Does this pose a security risk or a significant ethical problem? I really don't think so. I think what it might mean is: if the changes are accepted, and there is general consensus that the scope of consent is in fact changing as a result, npm has some measure of responsibility to tell its users that this is happening so that they can make informed choices.
I generally think this is an accurate representation. The point I'd make here is that maintainers are already implicitly making decisions about security on your behalf. For example, if I think there's a vulnerable path in ...
I generally agree with this and would expect quite a bit of communication through various avenues around it.
As I have mentioned in this issue ...
a. Maybe it's better to add something like the audit level: a boolean flag to specify if the issue will only affect a developer's machine.
b. Or just advise that the audit errors in question be labelled with a low audit level (e.g. info), and that by default `npm audit` does not show errors with info level.
That's not something that can be known by a package author. Nothing's always a dev dep, not even eslint.
Is this proposal to be implemented soon?
There's been an interest expressed in the ecosystem in having some form of counterclaim for advisories surfaced by `npm audit`. There's been some discussion of a potential counterclaim mechanism for some time, but I've not seen a solution proposed. The intent of this RFC is to do that: propose a solution. I do not expect that this solution will go through unanimously and unchanged, but I'd like to get something up that can be talked about and addressed both by the ecosystem and by those thinking about security in the registry and CLI. If the answer to this is "no" with some set of reasons, that's a perfectly reasonable outcome.

I'm particularly interested in feedback from security researchers on why this is / is not a good solution, and what alternatives might be implemented or what tweaks might be made to make it a better solution. I appreciate feedback from developers who care, but I also want folks to keep in mind that this has the potential for massively negative impact should a vulnerability get through that should not get through.