discussion: Verifying package names #891
I'm not very familiar with npm, so I'm writing down my interpretation to check whether I'm understanding the issue correctly.
Therefore, it's possible to download a package In this scenario, wouldn't the combination of trusted builder and canonical source repository be sufficient to detect a malicious package? i.e., provenance metadata signed by a trusted builder would not describe the expected source repository for producing the foo.tgz tarball? |
This is correct. The package name is contained in the provenance ( So during verification (in particular of the publish attestation), should the verifier extract the package.json from the tarball to verify that it is consistent with the provenance / publish attestation? Not doing so means we must rely on the registry to do that verification, which does not happen today afaik. If someone downloads the tarball and installs from it, the package name used during installation is effectively the one in the package.json, which may differ from the one in the publish / SLSA attestations. (Note: for installation by package name, npm CLI does not make use of the package.json in the tarball - @ianlewis to keep me honest). Regardless of whether the registry does that verification, independent verifiers may want to verify the provenance / publish attestation on their own, so as to keep the registry honest and detect problems, and to improve trust in this supply-chain metadata. Extracting the package.json means the tarball needs to be available, so a verification-as-a-service would not work well, for example. Hope this provides some clarification |
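For illustration, the tarball check discussed above could look roughly like the sketch below. It is not how slsa-verifier or the npm registry does it. It assumes the manifest sits at package/package.json inside the tarball (the layout that npm pack produces), and the function names package_name_from_tarball and names_match are hypothetical.

```python
import io
import json
import tarfile


def package_name_from_tarball(tarball_bytes: bytes) -> str:
    """Return the "name" field of the package.json inside an npm tarball."""
    with tarfile.open(fileobj=io.BytesIO(tarball_bytes), mode="r:gz") as tar:
        # Assumption: the manifest lives at package/package.json.
        # A missing entry raises KeyError.
        member = tar.extractfile("package/package.json")
        if member is None:
            raise ValueError("package/package.json is not a regular file")
        return json.load(member)["name"]


def names_match(tarball_bytes: bytes, attested_name: str) -> bool:
    """Compare the name inside the tarball with the name from the attestation."""
    return package_name_from_tarball(tarball_bytes) == attested_name
```

Note that this only works when the verifier has the tarball bytes, which is the limitation mentioned above for verification-as-a-service.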
Well, a timely post https://blog.vlt.sh/blog/the-massive-hole-in-the-npm-ecosystem illustrating exactly what @ianlewis described |
I believe that is correct. It's a bit inconsistent between installing by package name where it can get the metadata from the registry at the same time as it downloads the tarball, and installing by tarball on the local machine where all it has is the |
I think in this instance it does make sense to verify package names, but I do not think we need to make that a stronger recommendation in general. In SLSA we trust platforms, verify artifacts and as part of the verification we evaluate provenance against the expectations for a package. As the npm ecosystem doesn't (yet?) have an architecture for forming expectations, the slsa-verifier must determine how best to form expectations based on its view of the npm ecosystem. Given the issue raised in this thread, it seems prudent that slsa-verifier's expectations for npm packages would include that the package name in the attestation matches the package name in the package's package.json. |
Based on my reading of the npm issue presented above, this arises when there is no package lock present. The inconsistency arises when packages are installed from local caches vs. from npm directly. If a future SLSA track were to try to address the best-practice of pinning dependencies, would we be able to bypass this issue? There might be a problem in generating the lock file due to the above use cases, but once the lockfile is generated, then that can be depended on by the build system. |
I don't think that's the problem. You may have a lock file but the name of the package from the API != name in package.json |
What is the actual problem that we are trying to resolve with the mismatch of package names between the API and the package.json? The ability to create an accurate provenance? I was thinking from the perspective of a specific build platform. If the build platform requires a lock file to be used, then that lock file can be reused for pulling dependencies (any name inconsistencies should already be "resolved" and represented in the lockfile itself). If a lockfile isn't used and the build environment is potentially "dirty" (i.e. local caches are used, resulting in the aforementioned behavior), then you have to be concerned about the mismatch between package names, since the two methods of collecting names can yield different results. By respecting the lockfile, we respect what the package manager itself attempts to do. In the blog post, they claim that there are likely many non-malicious discrepancies in the wild already. I am not trying to indicate that a mismatch between the package.json and the API isn't an issue. Instead, if a build/build platform has the ability to completely define the dependencies (i.e. greater than SLSA v1.0 Build L3 ... maybe a future L4 around what was once hermetic builds) then the name mismatch issue presented above should be mitigated. I am not convinced that it should be the role of a slsa-verification mechanism to patch potential issues with various languages' package managers. Being aware of this issue and being able to react to it are important when you are including additional dependencies, but as long as you have complete and accurate provenance and SBOM data, verification can happen based on the packages and versions which are actually included in an artifact. If there are malicious side effects of this behavior, then they can be appropriately identified.
Detecting these mismatches can be an input into a classification of potentially malicious packages, which further investigation can then confirm or rule out. |
"accurate" verification of provenance. If the publish attestation claims it's package A, installing the package should not end up installing a package under name B.
That's the discrepancy. What is the "package manager" (the CLI or the registry or both)? npm registry will install package P under the name A if the user types
That was our initial position, but we were not satisfied with the guarantees during verification, hence this issue. Thanks for sharing your opinion
In this scenario, which package name would be reported in the SBOM? The sha512 will uniquely identify it, but the name may be inconsistent. I suppose it should report the package name from the registry... but if SLSA provenance was used, it would report possibly another package name. Same for a lock file: lock file is one resolution by the package manager (taken from the API or the package.json depending on user's command) |
Apologies for the detour. I was coming at this from a perspective of a build platform which is only consuming npm packages and not one that is producing the packages (and therefore whose provenance and verification would be different). I realize that this was an inaccurate interpretation of the discussion. After re-reading the artifact verification, this mismatch seems like it would fall well in check expectations. Therefore, when verifying the provenance of an npm package, the expectation that the package name is consistent should be checked. The rationale for this would be to enable proper attribution of sources and clarity of the provenance subject. This would resolve the line of questions above around which name should be included. The fact that a package's behavior can change when installed from the registry or from a tarball, however, would seem to fall outside of provenance verification. As long as provenance can be appropriately associated, the behavior of a package (i.e. for determining potential maliciousness) is extra-verification and might be more relevant to the build process when consuming the published npm artifact.
When worded in terms of expectations being formed around the packages, I no longer think that this previous argument holds. We wouldn't be trying to patch issues with the package ecosystems in general. The precedent that this decision would set is that if there are multiple ways that a name can be defined in some package ecosystem, then all names should be consistent in order to disambiguate provenance references and associations. |
I agree with @arewm's conclusion as to what the precedent should be:
I'm still a little confused about what we intend to do for npm. Do we need to involve the publish attestation in this disambiguation? Doing so treats the npm registry as a definitive mapping from package digest to package name, and I don't know the npm ecosystem well enough to know if that's appropriate. Do people distribute npm packages by tarball without uploading them to the registry? If so, then we need a SLSA verification path for them. Separately, I don't like the idea of the SLSA verifier having to open the tarball and inspect its contents, which seems to be the proposal here (if I'm following the thread correctly). Doing so seems to be developing expectations on the fly to work around a deficiency in the package's provenance. The verifier already trusts the build platform to record build metadata faithfully in the provenance, so I don't see why we can't trust the build platform to record the package name in the subject's name (or URI) field. You can detect any changes to package.json because they would change the package's digest. Then the verifier needs to set an expectation on the package name, which they can verify against the provenance. Would this convention work, or am I missing something? |
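A minimal sketch of the convention proposed above, where the verifier sets an expectation on the package name and checks it against the provenance subject. It assumes an in-toto statement already parsed into a dict, with subjects of the form {"name": ..., "digest": {"sha512": <hex>}}. The function name subject_matches is hypothetical, and signature verification of the attestation is out of scope here.

```python
import hashlib


def subject_matches(statement: dict, artifact: bytes, expected_name: str) -> bool:
    """Check that a subject binds the artifact's digest to the expected name."""
    digest = hashlib.sha512(artifact).hexdigest()
    return any(
        # The digest ties the subject to these exact bytes, so any change to
        # package.json inside the tarball would change the digest.
        subject.get("digest", {}).get("sha512") == digest
        and subject.get("name") == expected_name
        for subject in statement.get("subject", [])
    )
```

The check only shows that the builder recorded the expected name for these bytes. Whether the name inside the tarball agrees with it is a separate question, which is the point debated in this thread.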
This seems like a straightforward implementation bug: the user requests to install package A and it actually gets installed under name B. That should be fixed in the npm tooling, not worked around by SLSA. More specifically, the SLSA verification process forms expectations on the package name. It is assumed that the thing doing the verification knows what the package name is. What is described in this issue is a quirk in npm where it's not straightforward to tell what the name is. But that still seems npm-specific, rather than something with SLSA? |
Part of the issue is that during SLSA verification we have a tarball and not a package name so users may not actually specify their expectation of the package name anywhere. A workflow might include:
Nowhere in there does the user specify an expectation around a package name except perhaps implicitly via the URL it was downloaded from.
This doc seems to describe that users should have expectations about the package name but SLSA itself doesn't care (i.e. it only cares that the artifact matches). Is that an accurate interpretation of the meaning? Or do you mean by "It is assumed that the thing doing the verification knows what the package name is" that a SLSA verifier should be checking expectations about the package name?
I'm not sure verifying just the subject in the provenance matches the user's expectations matters in this case since the name and digest are provided by an untrusted build and the tarball could just install a totally different package anyway. If we trust the builder to set all the package names consistently then that sounds a lot like a SLSA verifier doesn't actually need to check any expectations about the package name at all. I think the questions we need to answer are:
|
Discussed in July 10 community meeting. Action items:
Would this address this issue? |
That would help, yes. I am not sure verification is always {artifact,package_name} though. When there is a registry (npm, containers, OS distros) I think it works. For standalone binaries that users build and want to run, there is no actual namespace, except if you consider |
I think we should be targeting cases that can be automatically verified, where there exists a well-defined package name. While there are other cases (e.g. you download a binary from a website, or someone hands you a binary and you execute it) where one might want to inspect the provenance and make a decision, I feel like that's not where our effort is best spent. |
That matches the guiding principles:
and helps those of us working on SLSA focus our efforts. |
I was thinking of web browsers, IDEs and devices that contain software that auto-updates itself and doesn't have a package name per se. But even in these cases, the resourceUri = package name so I think it still works. I did not realize the scope of SLSA had been reduced to "packaging platforms". Maybe I'm misinterpreting the term "packaging platform" |
Discussed again in community meeting July 17, 2023. We will update the spec to give examples of how to form expectations around an artifact's associated package name. Separately, we think it's sensible for slsa-verifier to make sure that the package name in the tarball matches the one used by the registry. We don't see a good reason to let the two differ. Moving issue to backlog. |
I don't know how well it would work but I'd always imagined that in cases where there's not a solid package name per-se that the download URL might be usable instead? |
@TomHennen What are you proposing we replace with the download URL? The package name? |
Sorry, I think that was a bit of a non-sequitur and was just a response to "what do we do if there's not a package name". I guess I expect whoever is asking for verification should know the package name and the download URL. (I'm assuming verification happens close to when the thing is downloaded, but that could be wrong). |
For some scenarios it might be necessary to verify a language ecosystem's package name (or other metadata) which requires inspecting the contents of the package artifact(tarball) itself.
For example, npm package provenance references the artifact by package name. The subject of the in-toto attestation is a purl referencing the package name with a sha512 of the package tarball.
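As a rough illustration of what "a purl referencing the package name" means for a verifier, the sketch below pulls the npm package name and version out of a subject name such as pkg:npm/%40scope/name@1.0.0. The helper npm_name_from_purl is hypothetical and handles only the simple pkg:npm/<name>@<version> shape; a real verifier would more likely use a full purl parser.

```python
from urllib.parse import unquote


def npm_name_from_purl(purl: str):
    """Split an npm purl into (package name, version)."""
    prefix = "pkg:npm/"
    if not purl.startswith(prefix):
        raise ValueError("not an npm purl: " + purl)
    # Drop an optional subpath (#...) and qualifiers (?...).
    rest = purl[len(prefix):].split("#", 1)[0].split("?", 1)[0]
    # The version follows the last "@"; a scope's "@" is usually encoded as %40.
    path, sep, version = rest.rpartition("@")
    if not sep or not path:
        raise ValueError("purl has no version: " + purl)
    return unquote(path), unquote(version)
```

The name returned here is what would be compared against the name in the tarball's package.json or against the name the user asked for.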
If you run
npm install package.tgz
it will install the package with the name in the package.json
metadata located inside the tarball. This could open users up to attacks where users think they are downloading and verifying package A but are in reality installing (and potentially overwriting) package B.
What should a SLSA verifier do (if anything) in this case? If verification is checking the source code repo, is that good enough?