Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates parsing of yarn.lock to use resolved URLs that are pulled from yarn and npm registries #926

Merged
merged 2 commits into from
Jun 24, 2022

Conversation

AMoo-Miki
Copy link
Contributor

Signed-off-by: Miki amoo_miki@yahoo.com

The changes include:

  • making the parser only look for resolved URLs in yarn.lock files
  • creating a fixture of yarn workspace with skeleton dependencies
  • updating integration tests in TestYarnPackageLockDirectory validate the results against the fixture
  • remove old yarn artifacts

Resolves

#925

@AMoo-Miki AMoo-Miki force-pushed the fix-yarn branch 3 times, most recently from 2cebe1e to 42c0c21 Compare March 29, 2022 19:58
@AMoo-Miki AMoo-Miki changed the title Replaces parsing of yarn.lock to be solely based on resolved URLs Updates parsing of yarn.lock to be use resolved URLs that are pulled from registry.yarnpkg.com Mar 29, 2022
@AMoo-Miki AMoo-Miki changed the title Updates parsing of yarn.lock to be use resolved URLs that are pulled from registry.yarnpkg.com Updates parsing of yarn.lock to use resolved URLs that are pulled from registry.yarnpkg.com Mar 29, 2022
…lock` is from `registry.yarnpkg.com`

Signed-off-by: Miki <amoo_miki@yahoo.com>
@AMoo-Miki AMoo-Miki changed the title Updates parsing of yarn.lock to use resolved URLs that are pulled from registry.yarnpkg.com Updates parsing of yarn.lock to use resolved URLs that are pulled from yarn and npm registries Mar 31, 2022
Copy link
Contributor

@wagoodman wagoodman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clever fix! 🙌

I think the only caveat I see is that using the resolved URL will only match in cases where there is a match with yarnpkg.com or npmjs.org which means that self-hosts/alternative hosting of packages won't be able to leverage the URL.

@wagoodman
Copy link
Contributor

Question about conflicts, from the existing test fixture:

c0n-fab_u.laTION@^7.0.0:
  version "7.7.7"
  resolved "https://registry.yarnpkg.com/something-i-made-up/-/c0n-fab_u.laTION-7.7.7.tgz#b9c2bf5805f1e64aadeed6df3a2bfafb5a73f5a0"
  integrity sha512-p32cOF5q0Zqs9uBiONKYLm6BClCoBCM5O9JfeUSlnQLBTxYdTK+pW+nXflm8UkKd2UYlEbYz5qEi0JuZR9ckSw==

What is the right value here for the package name? c0n-fab_u.laTION or something-i-made-up ? Should we require that the outer package name (in this case c0n-fab_u.laTION) be a substring of the name extracted from the resolved URL? And if it's not a substring, default to the outer package name?

@AMoo-Miki
Copy link
Contributor Author

AMoo-Miki commented Apr 1, 2022

I think the only caveat I see is that using the resolved URL will only match in cases where there is a match with yarnpkg.com or npmjs.org which means that self-hosts/alternative hosting of packages won't be able to leverage the URL.

Correct. For them, it will simply fallback to the name inferred from the identifier. If we find another registry with standardized naming conventions, the regex can be updated.

What is the right value here for the package name?

While I haven't seen a spec doc for the naming convention used, all the resolved URLs I have seen from NPM and Yarn registries have @scope/name/-/name in them. While I am sure there are millions of package-versions on these registries, I have tested the pattern against over 4000 of them. From what i gather, the first @scope/name is what the package confesses to be and the second name (everything after /-/) is to avoid conflicts when downloading the archives prior to an install.

Of course , short of a spec doc from the registries, my 4000 cases of validation are not proof enough that an NPM/Yarn URL with a different structure doesn't exist. However, I think it is a strong indication that we shouldn't expect to see them.

We have 3 options here:

  1. Continue to rely on @scope/name/-/whatever and handle problems when someone raises an issue.
  2. Change pattern to (@scope)/(name)/-/($2) which will fallback to the inferred name if whatever doesn't match name.
  3. Change pattern to (@scope)/(name)/-/(another-name) and simply throw a warning about the package name not matching what we expected but continue to use @scope/name with the hopes that user will raise an issue asking about it.

While I think option 1 is perfect because I don't care for "whatever", I think option 2 will not detract from it but option 3 will encourage feedback.

PS, I will look into the unit test failure.

@spiffcs spiffcs self-assigned this Apr 26, 2022
@spiffcs spiffcs added the ecosystem:javascript relating to the javascript ecosystem label Apr 27, 2022
@spiffcs
Copy link
Contributor

spiffcs commented Jun 23, 2022

Approved - I'm updating the unit tests on this now

Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
@spiffcs spiffcs merged commit d5e12ff into anchore:main Jun 24, 2022
spiffcs added a commit to jonasagx/syft that referenced this pull request Jun 27, 2022
* main: (70 commits)
  fix: add php catalogers to all catalogers (anchore#1065)
  feat: add use-all-catalogers flag (anchore#1050)
  Updates parsing of `yarn.lock` to use `resolved` URLs that are pulled from yarn and npm registries (anchore#926)
  remove OSS Meetup message (anchore#1057)
  add pom.xml cataloger (anchore#1055)
  Add support for CBL-Mariner distroless images (anchore#1045)
  Add catalogers configuration (anchore#1038)
  add template output (anchore#1051)
  update stereoscope to latest version (anchore#1052)
  update zip_read_closer to incorporate zip64 support (anchore#1041)
  Add pacman (alpm) parser support (anchore#943)
  Update of README.md (anchore#1027)
  bump cosign to v1.9.0 to resolve reporting of GHSA-66x3-6cw3-v5gj (anchore#1025)
  add workflows to test new project automation (anchore#1023)
  improve LanguageByName and add unit tests (anchore#1034)
  Read Description from dpkg status files (anchore#996)
  Add announcement for Anchore OSS Virtual Meetup (anchore#1033)
  add main module field to go bin metadata (anchore#1026)
  Add filters to package cataloger (anchore#1021)
  change draft to false for release process (anchore#1016)
  ...

Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
aiwantaozi pushed a commit to aiwantaozi/syft that referenced this pull request Oct 20, 2022
… from yarn and npm registries (anchore#926)

Co-authored-by: Christopher Phillips <christopher.phillips@anchore.com>
GijsCalis pushed a commit to GijsCalis/syft that referenced this pull request Feb 19, 2024
… from yarn and npm registries (anchore#926)

Co-authored-by: Christopher Phillips <christopher.phillips@anchore.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ecosystem:javascript relating to the javascript ecosystem
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants