Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[npm] Consider all installed versions when checking if a dependency is still vulnerable #5801

Merged
merged 18 commits into from
Oct 7, 2022

Conversation

bdragon
Copy link
Contributor

@bdragon bdragon commented Sep 28, 2022

Context

DependencySet is used ubiquitously during the manifest file-parsing phase of an update. It offers set semantics on the basis of dependency name and it has the property that when multiple instances of the same dependency (by name) are added, each is combined into the one held by the set and the latter's version is set according to an algorithm that roughly favors the lowest version.

One issue with this behavior is that when multiple versions of a dependency are used in a project, it's possible for versions other than the one divined by the dependency set (usually the lowest version) to be affected by a security vulnerability. As a result, Dependabot will report that an update is not needed because the dependency is not vulnerable, which is of course incorrect and confusing for users.

npm/yarn is an ecosystem in which it is possible to have multiple installed versions of a single dependency, and we have seen the misleading update-not-needed behavior occur. The goal of this PR is to stop this behavior for npm. A separate PR will follow to actually attempt the update when the vulnerability is on a version other than the one decided by the dependency set.

What's changing

ea27677 I refactored DependencySet so that internally it retains all added versions for each dependency it holds. To achieve this I changed the internal data structure from an array to a hash, and introduced a private helper class to store the versions and present a combined representation of the dependency using the same algorithm linked above. The existing public API and behavior remains unchanged and two new public methods are added to expose the new functionality, #dependency_for_name and #all_versions_for_name.

63b2569 When combining two dependency sets (a very common operation) care was taken to combine all tracked versions in the correct order to ensure that the combined representation of each dependency in the combined set would be the same as it is today.

7967e2f To make it possible to associate a dependency with all installed versions of it detected during the file-parsing phase, I added a new #metadata field (not included in #to_h serialization) to Dependency.

3ed012c and d8bcee7 Before returning the list of dependencies from the file parser, all installed versions of each dependency are injected into its metadata.

c57b5d7 With the preceding changes in place, I modified the update checker so that it looks at all versions of a dependency when checking if it's vulnerable.

Risks

DependencySet is a fairly central type, which makes this a fairly risky change. I'm placing a good deal of faith in our test suite and I appreciate your careful review!

@bdragon bdragon force-pushed the bdragon/2855-vuln-multi-version branch from 43c6a22 to 6b2b1f7 Compare September 28, 2022 17:33
@bdragon bdragon marked this pull request as ready for review September 28, 2022 18:29
@bdragon bdragon requested a review from a team as a code owner September 28, 2022 18:29
@jurre
Copy link
Member

jurre commented Sep 29, 2022

Overall this approach looks great! If I understand this correctly, with this change we will detect that a given dependency is still vulnerable, but will we also be able to then update it to a non-vulnerable version, or is that something that we'd need to tackle as a follow-up?

@@ -36,7 +37,8 @@ def parse
dependency_set = DependencySet.new
dependency_set += manifest_dependencies
dependency_set += lockfile_dependencies
dependencies = dependency_set.dependencies

dependencies = Helpers.dependencies_with_all_versions_metadata(dependency_set)

# TODO: Currently, Dependabot can't handle dependencies that have both
# a git source *and* a non-git source. Fix that!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is also an issue with the Terraform ecosystem because Dependencies using a Git source have a different internal identifier than Terraform registry dependencies.

Another thing that they may share in common now is, because npm and Terraform Dependencies can have multiple version requirements, the requirements_changed? method in Common needs to be overrided so that if one required version is already up-to-date, the requirements_changed? method still notices that the other required versions have been updated.

See #5786 (comment) for more info

(this should be private)

def requirement_changed?(file, dependency)
  changed_requirements =
    (dependency.requirements - dependency.previous_requirements) |
    (dependency.previous_requirements - dependency.requirements)

  changed_requirements.any? { |f| f[:file] == file.name }
end

@bdragon bdragon force-pushed the bdragon/2855-vuln-multi-version branch from ab9036b to 150d9d5 Compare September 29, 2022 18:13
Copy link
Contributor

@mctofu mctofu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

To successfully get a PR I think theres a few more checks we need to investigate/fix in the updater like

# Prevent updates that don't end up fixing any security advisories,
# blocking any updates where dependabot-core updates to a vulnerable
# version. This happens for npm/yarn subdendencies where Dependabot has no
# control over the target version. Related issue:
# https://github.com/github/dependabot-api/issues/905
if job.security_updates_only? &&
updated_deps.none? { |d| job.security_fix?(d) }
return record_security_update_not_possible_error(checker)
end
.

That work would be for a future PR but if we find that this change converts jobs that previously return "not vulnerable" to an "unknown error" or a PR that doesn't fully fix the vulnerability we should get those to return "security_update_not_possible" instead before shipping this.

@jurre jurre mentioned this pull request Oct 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants