MeetingMinutes2021

We meet online on Tuesday at 16:00 UTC as a reference. See https://www.timeanddate.com/worldclock/meeting.html to get the time in your timezone.

Join us at https://meet.jit.si/VulnerableCode

The current meeting notes is at:

https://github.com/nexB/VulnerableCode/wiki/WeeklyMeetings

Here are the running meeting notes:

Meeting on Saturday 2021-12-04 at 10:30 UTC

Agenda:

Improver review
Speeding up the importer-improver structure migration
GitHub externships

Participants:

Philippe (@pombredanne)
Hritik (@hritik14)

Improver review

Quick reference to review is available at https://gist.github.com/Hritik14/d02a2c24a50e0afcaa219cc4bf8abef9

Speeding up the importer-improver structure migration

Hritik:

With the framework for importers nearly ready in this structure we can move forward with rewriting other importers like nvd, ubuntu etc. We can also, in parallel, start development for OvalDataSource and GitDataSource

GitHub externships

Philippe:

Mail sent to github for inquiry. We'll try to participate as an org.

Meeting on Thursday 2021-10-14 at 12:45 UTC

Agenda:

Review improvers
Hacktoberfest twitter VulnerableCode logo
Univers
GSoC guest invite - Why do open source

Participants:

Philippe (@pombredanne)
Hritik (@hritik14)

Review improvers

Do we want OSV Design in the database as well ? How to solve nginx multiple branch problem No, we'll carry on with the current model and use the qualifiers in PackageURL to specify the branches, in that way a package in a different branch will essentially be a different package and be displayed accordingly as a fix for that branch type.

Univers

More detailed info at https://github.com/nexB/univers/blob/386eb32468c75ecac25ec872ea004b3257962946/VERSION-RANGE-SPEC.rst

GSoC guest invite - Why do open source

Invitation accepted: https://www.youtube.com/watch?v=VNL8eO6Phj8

Meeting on Tuesday 2021-10-05 at 07:00 UTC

Agenda:

Hacktoberfest
VulnerableCode
Univers

Participants:

Philippe (@pombredanne)
Hritik (@hritik14)

Hacktoberfest

Hritik and Tushar will be setting up the ground for hacktoberfest. Looking for good-first-issues is mostly the task.

VulnerableCode

Nothing significant has changed since the last call. Hritik will carry on with the importer-improver framework.

Univers

Philippe is refactoring the codebase. Commits yet to be pushed.

Meeting on Wednesday 2021-09-18 at 10:45 UTC

Agenda:

Preview RTD before merging (local RTD via docker)
Merging new importer-improver model

Participants:

Philippe (@pombredanne)
Hritik (@hritik14)

Merging new importer-improver model

Hritik is going on a 2 week holiday thus the new model needs to be merged ASAP. We decided to achieve a checkpoint by today and push the latest version for Philippe to improve upon for the time being.

Preview RTD before merging

Pinged Ayan to look into it

Meeting on Friday 2021-09-17 at 09:00 UTC

Agenda:

Review improvers
Advisory data structure
Support http proxies, ticket <https://github.com/nexB/vulnerablecode/issues/559>_
Preview RTD before merging (local RTD via docker)

Participants:

Hritik (@hritik14)
Philippe (@pombredanne)

Review improvers / Advisory data structure

Batch processing of advisories needs to be avoided. Philippe:

each Improver has a method to process a single Advisory model instance such as Improver.get_inferences(self, advisory): -> Inference or something like that.

the framework then iterates on an Improver-provided query set such as Improver.get_interesting_advisories(self): -> QuerySet that has the Advisories it is interested in.

in the framework, there is an atomic transaction that updates both the Advisory (e.g. date and later a log of improvements with select for update) and whatever is updated or create from the Inference

More: https://github.com/nexB/vulnerablecode/pull/525#issuecomment-921722348

Also, some basic refactoring was discussed.

Support http proxies

This is not our prime objective right now. We could deal with it after a brief thought. Let's take this up in next session as well.

Meeting on Thursday 2021-09-14 at 14:00 UTC

Agenda:

Nginx version notations, ticket: https://github.com/nexB/vulnerablecode/issues/553
Review improvers
- VULCOIDS, ticket: https://github.com/nexB/vulnerablecode/issues/552
Naming: VersionSpecifer and VersionRange, ticket: https://github.com/nexB/univers/issues/8
Preview RTD before merging (local RTD via docker)

Participants:

Hritik (@hritik14)
Philippe (@pombredanne)

Nginx version notations

Hritik contacted Nginx over their mailing list to clarify their version notations like 1.21.0+. We got the following reply:

The 1.21.0+ notation means "1.21.0 and newer", or, more
formally, "1.21.0 and derived versions".  This includes all
future nginx versions on the mainline branch, and all future
stable branches (which aren't yet created).

More: https://mailman.nginx.org/pipermail/nginx/2021-September/061039.html

Further, according to Nginx <https://www.nginx.com/blog/nginx-1-16-1-17-released/>_

- Mainline is the active development branch where the latest features and bug fixes get added. It is denoted by an odd number in the second part of the version number, for example 1.17.
- Stable receives fixes for high‑severity bugs, but is not updated with new features. It is denoted by an even number in the second part of the version number, for example 1.16.0.

These information need to be accounted for in the nginx importer.

Review improvers - VULCOIDS

The function to check if there is an existing VULCOID for an advisory with an alloted CVE is missing. We need to implement that i.e. effectively move VULCOID to from vulnerability_id to old_vulnerability_id.

Naming: VersionSpecifer and VersionRange

The currently named VersionRange should be renamed to VersionConstraint and a VersionRange should be implemented to represent an actual range with an upper and lower bounds. The canonical string representation for this needs some discussion and an RFC needs to be drafted for the same. The current proposed representations are in the ticket <https://github.com/nexB/univers/issues/8>_.

Meeting on Tuesday 2021-09-07 at 14:00 UTC

Agenda:

Review improvers
Preview RTD before merging (local RTD via docker)

Review Improvers

Reviewed at: https://github.com/nexB/vulnerablecode/pull/525#pullrequestreview-745189344

Meeting on Thursday 2021-09-02 at 14:00 UTC

Agenda:

Review improvers
Consider adopting the OSV API as alternative output, ticket: https://github.com/nexB/vulnerablecode/issues/540
drf-spectacular vs redoc, ticket: https://github.com/nexB/vulnerablecode/pull/542
Decide on a uniform regular time for meetings
Preview RTD before merging (local RTD via docker)

Review improvers

Partial review done, more to come next day

Consider adopting the OSV API as alternative output

Marked as low priority. Let's do it at some later stage.

drf-spectacular vs redoc

redoc licensing is a huge mess due to the webpack (for eg: some licenses are not mentioned or enforced properly, http://tartarus.org/~martin/PorterStemmer/js.txt, feross). A detailed review needs to be done and all licenses should be mentioned and enforced explicitly. We should create a ticket about all unknown/unenforced packges and propagate that to upstream. Ticket: https://github.com/nexB/vulnerablecode/issues/549

Decide on a uniform regular time for meetings

Phlippe is comfortable with - before 3pm CET on Tuesday

Meeting on Tuesday 2021-08-17 at 14:00 UTC

Agenda:

Decide on the structure of improvers

Participants:

@Hritik14
@pombredanne
@sbs2001

Decide on the structure of improvers

The importers would now directly insert the Advisories into the database in the following format. Creating database relationships and populating Vulnerability, Packages, PackageRelatedVulnerabiliy etc would be the job of a default improver.

class Advisory(models.Model):
    date_published = models.DateField()
    date_collected = models.DateField()
    source = models.CharField()
    improved_on = models.DateTimeField()
    # data would contain a data_source.Advisory
    data = JSONField()


@dataclasses.dataclass(order=True)
class AdvisoryData:
    summary: str
    vulnerability_id: Optional[str] = None
    fix_packages: List[AffectedPackage] = dataclasses.field(default_factory=list)
    affected_packages: List[AffectedPackage] = dataclasses.field(default_factory=list)
    references: List[Reference] = dataclasses.field(default_factory=list)


@dataclasses.dataclass(order=True, frozen=True)
class AffectedPackage:
    # this package MUST NOT have a version
    package: PackageURL
    # the version specifier contains the version scheme as is: semver:>=1,3,4
    version_specifier: VersionSpecifier

There would be two kinds of improvers - Default improver - Populates the tables and creates concrete relationships mentioned above - Inference generating improvers - These improvers take metrics like time travel and create new relationships with respective confidence scores.

The Advisory model above must never be written by an improver (as of now). It would be a true upstream data that is populated by the importers and used by the improvers to generate appropriate relationships.

Meeting on Tuesday 2021-08-10 at 14:00 UTC

Agenda:

Deployment
Importer restructure
Hand written migrations (not generated by makemigrations, eg: package)
Dumping import_yielder
Any update on nix tests?

Deployment

Let's take this next week

Importer restructure

PackageRelatedVulnerability - Have data source - could be a text field of one message per line or a JSON field with list of strings showing where it came from - if a new information comes, add that information and then change that confidence, if a new advisory comes which is similar to an improvement then overwrite the improvement confidence to max - add confidence here - Higher confidence overrides lower confidence

Properly comment the fields in a dataclass

Hand written migrations (not generated by `makemigrations`, eg: package)

Not hand written, find in from packageurl.contrib.django.models import PackageURLMixin

Dumping import_yielder

Simply store data source as concrete lists
Don't store configuration or last run, maybe we can have better Log Importer Runs later. Ticket: https://github.com/nexB/vulnerablecode/issues/526

Any update on nix tests?

No

Meeting on Wednesday 2021-07-28 at 14:15 UTC

Participants:

@pombredanne
@Hritik14
@Divya

Agenda:

Refactor / Redesign importers to strictly import
Dump set_api from DataSource
- Knowing all the existing versions is not necessary to collect Advisories
- Strictly no inference
Revisit updated_advisories vs added_advisories
Consistent naming for fixed / vulnerable packages. No other package type should exist in a DataSource
Should a container be used instead of make postgres in scancodeIO ? - https://github.com/nexB/scancode.io/blob/4bebd7ae88ecaa00eb526bd831530c497903faf8/.github/workflows/ci.yml#L15-L28
Should Dockerfiles create virtualenvs as well ?
ScancodeIO uses FROM python:3.9 in Dockerflie - Docker upstream says: This is the defacto image. If you are unsure about what your needs are, you probably want to use this one. It is designed to be used both as a throw away container (mount your source code and start the container to start your app), as well as the base to build other images off of. - Docker upstream also recommends not to use -slim version unless there are space constraints - Docker images are reusable - Using a -slim requires to explicitly install build-essential, libpq-dev, git and in future svn - Most of the aforementioned packages come preinstalled in buster - Drastically increases Docker build time as now our image doesn't reuse already installed PYTHON image

Refactor / Redesign importers to strictly import

Dump set_api from DataSource: Yes. Separate them. We cannot not import a vulnerability when we don't have version info. We need advisories however we get. Clean organization of code in separate modules/functions should be there.

importer_yielder

Dump this. We can have data related to importer inside importers themselves. Problems with import_yielder: - Doesn't make sure importer class is actually a class - No type checking - This is a setting thing. It could appear as a Django setting (perhaps), which would enable/activate importers with fully qualified class names with module prefixes. Let's discuss this later in detail.

Revisit updated_advisories vs added_advisories

Do not return sets in both of the functions (TBD at later stage)
This should be handled by the framework before doing an update to the advisory. Something that could exist at the insertion time of importer (process_advisory).
Incremental update should be attribute of an specific importer which has that capability.
Importers have to return via one single function with the Advisory with some marks if they are new updated advisories. Importers with incremental supports will have different frequency of run, different entry points etc

Consistent naming for fixed / vulnerable packages

Inspired from https://tinyurl.com/vuln-json we'll use affects and fixed.

Should a container be used instead of `make postgres` in scancodeIO

Let's create an issue about control of versions (here postgres version) in both places. Then fix them at a later stage with similar fixes.

Should Dockerfiles create virtualenvs as well ?

Yes. Even ScancodeIO should do that. TODO: Create ticket : https://github.com/nexB/scancode.io/blob/4bebd7ae88ecaa00eb526bd831530c497903faf8/Dockerfile#L52

ScancodeIO uses `FROM python:3.9` in Dockerflie

In CI tests: Be explicit- 3.8.11-buster In Docker: Let's support 3.8 A test for docker-build and run

Meeting on Tuesday 2021-07-27 at 13:00 UTC

Participants:

@pombredanne
@Hritik14

Minutes:

Revert virtualenv caching in GitHub actions
- We don't want any surprises
- Ticket: https://github.com/nexB/vulnerablecode/issues/515
Introduce labels in Dockerfile
- We need to decide on which labels would be there.
- Needs to be done in all AbouteCode projects, currently verified absence in VulnerableCode, ScancodeIO
- Ticket: https://github.com/nexB/vulnerablecode/issues/513
TravisCI
- Get rid of webhooks
- Get rid of TravisCI totally
  - This is redundant after https://github.com/nexB/vulnerablecode/pull/295
Decide on python versions VulnerableCode supports.
- Currently move forward with 3.8.11-slim-buster
- Ticket: https://github.com/nexB/vulnerablecode/issues/512

Next Meeting agenda Wednesday 2021-07-21 at 13:00 UTC

Meeting on Wednesday 2021-07-21 at 13:00 UTC

Participants:

@pombredanne
@Hritik14
@sbs2001
@Divya

The proposed agenda is:

security review status
server / infrastructure
SVN for git tags: which approach or dep using
Zip file in testcases: https://github.com/nexB/vulnerablecode/pull/393#discussion_r668006209
Add API rate limiting and auth https://github.com/nexB/vulnerablecode/issues/460
Revert time traveling - how to go about it ?
In flight PR merge status
nix tests are failing: We should ping for help nix folks

security review status

All tickets are entered

Server / infrastructure

Delayed for now, will deploy on provisioned server. Philippe to work on it sometimes between this and next week
No more Django migration reset

SVN for git tags: which approach or dep using

using `svn --xml ls /tags/ should be enough

Zip file in testcases

These take a lot of space and cannot be diffed. Text files are better. See https://github.com/nexB/vulnerablecode/pull/393#discussion_r668006209 The code in rust and npm importers likely refactoring to use a different approach, but feel free to refactor.

Conclusion: tests and code refactoring may make the zip file usage moot.

Add API rate limiting and auth

https://github.com/nexB/vulnerablecode/issues/460
Using the DRF AnonRateThrottle should be good enough for a start

Revert time traveling - how to go about it ?

Definition: finding the set of versions that existed for a package at a point in the past when the advisory was published.

Advisory without correct/actionable package/version info
- goes into a review queue, or error log
- especially if we are missing which version of a package the bug is fixed in
Advisory with well defined set of impacted and fixed packages
- there no problem and we should import this alright
Advisory with only one of impacted or fixed packages, both well defined (e.g. no open ranges)
- we should import this alright
- Case1: only fix and no impacted:
  - do we want to time travel to find the version(s) that were impacted just before this?
  - there could be version marked as impacted and that's the one prior to the fixed version This may be often ambiguous... and this is inferred: should store this stating this is inferred/concluded and not a hard fact and we should have a log or queue entry so this issue can be revisited.
  - may be best done outside of importers proper, as some batch inference job See https://github.com/nexB/vulnerablecode/issues/500
- Case2: only impacted and no fix:
  - we should import this alright, this is normal
  - we should "time travel" and create relationships with the set of versions that matched the impacted closed/well defined range at
- it is OK to have no fix
- it is OK to have knowledge of when the bug was introduced:
  - do we want to time travel to fin
Advisory with impacted or fixed packages using open ranges

Conclusion of the likely best approach for version ranges

If we have concrete enumerated exact versions, then we can store Vuln<>Package relationship as a fact.
Otherwise this is some version range and its resolution is always an inference.
With time travel we provide a high degree of confidence in the inference, other the inference is weak.
All inferences may need some level of review (guided the confidence of the inference)
Each importer should have an explicit separation between the code to import and the code to resolve version ranges (and time travel).

TODO:

review usage of def test_foo(self, _) with an underscaore in tests See https://github.com/nexB/vulnerablecode/blob/05fcc64f2b48b186b7f742a2adf869b82293b024/vulnerabilities/tests/test_npm.py#L88
review #nopep8 and #NOQA tags
the npm importer design could be improved
we should add template for Github issues

Meeting on Thursday 2021-07-15 at 12:00 UTC

Participants:

@pombredanne
@Hritik14

The proposed agenda is:

documentation of new importers
alpine package versions
sorting imports
issue #494

These two items were not discussed:

security review status
server and infra

Documentation of new importers

we should progressively adopt the same documentation structure as scancode.io Docker files: we should progressively copy the same structure and approach as in scancode.io
Also README.rst needs to be beefed up and

Alpine package versions: notes and TODO

add license to alpine advisory
diff yaml and JSON from alpine secdb
consider using alpine distro main/arch/version in alpine purls
switch to using json fo alpine secbd instead of YAML
Most of https://github.com/nexB/vulnerablecode/pull/436 need to be revisited as we are now losing data
https://github.com/nexB/vulnerablecode/pull/449

TODO

create ticket to refactor importer_yielder.py
revisit time travel. It should not be part of importer but something that is about data improvement, inference and refinements
and something that is about data improvement, inference and refinements should be a separate tool

Pending PRs

Let's close the ones that are not from Hritik14

We pinged:

https://github.com/nexB/vulnerablecode/pull/391 : will need to be closed or finished by us
https://github.com/nexB/vulnerablecode/pull/408 : pinged
https://github.com/nexB/vulnerablecode/pull/400 : pinged
https://github.com/nexB/vulnerablecode/pull/464 : pinged

494 https://github.com/nexB/vulnerablecode/issues/494

This is also related to https://github.com/nexB/vulnerablecode/pull/436 Eventually we need a common way to obtain a range of version for a given importer (or rather for job that's fixing data after imports)

Meeting on Monday 2021-06-14 at 09:00 UTC

Participants:

Shivam @sbs2001
Philippe @pombredanne

Agenda:

Code review of https://github.com/nexB/vulnerablecode/pulls/467 (Shivam)
Integration for significant volume of vulnerability lookups (Shivam)
Getting to clean slate to avoid upcoming merge conflicts (Philippe)

code review of https://github.com/nexB/vulnerablecode/pulls/467

"Time travel" is to find about the range of affected package versions "at the time" of publication of an advisory.

Follow ups:

integration for significant volume of vulnerability lookups

To integrate VulnerableCode (VC) in ScanCode.io and other tools that can make intensive lookups on may packages, we would need to have a way to efficiently partially mirror VC data though API and reuse the VC models on the client application side (when based on the same stack like it is the case with ScanCode.io). Other things to consider:

using pub/sub or webhooks?

We need an issue for this. Design is needed

was not discussed.

MeetingMinutes2021

Meeting on Saturday 2021-12-04 at 10:30 UTC

Improver review

Speeding up the importer-improver structure migration

GitHub externships

Meeting on Thursday 2021-10-14 at 12:45 UTC

Review improvers

Univers

GSoC guest invite - Why do open source

Meeting on Tuesday 2021-10-05 at 07:00 UTC

Hacktoberfest

VulnerableCode

Univers

Meeting on Wednesday 2021-09-18 at 10:45 UTC

Merging new importer-improver model

Preview RTD before merging

Meeting on Friday 2021-09-17 at 09:00 UTC

Review improvers / Advisory data structure

Support http proxies

Meeting on Thursday 2021-09-14 at 14:00 UTC

Nginx version notations

Review improvers - VULCOIDS

Naming: VersionSpecifer and VersionRange

Meeting on Tuesday 2021-09-07 at 14:00 UTC

Review Improvers

Meeting on Thursday 2021-09-02 at 14:00 UTC

Review improvers

Consider adopting the OSV API as alternative output

drf-spectacular vs redoc

Decide on a uniform regular time for meetings

Meeting on Tuesday 2021-08-17 at 14:00 UTC

Decide on the structure of improvers

Meeting on Tuesday 2021-08-10 at 14:00 UTC

Deployment

Importer restructure

Hand written migrations (not generated by makemigrations, eg: package)

Dumping import_yielder

Any update on nix tests?

Meeting on Wednesday 2021-07-28 at 14:15 UTC

Refactor / Redesign importers to strictly import

importer_yielder

Revisit updated_advisories vs added_advisories

Consistent naming for fixed / vulnerable packages

Should a container be used instead of make postgres in scancodeIO

Should Dockerfiles create virtualenvs as well ?

ScancodeIO uses FROM python:3.9 in Dockerflie

Meeting on Tuesday 2021-07-27 at 13:00 UTC

Next Meeting agenda Wednesday 2021-07-21 at 13:00 UTC

Meeting on Wednesday 2021-07-21 at 13:00 UTC

security review status

Server / infrastructure

SVN for git tags: which approach or dep using

Zip file in testcases

Add API rate limiting and auth

Revert time traveling - how to go about it ?

Conclusion of the likely best approach for version ranges

TODO:

Meeting on Thursday 2021-07-15 at 12:00 UTC

Documentation of new importers

Alpine package versions: notes and TODO

TODO

Pending PRs

494 https://github.com/nexB/vulnerablecode/issues/494

Other topics:

Meeting on Monday 2021-06-14 at 09:00 UTC

code review of https://github.com/nexB/vulnerablecode/pulls/467

integration for significant volume of vulnerability lookups

Clone this wiki locally

Hand written migrations (not generated by `makemigrations`, eg: package)

Should a container be used instead of `make postgres` in scancodeIO

ScancodeIO uses `FROM python:3.9` in Dockerflie