-
Notifications
You must be signed in to change notification settings - Fork 644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Vulnerabilites]Add IVulnerabilityWriter #9712
Conversation
// Licensed under the Apache License, Version 2.0. See License.txt in the project root for license information. | ||
|
||
using Microsoft.Extensions.Logging; | ||
using NuGet.Services.Entities; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: order of dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved System above to be consistent with other files.
{ | ||
public interface IVulnerabilityWriter | ||
{ | ||
Task<bool> WriteVulnerabilityAsync(PackageVulnerability vulnerability, bool wasWithdrawn = false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add the description of an interface.
public class Advisory | ||
{ | ||
|
||
[JsonProperty(PropertyName = "url")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: empty line above.
public async void FlushAsync(string outputFileName = null) | ||
{ | ||
// This method is unused for this implementation. | ||
throw new NotImplementedException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will this method be used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method will be for when we aren't directly writing to the store location, which isn't the case for DB (DB implementation writes to DB on every call of WriteVulnerability).
The implementation planned for the v3 resource will internerally track all added vulnerabilities, then call Flush to write it out the final location (blob storage).
In short, IVulnerabilityWriter is treated sort of as a local cache of expected changes, and we call Flush to commit them. I didn't want to change DB writer behaviour, so this paradigm is ignored for DB implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: As mentioned in #9712 (comment) I have updated this to no-op instead.
src/GitHubVulnerabilities2Db/Job.cs
Outdated
@@ -66,6 +66,10 @@ protected void ConfigureIngestionServices(ContainerBuilder containerBuilder) | |||
.RegisterType<PackageVulnerabilitiesManagementService>() | |||
.As<IPackageVulnerabilitiesManagementService>(); | |||
|
|||
containerBuilder.RegisterType<GalleryDbVulnerabilityWriter>() | |||
.As<IVulnerabilityWriter>() | |||
.SingleInstance(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a little concerned about whether it is safe to use singleton here or whether there is a need to do so. Is there any consideration to use singleton?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The GalleryDbVulnerabilityWriter is a very simple passthrough implementation.
Since the PacakgeVulnerabilitiesManagementService above it that it passes through to is not single instance, we can probably drop it here. Good catch.
/// Commit changes to source | ||
/// </summary> | ||
/// <param name="outputFileName">Optional parameter to output to disk.</param> | ||
void FlushAsync(string outputFileName = null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
try | ||
{ | ||
await _packageVulnerabilityService.UpdateVulnerabilityAsync(vulnerability, wasWithdrawn); | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. I'll look in to this to find out if there was a reason I made it return a bool in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't seem like I can think of a good reason that it returned bool here as it is unused. I'll wait on making changes for now to see if anybody else has any opinions on this, and then I'll change it.
} | ||
|
||
public async Task<int> WriteVulnerabilitiesAsync(IEnumerable<PackageVulnerability> vulnerabilities) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that this method is not called by 2Db job. Do we need the implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great point.
It is a trivial implementation, so we can probably drop it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking about it a bit, I would like to leave this here in case the library changes more to call these methods on this interface. To this point, I have also updated the throwing implementation above to trivially do nothing and return.
|
||
namespace GitHubVulnerabilities2Db.Gallery | ||
{ | ||
public class GalleryDbVulnerabilityWriter : IVulnerabilityWriter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding test coverage for this class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test coverage was effectively already added. The existing tests call through this class now.
} | ||
catch (Exception ex) | ||
{ | ||
_logger.LogError("Failed to write vulnerability: {ErrorMessage}", ex.Message); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is better to enrich this exception message, with the identity of a vulnerability like "GitHubDatabaseKey".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a new error logging line here that logs the github Advisory URL that failed.
|
||
namespace NuGet.Services.GitHub.Entities | ||
{ | ||
public class IndexEntry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one and "Advisory" are not used in this PR. This "IndexEntry" seems matching the data schema in vulnerability documents. Will it be better that we place these data entities next to the new 2Blob job, rather than put it in the shared layer that is also called by 2Db job?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. That's a good point. I'll pull them out of this PR. Might be a leftover from when Writer implementations were inside this library instead of the responsibility of the caller.
|
||
foreach(var vulnerability in vulnerabilities) | ||
{ | ||
if (await WriteVulnerabilityAsync(vulnerability)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the PackageVulnerability Entity that is used to pass around here doesn't contain that information. Let me see how feasible it is to create a wrapper class that splits into the NuGet.Entity class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the API to take a list of Tuples instead of the single PacakgeVulnerability IEnumerable.
This should enable us to pass through the wasWithdrawn.
@@ -4,6 +4,7 @@ | |||
using System; | |||
using System.Linq; | |||
using System.Threading.Tasks; | |||
using GitHubVulnerabilities2Db.Gallery; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: order of dependencies.
catch (Exception ex) | ||
{ | ||
_logger.LogError("Failed to write vulnerability {GitHubAdvisoryUrl}", vulnerability.AdvisoryUrl); | ||
_logger.LogError("Error Message: {ExceptionMessage}", ex.Message); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we making 2 separate error logs for same exception?
Usually we just log once per exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logging 2 lines splits the lines in AI. I figured that having both information was useful, but since we are rethrowing the exception below, we can probably skip the log of the Exception message as it should show up in logs as exception. I'll take it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add link to tracking issue and spec in description?
And give little more description of exactly what part of spec being implemented here.
I have included a link the Spec for Vulnerabilities in V3 to the description. |
{ | ||
var dbVulnerability = vulnerability.Item1; | ||
var wasWithdrawn = vulnerability.Item2; | ||
await WriteVulnerabilityAsync(dbVulnerability, wasWithdrawn); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation seems to be inserting vulnerabilities one by one. I have two concerns about that: if we are going to be inserting rows in SQL DB, inserting single row at a time is considerably slower than inserting a batch of rows in a single statement/transaction (it might not matter though if the number of rows inserted is low). Second is about failure recovery: what happens if it starts failing after writing some vulnerabilities?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You concern is totally valid, but this is how the current implementation works.
I think if it fails after writing some, it should be ok as worst case, it should just rewrite/modify the existing vulnerabilities.
But I think fixing this at this time is beyond the scope of this PR.
I am again making minimal changes to implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is actually not used anywhere, is it? If so, maybe just throw NotImplementedException()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the very least, please file a work item to update IPackageVulnerabilitiesManagementService
to allow batch updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ryuyu
Have you filed work item for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public async Task FlushAsync(string outputFileName = null) | ||
{ | ||
// This method is unused for this implementation. | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe throwing NotImplementedException() is better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I debated this a lot internally, but with @agr 's suggestion at #9712 (comment), I think we will need a non throwing implementation.
Thanks for addressing the feedback! I suggest throwing NotImplementedException() for unused implementations. |
@@ -37,7 +37,7 @@ public async Task IngestAsync(IReadOnlyList<SecurityAdvisory> advisories) | |||
var vulnerabilityTuple = FromAdvisory(advisory); | |||
var vulnerability = vulnerabilityTuple.Item1; | |||
var wasWithdrawn = vulnerabilityTuple.Item2; | |||
await _packageVulnerabilityService.UpdateVulnerabilityAsync(vulnerability, wasWithdrawn); | |||
await _vulnerabilityWriter.WriteVulnerabilityAsync(vulnerability, wasWithdrawn); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should Flush
be called after all writes are done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good call. I'll add a call to flush here after so we don't need to make this change later. Note @zhhyu that this change will require a non throwing implementation of flush for the GalleryDbWriter.
Update the Github Vulnerability interaction library to support IVulnerabilityWriters for writing vulnerabilities.
This is in preparation for use by the new v3 vulnerability resource updater.
Spec for Vulnerabilities in V3: https://github.com/NuGet/Engineering/pull/4940