Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add git metadata to monitor payload #462

Merged
merged 1 commit into from
May 7, 2019

Conversation

odinn1984
Copy link
Contributor

  • Ready for review
  • Follows CONTRIBUTING rules
  • Reviewed by Snyk internal team

What does this PR do?

We are adding git metadata to monitor when sending the payload to snyk.io for grouping CLI projects

};
}

function findGitRoot(): string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please consider git rev-parse --show-toplevel

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

analytics.add('packageManager', packageManager);

const gitTarget = await projectMetadata.getInfo('git', root, targetFile) as GitInfo;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this targetfile ever be different to the https://github.com/snyk/snyk/pull/462/files#diff-e2e0f6cf5732ba336dc3ba0de17258faR88?

Will it matter? Not all plugins send targetFile and some choose what is best to process in the end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lili2311 the difference here would be that this is the relative path to the target file (relative to the git project root) and we pass it to monitor later in the chain. i think it would make more sense to present the file in this way in snyk.io. wdyt? basically meaning that if u have a monorepo for example we will save the target file as the relative path from the top level dir to the file even if ur running it inside of a sub dir

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when a user runs a snyk test with a file ../../somefile? After checking analytics this happens today!

@odinn1984 odinn1984 force-pushed the feat/send-gitinfo-on-monitor branch from 54d5719 to 803a28f Compare April 30, 2019 14:19
gjvis
gjvis previously requested changes Apr 30, 2019
Copy link
Contributor

@gjvis gjvis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline, plus one general comment:

whatever we do here needs to be very easy to work into test and protect as well, as this data will be used very soon as part of the "locate the relevant project" code server-side

@@ -0,0 +1,7 @@
export interface GitInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It think this should be a GitTarget, and should not include commitSha or targetFile, as those aren't part of the identify where the canonical place the source code lives.

@@ -0,0 +1,7 @@
export interface GitInfo {
repo: string;
owner: string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

owner is a GitHub specific thing, git itself does not have this concept.

@@ -0,0 +1,7 @@
export interface GitInfo {
repo: string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps remoteUrl would be better?

also, we need to normalise this, e.g. always using the ssh url, so maybe "remoteSSHUrl` or similar (depending on how we choose to normalise it)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we do a remote url (which was origin URL in michael's original PR) it means that we're leaving the parsing part to registry to build the target... or we can just use the URL somehow to represent the project at registry?

import * as gitMetadata from './git';
import { GitInfo } from './types';

export async function getInfo(metadataSource: string, root: string, targetFile: string): Promise<GitInfo|null> {
Copy link
Contributor

@gjvis gjvis Apr 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I think this needs to return just the target, and potentially some monitor meta (although that happens elsewhere already, right?)

  2. Its weird that this component has to be told what the metadataSource is, surely this needs to happen before we get to the command and work out based on the CLI args, the env, the process.pwd(), etc. what the metadataSource and target are?

  3. Related to 2, how would we work command-line based overrides into this structure? E.g. a git remote-url specified on the command line would still need to be normalised, which is going to end up in a completely different place to this? If the broad boundary is "CLI-args, env, process.pwd() as input -> target as output" then command line overrides would be possible too

  4. We also need to decide how we cope with situations where the "root" directory here changes based on us being able to inspect the git repo metadata, right? That change (and any resulting impact on any --file or similar argument) needs to be propagated to the plugin somehow

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. yes the monitor is responsible for adding the monitor data, I wanted this factory to return only target related data
    2 and 3. so would it make sense to do something like... does a git command work then it's git if not try to see if we have a docker argument etc?
  2. if the root dir changes that's fine no since we're just using it to get the relative path to the target file?

analytics.add('packageManager', packageManager);

const gitTarget = await projectMetadata.getInfo('git', root, targetFile) as GitInfo;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see other comment, but commenting here too: how does this know already that this is git?

Copy link
Contributor

@gjvis gjvis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting to put this after the plugins have done their thing, seems a better direction.

Another question: should we make sure that the targetFile we're passing up is relative to the root directory of the git repo, if there is one?

(@lili2311 your thoughts on this seem relevant too given your familiarity with the CLI and targetFiles, etc.?).

@@ -74,6 +81,7 @@ export function monitor(root, meta, info: SingleDepRootResult): Promise<any> {
dockerImageId: pluginMeta.dockerImageId,
dockerBaseImage: pkg.docker ? pkg.docker.baseImage : undefined,
projectName: meta['project-name'],
projectTarget,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be alongside targetFile, in the top-level body, and called plain old target (to closely match the agent-side of things)

user: string;
project: string;
branch: string;
}
Copy link
Contributor

@gjvis gjvis May 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is still too hosted-git centric. A good test case (direct from https://git-scm.com/) is user@server:path/to/repo.git -> that should be easy to unambiguously represent in this GitTarget type

noticed that the hosted-git-info only supports things cloned from GitHub, GitLab, and Bitbucket so its not really appropriate here. Perhaps https://www.npmjs.com/package/git-url-parse is better? Or something else (I only searched for 2 minutes... 😂)

Maybe an https representation generated by git-url-parse is the most appropriate? there no perfect right answer here :'(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regarding your other comment about registry needing to do processing - I don't follow? As long as the targets are both standardised and hold the git upstream repo details then we can both group by targets and support the future desire to match up CLI with hosted source code projects in some way.

what am I missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that's a good point... I think I was overcomplicating things here because my head was too hosted solution oriented...

return null;
} else if ((await subProcess.execute('git', ['remote', 'get-url', 'origin'])).trim()) {
return gitMetadata.getInfo();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is crossing into the specific-code territory, but while i'm here:

const TARGET_BUILDER = [gitMetadata, dockerImageMetadata];

// ...
for (const targetBuilder in TARGET_BUILDERS) {
  const target = targetBuilder.getInfo(packageInfo);
  if (target) {
    return target;
  }
}

an implementation like that lets you add new target builders (or some other term, not precious about the name) easily.

They return a target, or undefined if they can't, and you get to keep all target-specific knowledge out of this file, its responsibility becomes simply the registration and order-of-preference for each target builder.

@gjvis gjvis dismissed their stale review May 1, 2019 17:14

Comments addressed, discussion has moved on

@lili2311
Copy link
Contributor

lili2311 commented May 2, 2019

@gjvis @odinn1984

The plugin will always return a file that was used in the end to generate the Tree, the plugin has better understanding to do this.

However not all plugins return it relevant to the git root of course.

I think if we for now send what you propose we can then check the data for discrepancies. For example some users run test via snyk test ../../somefile What will this translate to in the git target?

Do we have a test for this?

@odinn1984
Copy link
Contributor Author

@lili2311 this would be translated to the relative path from the root if we chose to send that, the question here is if it's useful to send that as well.

We're not in the tests stage yet, I will add tests once all the questions are answered :)

@odinn1984 odinn1984 force-pushed the feat/send-gitinfo-on-monitor branch 3 times, most recently from 62dc4d2 to c7920d3 Compare May 6, 2019 13:14
import subProcess = require('../../sub-process');
import { GitTarget } from '../types';

/* tslint:disable:no-unused-variable */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this necessary?

branch,
};
} catch (err) {
return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we handle it differently? maybe log reason for erroring

@odinn1984 odinn1984 closed this May 6, 2019
@odinn1984 odinn1984 reopened this May 6, 2019
@odinn1984 odinn1984 closed this May 6, 2019
@odinn1984 odinn1984 reopened this May 6, 2019
@odinn1984 odinn1984 force-pushed the feat/send-gitinfo-on-monitor branch 4 times, most recently from ef74bd0 to 368329f Compare May 7, 2019 07:40
@odinn1984 odinn1984 requested a review from gjvis May 7, 2019 08:22
@odinn1984 odinn1984 force-pushed the feat/send-gitinfo-on-monitor branch from 368329f to 868a907 Compare May 7, 2019 12:45
@@ -52,8 +56,12 @@ export function monitor(root, meta, info: SingleDepRootResult): Promise<any> {
return snyk.policy.create();
}
return snyk.policy.load(policyLocations, opts);
}).then((policy) => {
}).then(async (policy) => {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why add async

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is returning a promise not using async await

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonathan-fielding if you look a few lines below I'm using await there, an async function is a promise behind the scenes so this still holds true and works fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants