Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

View for OpenScanHub results #441

Closed
2 tasks done
Tracked by #2516
lachmanfrantisek opened this issue Aug 21, 2024 · 23 comments
Closed
2 tasks done
Tracked by #2516

View for OpenScanHub results #441

lachmanfrantisek opened this issue Aug 21, 2024 · 23 comments
Assignees
Labels
area/fedora Related to Fedora ecosystem gain/high This brings a lot of value to (not strictly a lot of) users. impact/high This issue impacts multiple/lot of users. kind/feature New feature or a request for enhancement.

Comments

@lachmanfrantisek
Copy link
Member

lachmanfrantisek commented Aug 21, 2024

It would be nice to present people results from the analysis including the relation to other related jobs (like SRPM/RPM) build.

This would most probably require a reporting mechanism to be implemented.
(Currently, Packit only triggers the scan.)

We might also want to think about this as part of Project Mycorrhisa, ping @Venefilyn .

todo

  • add needed information to the service API
  • implement a basic view for the scan
@lachmanfrantisek
Copy link
Member Author

We also need to think about what needs to be displayed on Packit side and what (if) needs to be on OSH only.

@siteshwar
Copy link

siteshwar commented Aug 21, 2024

As mentioned by @kdudka in packit/packit#2371 (reply in thread), this needs to be implemented on the Packit side.

As per my understanding, the workflow should look like:

  • Packit submits the scan to OSH, and waits for the scan to finish.
  • If the scan was successful, download added.js and convert it to SARIF through csgrep --mode=sarif.
  • Upload the SARIF output to CodeQL, or a similar service that can present the results in a user friendly way.
  • Link the user to the report through GitHub actions, or similar CI job.

@davidmalcolm Let us know if you have any suggestions here. Thanks!

EDIT: Fixed name of .js file to download.

@siteshwar
Copy link

We might also want to think about this as part of Project Mycorrhisa, ping @Venefilyn .

As I understand, it is Project Mycorrhiza and its a terrible name.

@lachmanfrantisek
Copy link
Member Author

As I understand, it is Project Mycorrhiza and its a terrible name.

I like the meaning but always struggle with the spelling..;-) Btw. we are collecting use cases people want to see help with here: https://packit.limesurvey.net/project-mycorrhiza

@lachmanfrantisek
Copy link
Member Author

@siteshwar thanks for the task list.

Do you (or anyone else) know if there is a way to get HTML or some other nice visualisation/representation so we can integrate it directly instead of linking an external service? (People do not like these clicking workflows.)

@lachmanfrantisek
Copy link
Member Author

  • Packit submits the scan to OSH, and waits for the scan to finish.

@siteshwar btw, for this, we would appreciate messages coming to the Fedora message bus to avoid busy-wait which is really hard to do with the the current Packit's event-driven architecture. Do you think it's realistic to do?

Related issues about this:

@mfocko mfocko added area/fedora Related to Fedora ecosystem gain/high This brings a lot of value to (not strictly a lot of) users. impact/high This issue impacts multiple/lot of users. kind/feature New feature or a request for enhancement. labels Aug 22, 2024
@mfocko mfocko moved this from new to backlog in Packit Kanban Board Aug 22, 2024
@siteshwar
Copy link

@siteshwar btw, for this, we would appreciate messages coming to the Fedora message bus to avoid busy-wait which is really hard to do with the the current Packit's event-driven architecture. Do you think it's realistic to do?

It is in my todo list. I would get back to it at some point in future.

@davidmalcolm
Copy link

As mentioned by @kdudka in packit/packit#2371 (reply in thread), this needs to be implemented on the Packit side.

As per my understanding, the workflow should look like:

* Packit submits the scan to OSH, and waits for the scan to finish.

* If the scan was successful, download `scan-results.js` and convert it to SARIF through `csgrep --mode=sarif`.

* Upload the SARIF output to CodeQL, or a similar service that can present the results in a user friendly way.

* Link the user to the report through GitHub actions, or similar CI job.

@davidmalcolm Let us know if you have any suggestions here. Thanks!

@siteshwar I confess my knowledge here is mostly on the diagnostics and SARIF generation side, less on filtering, presentation, etc. I haven't played around with CodeQL yet; the pertinent docs would appear to be here. I notice that they are using a few non-standard properties via property bags. FWIW I regularly meet a CodeQL representative at the monthly OASIS SARIF TC meetings.

I'm nervous that going from, say GCC SARIF output to scan-results.js then back to SARIF might lose information. Ideally we'd use SARIF throughout the process to keep as much info as possible (perhaps using property bags if there's anything that OSH wants to capture that isn't yet expressible in SARIF). I'm familiar with a large subset of SARIF 2.1.0, so let me know if you have any questions about it.

@davidmalcolm
Copy link

@siteshwar Also, please note that (with my upstream GCC hat on), I regard GCC's "json" diagnostic output format to be "legacy"; all of my future development for machine-readable diagnostics is on its SARIF output format instead. See https://gcc.gnu.org/wiki/SARIF for details on GCC's SARIF support, bug tracking, etc.

@kdudka
Copy link

kdudka commented Aug 23, 2024

@davidmalcolm OSH still captures the plain-text format of GCC diagnostics. Although csdiff supports all the three formats (plain-text, JSON, SARIF), it is not easy to migrate csmock-plugin-gcc to the newer diagnostic formats. As it is implemented now, the diagnostic output from all GCC processes is written to a single text file. This works well for the plain-text format but not for the JSON-based formats because concatenation of multiple JSON files is not a valid JSON.

Can GCC write a separate SARIF file to a specified directory for each invocation? Something like valgrind --xml-file=${dir}/%p-%n.xml ...? I think something like that would ease the transition of OSH to GCC's SARIF output...

@davidmalcolm
Copy link

Can GCC write a separate SARIF file to a specified directory for each invocation? Something like valgrind --xml-file=${dir}/%p-%n.xml ...? I think something like that would ease the transition of OSH to GCC's SARIF output...

@kdudka Sadly not. I had thought that it respected the dumpfile command-line arguments, but it turns out it doesn't. Sorry about that. See GCC bug 110522 for more info, and ideas on fixing this.

Would being able to specify an output filename for the .sarif file on the gcc command line be enough?

BTW, which version of GCC are you using?

@siteshwar
Copy link

siteshwar commented Aug 28, 2024

BTW, which version of GCC are you using?

The scans that are run through Packit are run on rawhide, so we use the same version of gcc as available in rawhide. Currently it is 14.1.1-7.fc41, it can be actually seen in the stdout.log of any scan.

@kdudka
Copy link

kdudka commented Aug 28, 2024

Would being able to specify an output filename for the .sarif file on the gcc command line be enough?

Yes. If GCC was able to write a SARIF file with a unique name in a specified directory (like valgrind --xml-file=${dir}/%p-%n.xml ... does), it would be much easier to build automation on top of it. Nevertheless, if GCC is able to write the SARIF file to a specified path, we should be able to script around it with flock and mktemp.

Also does GCC always provide absolute paths to source code files in the SARIF data? Relative paths to source code files are currently translated by the compiler wrapper that processes the diagnostic messages printed by GCC to stderr.

BTW, which version of GCC are you using?

OSH uses the GCC version that is available in the buildroot. So, as @siteshwar says, it will be the Fedora Rawhide version by default for OSH tasks triggered by Packit. OSH needs to work with GCC versions available in EPEL buildroots, too, but csmock-plugin-gcc can probe for the needed features in GCC and eventually fallback to the plain-text capture that it uses today.

csutils/csmock#41 introduced an option to use a custom version of GCC Analyzer, which can be different from the GCC version that drives the native build (as long as these GCC versions can be installed in parallel into a single buildroot).

@siteshwar
Copy link

  • Upload the SARIF output to CodeQL, or a similar service that can present the results in a user friendly way.

There is a limit of 10 MB on the results file. The SARIF file should be compressed before being uploaded.

@Venefilyn
Copy link
Collaborator

Venefilyn commented Sep 20, 2024

  • Upload the SARIF output to CodeQL, or a similar service that can present the results in a user friendly way.

I think outputting it with CodeQL would be great to keep things within GitHub. As for the dashboard we could make use of the SARIF output as well, for that I ask that packit-service can indicate in a response that a SARIF file exists or is being generated, and create a new endpoint to return the SARIF results

Given SARIF has a schema it should be relatively trivial to handle it in the dashboard as we can just install the types or generate the types to make things easier for us through something like https://www.npmjs.com/package/@types/sarif/v/2.1.5

We might also want to think about this as part of Project Mycorrhisa, ping @Venefilyn .

As I understand, it is Project Mycorrhiza and its a terrible name.

It's a project/WG name for an idea! It's a good name 😉

@lachmanfrantisek lachmanfrantisek moved this from backlog to priority-backlog in Packit Kanban Board Oct 10, 2024
@lachmanfrantisek lachmanfrantisek moved this from priority-backlog to refined in Packit Kanban Board Oct 10, 2024
@lbarcziova lbarcziova self-assigned this Oct 28, 2024
@lbarcziova lbarcziova moved this from refined to in-progress in Packit Kanban Board Oct 28, 2024
@lbarcziova
Copy link
Member

I have started looking into the implementation details of this and have a few things to discuss/confirm first:

  • is it ok if we focus on and display only the newly found defects, i.e. added.js? (there is also the general results file and fixed defects)
  • before implementing the conversion to SARIF on our (Packit) side, I wanted to confirm whether this functionality doesn't generally make more sense to be in OSH directly, could other OSH users benefit from this as well?
  • displaying of the SARIF on Packit's side: I agree with @Venefilyn point, Packit isn't serving GitHub only (even though most of our users are here), so it would be nice to have a general solution for displaying the SARIF on our dashboard, I haven't looked into the implementation details of that yet. But I wanted to confirm we want to have 2 ways of displaying the results - both on dashboard and in GitHub UI. GitHub's functionality for this looks really nice and for GitHub users would be great, from what I checked we would upload the SARIF using this endpoint and for that we would need to update permissions for our app regarding Code scanning alerts (see docs).

@siteshwar
Copy link

I have started looking into the implementation details of this and have a few things to discuss/confirm first:

  • is it ok if we focus on and display only the newly found defects, i.e. added.js? (there is also the general results file and fixed defects)

Yes, we should just point out new findings.

  • before implementing the conversion to SARIF on our (Packit) side, I wanted to confirm whether this functionality doesn't generally make more sense to be in OSH directly, could other OSH users benefit from this as well?

@kdudka Shall this be done from OSH side?

  • displaying of the SARIF on Packit's side: I agree with @Venefilyn point, Packit isn't serving GitHub only (even though most of our users are here), so it would be nice to have a general solution for displaying the SARIF on our dashboard, I haven't looked into the implementation details of that yet. But I wanted to confirm we want to have 2 ways of displaying the results - both on dashboard and in GitHub UI. GitHub's functionality for this looks really nice and for GitHub users would be great, from what I checked we would upload the SARIF using this endpoint and for that we would need to update permissions for our app regarding Code scanning alerts (see docs).

I am fine with using Packit dashboard, as long as it enforces an action from users on new findings.

@lbarcziova lbarcziova added the discuss discuss To be discussed within a team (usually on the so-called Architecture meeting next Thursday) label Oct 30, 2024
@lbarcziova
Copy link
Member

Outcomes from our arch meeting: we will start for now with just a simple view in our dashboard (=> no need to store the SARIF at our DB for now) and try to integrate the GitHub SARIF support. GitLab doesn't seem to have something similarly straightforward, but I will have a look into possible alternatives.

@lbarcziova lbarcziova removed the discuss discuss To be discussed within a team (usually on the so-called Architecture meeting next Thursday) label Oct 31, 2024
lbarcziova added a commit to lbarcziova/packit-service that referenced this issue Oct 31, 2024
@kdudka
Copy link

kdudka commented Nov 5, 2024

before implementing the conversion to SARIF on our (Packit) side, I wanted to confirm whether this functionality doesn't generally make more sense to be in OSH directly, could other OSH users benefit from this as well?

@kdudka Shall this be done from OSH side?

@siteshwar This is being discussed repeatedly. Extending csmock to (optionally) produce a SARIF file for each scan is trivial but it will significantly increase the storage size needed for scan results. Another solution would be to produce SARIF files per request dynamically server-side, either from the simple JSON-encoded scan results, or from compressed SARIF files.

lbarcziova added a commit to lbarcziova/packit-service that referenced this issue Nov 7, 2024
lbarcziova added a commit to lbarcziova/packit-service that referenced this issue Nov 8, 2024
lbarcziova added a commit to lbarcziova/dashboard that referenced this issue Nov 11, 2024
github-merge-queue bot pushed a commit that referenced this issue Nov 11, 2024
@Venefilyn
Copy link
Collaborator

For the osh API, can we get access-control-allow-origin enabled so we can embed the results?

@lbarcziova
Copy link
Member

Outcomes from our today's sync with @lachmanfrantisek @siteshwar and @kdudka: uploading SARIFs to GitHub currently doesn't provide the additional benefit of code annotation as SRPM and RPM building changes the source code so the mapping back to get from the analysis result code placement to the source code would need to be implemented, which seems like a non-trivial thing, an example of this (using code from packit/packit-service#2621):

Snímka obrazovky z 2024-11-07 11-48-06

With packit/packit-service#2622 we have improved the reporting now so that we report directly in GitHub whether/how many of new defects were identified. There is also a view in our dashboard now.

mrc0mmand added a commit to mrc0mmand/polkit that referenced this issue Nov 13, 2024
Let's do a Packit build on "commit" events as well, which automagically
enables Packit's OpenScanHub integration [0].

Since OSH runs only on x86_64 Fedora Rawhide builds, let's limit the
scope for the "commit" event adequately; we still do the usual builds
across all supported Fedora releases (and architectures) in PRs.

The OSH integration is still in its experimental stage, but there's a
plan to add SARIF support and integrate it into GitHub's UI (see [1]),
which should make it quite useful.

[0] https://packit.dev/posts/openscanhub-prototype
[1] packit/dashboard#441
mrc0mmand added a commit to mrc0mmand/polkit that referenced this issue Nov 13, 2024
Let's do a Packit build on "commit" events as well, which automagically
enables Packit's OpenScanHub integration [0].

Since OSH runs only on x86_64 Fedora Rawhide builds, let's limit the
scope for the "commit" event adequately; we still do the usual builds
across all supported Fedora releases (and architectures) in PRs.

The OSH integration is still in its experimental stage, but there's a
plan to add SARIF support and integrate it into GitHub's UI (see [1]),
which should make it quite useful.

[0] https://packit.dev/posts/openscanhub-prototype
[1] packit/dashboard#441
jrybar-rh pushed a commit to polkit-org/polkit that referenced this issue Nov 14, 2024
Let's do a Packit build on "commit" events as well, which automagically
enables Packit's OpenScanHub integration [0].

Since OSH runs only on x86_64 Fedora Rawhide builds, let's limit the
scope for the "commit" event adequately; we still do the usual builds
across all supported Fedora releases (and architectures) in PRs.

The OSH integration is still in its experimental stage, but there's a
plan to add SARIF support and integrate it into GitHub's UI (see [1]),
which should make it quite useful.

[0] https://packit.dev/posts/openscanhub-prototype
[1] packit/dashboard#441
@Venefilyn
Copy link
Collaborator

For the osh API, can we get access-control-allow-origin enabled so we can embed the results?

Also for Packit service it would be nice to also return created, started, and finished times as well and not just submitted time

@lbarcziova
Copy link
Member

@Venefilyn yes those are good areas for improvement, since this issue was meant to cover the initial version, I am going to close it (with regards to #441 (comment)) and we can create follow-up smaller ones as needed.

@github-project-automation github-project-automation bot moved this from in-progress to done in Packit Kanban Board Nov 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/fedora Related to Fedora ecosystem gain/high This brings a lot of value to (not strictly a lot of) users. impact/high This issue impacts multiple/lot of users. kind/feature New feature or a request for enhancement.
Projects
Archived in project
Development

No branches or pull requests

7 participants