Skip to content

Conversation

@mbg
Copy link
Member

@mbg mbg commented Oct 21, 2025

This PR is currently based on #3206, which should be merged first.

Conceptually, this PR follows on from #3097 to allow processed SARIF files to be written to disk, if required.

This is accomplished by:

  • Extracting the post-processing and uploading parts of uploadSpecifiedFiles into separate functions.
  • Replacing the call to uploadSpecifiedFiles in uploadSarif with calls to the new postProcessSarifFiles and uploadProcessedFiles.
  • Moving the UploadKind check into uploadSarif and making the call to uploadProcessedFiles conditional on it. Important: This is a change in behaviour, which means that we now always post-process SARIF files, even if upload is not always.
  • Adding a new post-process-output input to the analyze action which, if set, causes writeProcessedFiles to write the processed SARIF files to disk between the calls to postProcessSarifFiles and uploadProcessedFiles.

Risk assessment

For internal use only. Please select the risk level of this change:

  • High risk: Changes are not fully under feature flags, have limited visibility and/or cannot be tested outside of production.

Which use cases does this change impact?

  • Advanced setup - Impacts users who have custom workflows.
  • Default setup - Impacts users who use default setup.
  • Code Scanning - Impacts Code Scanning (i.e. analysis-kinds: code-scanning).
  • Code Quality - Impacts Code Quality (i.e. analysis-kinds: code-quality).
  • Third-party analyses - Impacts third-party analyses (i.e. upload-sarif).
  • GHES - Impacts GitHub Enterprise Server.

How did/will you validate this change?

  • Unit tests - I am depending on unit test coverage (i.e. tests in .test.ts files).
  • End-to-end tests - I am depending on PR checks (i.e. tests in pr-checks).

If something goes wrong after this change is released, what are the mitigation and rollback strategies?

  • Rollback - Change can only be disabled by rolling back the release or releasing a new version with a fix.

How will you know if something goes wrong after this change is released?

  • Telemetry - I rely on existing telemetry or have made changes to the telemetry.
    • Dashboards - I will watch relevant dashboards for issues after the release. Consider whether this requires this change to be released at a particular time rather than as part of a regular release.
    • Alerts - New or existing monitors will trip if something goes wrong with this change.

Merge / deployment checklist

  • Confirm this change is backwards compatible with existing workflows.
  • Consider adding a changelog entry for this change.
  • Confirm the readme and docs have been updated if necessary.

@mbg mbg changed the title Refactor SARIF post-processing Perform SARIF post-processing independently of upload Oct 21, 2025
@mbg mbg force-pushed the mbg/upload-lib/post-process branch from d50f5d8 to a457f8e Compare October 22, 2025 00:14
@github-actions github-actions bot added the size/L May be hard to review label Oct 22, 2025
@mbg mbg marked this pull request as ready for review October 22, 2025 00:22
@mbg mbg requested a review from a team as a code owner October 22, 2025 00:22
@Copilot Copilot AI review requested due to automatic review settings October 22, 2025 00:22
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the SARIF upload process to separate post-processing from uploading, allowing processed SARIF files to be written to disk independently of whether they are uploaded. The key change is that SARIF files are now always post-processed regardless of the upload input value, with the actual upload being conditional on uploadKind === "always". A new post-process-output input enables users to specify a directory for writing processed SARIF files.

Key Changes

  • Extracted post-processing logic from uploadSpecifiedFiles into separate postProcessSarifFiles and uploadProcessedFiles functions
  • Modified uploadSarif to conditionally upload based on uploadKind parameter while always performing post-processing
  • Added new post-process-output input to the analyze action for specifying where to write processed SARIF files

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/upload-sarif.ts Main logic change: splits processing and uploading, adds conditional upload based on uploadKind
src/upload-lib.ts Refactors uploadSpecifiedFiles into postProcessSarifFiles, writeProcessedFiles, and uploadProcessedFiles
src/upload-sarif.test.ts Updates tests to mock new function signatures and adds tests for new behavior
src/analyze-action.ts Updates to pass uploadKind and post-process-output parameters to uploadSarif
src/upload-sarif-action.ts Updates to pass uploadKind parameter (hardcoded to "always")
analyze/action.yml Adds documentation for new post-process-output input parameter
pr-checks/checks/quality-queries.yml Adds test case for the new post-process-output functionality
lib/*.js Generated JavaScript files mirroring TypeScript changes

@esbena esbena self-requested a review October 22, 2025 09:17
Copy link
Contributor

@esbena esbena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for splitting into individual commits.
I only have some wording comments, none are hard blockers.

description: Whether to upload the resulting CodeQL database
required: false
default: "true"
post-process-output:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the order of these input declarations chosen strongly based on discoverability for the user? If not, then I'd like this to be right next to the output input declared near the top in the name of coherence.

Second, and I'm sorry if this naming has been discussed at length in related PRs earlier: post-process-output reads like a boolean toggle, not a path. post-processed-output sounds more like a path. Additionally we always do the post-processing regardless of wether we save them to disk..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

post-processed-output sounds more like a path.

I agree this would be a better name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second, and I'm sorry if this naming has been discussed at length in related PRs earlier: post-process-output reads like a boolean toggle, not a path. post-processed-output sounds more like a path. Additionally we always do the post-processing regardless of wether we save them to disk..

Ironically, I think this kind of bad naming is actually something I complained about on @redsun82's PR previously, which he then changed to something better. I agree that this isn't a good name and have changed it to processed-sarif-path for now. That makes it consistent with other path inputs (other than output here). I don't think think that post-processed-output is really any better than post-process-output.

Is the order of these input declarations chosen strongly based on discoverability for the user? If not, then I'd like this to be right next to the output input declared near the top in the name of coherence.

Generally, I think the order is roughly based on how relevant the input is to a user. (See e.g. that we have the ones that are automatically populated or used internally only at the bottom.)

Copy link
Contributor

@henrymercer henrymercer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the classification as high risk, have you considered manually comparing CodeQL and post-processed SARIF output before/after this PR to make sure there are no changes?

description: Whether to upload the resulting CodeQL database
required: false
default: "true"
post-process-output:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

post-processed-output sounds more like a path.

I agree this would be a better name.

Base automatically changed from mbg/analyze/use-upload-sarif to main October 22, 2025 16:45
@mbg mbg force-pushed the mbg/upload-lib/post-process branch from a457f8e to 8ff870a Compare October 22, 2025 18:17
@mbg mbg requested a review from henrymercer October 24, 2025 09:31
henrymercer
henrymercer previously approved these changes Oct 24, 2025
@mbg mbg enabled auto-merge October 24, 2025 13:46
@mbg mbg merged commit d75645b into main Oct 24, 2025
243 checks passed
@mbg mbg deleted the mbg/upload-lib/post-process branch October 24, 2025 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L May be hard to review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants