Description
🏥 CI Failure Investigation - Run #22185894213
Summary
`Integration: Workflow Permissions` completed its tests, but `actions/upload-artifact` could not finalize `test-result-integration-Workflow Permissions`, and the JSON artifact never landed because Azure returned HTTP 403. `canary_go` relies on those integration artifacts, so `scripts/compare-test-coverage.sh` reported five Workflow Permissions tests as unexecuted and failed with exit code 1 once the artifact went missing.
Failure Details
- Run: 22185894213
- Commit: 1b5447d
- Trigger: push
Root Cause Analysis
Integration: Workflow Permissions uploaded test-result-integration-Workflow Permissions but the final actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 step failed with Failed to FinalizeArtifact: Received non-retryable error: Failed request: (403) Forbidden. Because the artifact never became available, canary_go’s coverage comparator saw the five Workflow Permissions tests listed in all-tests.txt but not in executed-tests.txt and aborted even though the tests themselves had already passed.
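The comparison described above amounts to a set difference between the two test lists. The following is only a minimal sketch of that idea, not the contents of `scripts/compare-test-coverage.sh` (which is not shown in this report); the function name `missing_tests` is hypothetical, and it assumes both files hold one test name per line.

```shell
# Hypothetical sketch: print every test named in the first file
# that does not appear in the second file.
# Assumes one test name per line in both files.
missing_tests() (
  all_sorted=$(mktemp) || exit 2
  exec_sorted=$(mktemp) || exit 2
  sort -u "$1" > "$all_sorted"
  sort -u "$2" > "$exec_sorted"
  # comm -23 keeps only the lines unique to the first sorted input.
  comm -23 "$all_sorted" "$exec_sorted"
  rm -f "$all_sorted" "$exec_sorted"
)

# Usage: missing_tests all-tests.txt executed-tests.txt
```

Under this model, a test whose result artifact never arrives is indistinguishable from a test that was never run, which is why the five tests were flagged even though they had passed.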
Reproduction Steps
- Push any change so the CI run exercises `Integration: Workflow Permissions` and the downstream `canary_go` coverage check.
- Observe the upload of `test-result-integration-Workflow Permissions` (`test-result-integration-*.json`) succeed but then hit HTTP 403 when finalizing the artifact.
- `canary_go` runs `scripts/compare-test-coverage.sh all-tests.txt executed-tests.txt`, sees the five Workflow Permissions tests missing, and fails with `❌ FAILURE: Found 5 tests that are NOT being executed in CI` plus `##[error]Process completed with exit code 1`.
Failed Jobs and Errors
- Integration: Workflow Permissions – the `actions/upload-artifact` step for `test-result-integration-Workflow Permissions` ended with `Failed to FinalizeArtifact: Received non-retryable error: Failed request: (403) Forbidden` after the ZIP upload finished.
- canary_go – `scripts/compare-test-coverage.sh all-tests.txt executed-tests.txt` listed `TestCollectPackagesFromWorkflow`, `TestPermissionsImportIntegration`, `TestPermissionsShortcutInIncludedFiles`, `TestPermissionsShortcutMixedUsage`, and `TestPermissionsWarningInNonStrictMode` as missing because the integration artifact was never present, and the job exited 1.
Investigation Findings
- The integration job itself completed its tests and uploaded a single `test-result-integration-*.json` file, so the failure only occurs in the `actions/upload-artifact` finalization step.
- The missing artifact is the sole reason `canary_go` sees tests as unexecuted; the coverage script compares `all-tests.txt` against downstream JSON artifacts and treats absence as a test failure.
- `actions/upload-artifact` 403s have happened before (#16377), and coverage noise from missing artifacts is already documented (#15789), so both symptoms are recurring patterns in this workflow.
Recommended Actions
- Re-run the workflow to see if the `Failed to FinalizeArtifact` 403 was transient; if it recurs, wrap the upload step in retries or a custom uploader so the integration artifact is guaranteed to be committed before downstream jobs start.
- Teach `scripts/compare-test-coverage.sh` (and any other coverage check) to detect when expected artifacts are absent and fail fast with a clear message rather than listing every missing test.
- Surface infrastructure-level failures for `actions/upload-artifact` (especially HTTP 403s) so the next coverage run can skip or short-circuit instead of depending on a missing artifact.
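One way the fail-fast check could look is sketched below. This is a hedged illustration, not a patch for the existing script: the function name `require_artifacts` and the idea that downloaded artifacts sit in a single directory passed as an argument are assumptions. Only the `test-result-integration-*.json` pattern comes from this report.

```shell
# Hypothetical guard: succeed only if at least one integration result
# file exists in the given directory; otherwise print one clear error.
# The directory layout is an assumption, not taken from the real workflow.
require_artifacts() {
  artifact_dir="$1"
  for artifact_file in "$artifact_dir"/test-result-integration-*.json; do
    # When nothing matches, the glob stays literal and -f is false.
    if [ -f "$artifact_file" ]; then
      return 0
    fi
  done
  echo "ERROR: artifact not found: no test-result-integration-*.json in $artifact_dir (the upstream upload may have failed)" >&2
  return 1
}

# Usage (before comparing test lists):
#   require_artifacts ./artifacts || exit 1
```

Running a check like this before the comparison would turn a list of missing tests into a single message that points at the upload step.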
Prevention Strategies
- Add instrumentation or retries around Azure artifact finalization to recover from intermittent 403 responses and avoid leaving downstream jobs without inputs.
- Guard coverage comparators against missing integration artifacts so they raise an explicit “artifact not found” error instead of bloating the log with missing-test lists.
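A retry wrapper along these lines could back a custom uploader script. It is a sketch under stated assumptions: the function name `retry` and the doubling delay are illustrative choices, and it cannot wrap a `uses: actions/upload-artifact` step directly, since that step is not a shell command. Note also that the action itself classified this 403 as non-retryable, so any retry would have to happen at this outer level.

```shell
# Hypothetical helper: run a command up to N times, doubling the
# delay (in seconds) after each failed attempt.
# Usage: retry <max_attempts> <initial_delay_seconds> <command> [args...]
retry() {
  retry_max="$1"
  retry_delay="$2"
  shift 2
  retry_attempt=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$retry_attempt" -ge "$retry_max" ]; then
      echo "ERROR: command failed after $retry_attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $retry_attempt failed; retrying in ${retry_delay}s" >&2
    sleep "$retry_delay"
    retry_delay=$((retry_delay * 2))
    retry_attempt=$((retry_attempt + 1))
  done
}

# Usage, with a placeholder uploader script that does not exist in the repo:
#   retry 3 5 ./upload-results.sh
```

If every attempt fails, the wrapper exits non-zero with one error line, so the job fails at the upload step instead of leaving downstream jobs without inputs.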
AI Team Self-Improvement
Always treat Failed to FinalizeArtifact 403 responses as infrastructure failures and avoid running dependent coverage comparisons until the artifact is confirmed present.
Historical Context
- Issue #16377 (Run #22103122541) documented the same artifact finalization 403 from `actions/upload-artifact`; re-running the job resolved the failure.
- Issue #15789 (Run #35768) showed how a missing integration artifact leads `canary_go` to report thousands of missing tests, so this run’s five missing tests are the same downstream symptom caused by a different upstream failure.
🩺 Diagnosis provided by CI Failure Doctor
To install this workflow, run `gh aw add githubnext/agentics/workflows/ci-doctor.md@ea350161ad5dcc9624cf510f134c6a9e39a6f94d`. View source at https://github.com/githubnext/agentics/tree/ea350161ad5dcc9624cf510f134c6a9e39a6f94d/workflows/ci-doctor.md.
- expires on Feb 20, 2026, 2:39 PM UTC