Skip to content

Fix git core compatibility harness baseline handling#482

Merged
svarlamov merged 3 commits intomainfrom
codex/create-github-action-for-core-git-tests
Feb 8, 2026
Merged

Fix git core compatibility harness baseline handling#482
svarlamov merged 3 commits intomainfrom
codex/create-github-action-for-core-git-tests

Conversation

@svarlamov
Copy link
Member

@svarlamov svarlamov commented Feb 8, 2026

Motivation

  • Ensure the Git core compatibility harness is runnable in CI and locally by building the upstream Git checkout before running tests.
  • Avoid reporting normal test non-zero exits as harness errors and instead only flag true harness/parsing issues.
  • Record the current observed failures as a baseline whitelist so CI only fails on new regressions.

Description

  • Add ensure_git_build() to tests/git-compat/run-core-tests.py to run make in the cloned Git repo when GIT-BUILD-OPTIONS is missing.
  • Set a default test hash via GIT_TEST_DEFAULT_HASH=sha1 in the harness environment so tests that expect SHA-1 run correctly.
  • Parse and report only TAP/parse errors as harness issues (refined regex and parse_summary_issues()), while continuing to parse test failures for whitelist comparison.
  • Update tests/git-compat/whitelist.csv with the current baseline of failing tests so the harness only fails for new, unexpected regressions.

Testing

  • Built the project with cargo build --release --bin git-ai which completed successfully.
  • Ran the harness with python3 tests/git-compat/run-core-tests.py, which cloned and built upstream Git, ran the selected prove test subset, and returned success because all remaining failures were applied to the updated whitelist.
  • The harness now exits 0 when tests are either passing or explicitly whitelisted, and returns non-zero when parse/harness errors are detected.

Codex Task


Open with Devin

Copy link
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 6 additional findings.

Open in Devin Review

Copy link
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

View 10 additional findings in Devin Review.

Open in Devin Review

Comment on lines +286 to +288
if exit_code != 0 and not failures:
print("\n[!] prove exited non-zero but no failures were parsed. Please investigate the output above.")
return exit_code
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Safety check not failures bypassed by empty-list entries from tests with Wstat but no individual failures

The fallback safety check at line 286 (if exit_code != 0 and not failures) is intended to catch cases where prove exits non-zero but no failure information was parsed. However, parse_failures creates entries with empty sets for any test matching the header_re pattern (Wstat line) even when no Failed tests: line follows—for example, a test script that crashes before running any assertions or exits non-zero without individual test failures.

Root Cause and Impact

In parse_failures at tests/git-compat/run-core-tests.py:127-129, any test matching the header regex gets a dict entry via failures.setdefault(current, set()), regardless of whether a Failed tests: line follows. When converted to the return value at line 153, this produces entries like {"t0000-basic.sh": []} (empty list).

An empty list is falsy, but a dict containing such entries is truthy:

failures = {"t0000-basic.sh": []}  # bool(failures) == True

At line 286, not failures evaluates to False, so the safety check is skipped. Meanwhile, apply_whitelist at line 269 produces an empty unexpected dict (no indices to filter). With no summary_issues either, the harness falls through to line 290 and returns 0 (success), even though prove exited non-zero for an unparsed reason.

Impact: The harness silently reports success when a test script crashes or has a non-zero Wstat without individual test failures, defeating the purpose of the safety check.

Suggested change
if exit_code != 0 and not failures:
print("\n[!] prove exited non-zero but no failures were parsed. Please investigate the output above.")
return exit_code
has_parsed_failures = any(indices for indices in failures.values())
if exit_code != 0 and not has_parsed_failures:
print("\n[!] prove exited non-zero but no failures were parsed. Please investigate the output above.")
return exit_code
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

@svarlamov svarlamov merged commit 48e1dc8 into main Feb 8, 2026
9 checks passed
@svarlamov svarlamov deleted the codex/create-github-action-for-core-git-tests branch February 8, 2026 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant