Skip to content

Comments

Fix issue where Gemini CLI creates tests in a new file#18409

Merged
gundermanc merged 14 commits intomainfrom
gundermanc/stop-duplicating-tests
Feb 10, 2026
Merged

Fix issue where Gemini CLI creates tests in a new file#18409
gundermanc merged 14 commits intomainfrom
gundermanc/stop-duplicating-tests

Conversation

@gundermanc
Copy link
Member

@gundermanc gundermanc commented Feb 6, 2026

Summary

Fixes an issue where Gemini CLI either does not add new tests or creates them in a new file instead of placing them next to existing tests.

Adds an eval test: https://github.com/google-gemini/gemini-cli/actions/runs/21807626011

Related Issues

Fixes #18361

How to Validate

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • MacOS
      • npm run
      • npx
      • Docker
      • Podman
      • Seatbelt
    • Windows
      • npm run
      • npx
      • Docker
    • Linux
      • npm run
      • npx
      • Docker

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @gundermanc, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the AI model's behavior regarding test file management. It introduces a new evaluation test to validate that the system correctly updates existing test files instead of generating redundant ones. Concurrently, it modifies the core mandates to explicitly guide the AI to prioritize updating related tests and creating new tests within existing, relevant test files, thereby preventing the duplication of test code and ensuring more efficient test suite maintenance.

Highlights

  • New Evaluation Test for Test File Management: A new evaluation test (evals/edit-locations-eval.eval.ts) has been added. This test specifically verifies that the system updates existing test files (e.g., src/math.test.ts) when a bug is fixed in a corresponding source file (e.g., src/math.ts), ensuring that new, duplicate test files are not created.
  • Updated Core Mandates for Test Handling: The core prompts (packages/core/src/prompts/snippets.ts) have been modified to include a new, explicit 'Testing' mandate. This mandate instructs the model to always search for and update related tests after making code changes, and to add new tests within existing, related test files or create new ones if no related file exists.
  • Refined 'Proactiveness' Mandate: The existing 'Proactiveness' mandate was adjusted to remove the specific instruction about adding tests, as the new 'Testing' mandate now covers this aspect more precisely and comprehensively.
Changelog
  • evals/edit-locations-eval.eval.ts
    • Added a new evaluation test Edits location eval to verify that the system updates existing test files (e.g., src/math.test.ts) when fixing a bug in a source file (e.g., src/math.ts), rather than creating a new test file.
  • packages/core/src/prompts/snippets.ts
    • Modified the 'Proactiveness' mandate to focus on considering all created files as permanent artifacts, removing the explicit mention of adding tests for quality.
    • Introduced a new 'Testing' mandate, instructing the model to always search for and update related tests after code changes, and to add new tests in related test files or create new ones if a related file doesn't exist.
Ignored Files
  • Ignored by pattern: .gemini/** (1)
    • .gemini/commands/fix-behavioral-eval.toml
Activity
  • No specific human activity (comments, reviews, progress updates) has been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new evaluation test and updates the core system prompt to prevent the agent from creating duplicate test files when existing ones are present. The changes are well-aligned with the goal. I've suggested a minor wording tweak to the new prompt instruction to make it even clearer for the language model, further reducing the chance of incorrect behavior.

- **Comments:** Add code comments sparingly. Focus on *why* something is done, especially for complex logic, rather than *what* is done. Only add high-value comments if necessary for clarity or if requested by the user. Do not edit comments that are separate from the code you are changing. *NEVER* talk to the user or describe your changes through comments.
- **Proactiveness:** Fulfill the user's request thoroughly. When adding features or fixing bugs, this includes adding tests to ensure quality. Consider all created files, especially tests, to be permanent artifacts unless the user says otherwise.
- **Proactiveness:** Fulfill the user's request thoroughly. Consider all created files, especially tests, to be permanent artifacts unless the user says otherwise.
- **Testing**: ALWAYS search for and update related tests after making a code change. Add new tests in a related test file, if one exists, or create a new test file. e.g.: fix a bug in set.ts, add a test in set.test.ts.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The phrasing Add new tests in a related test file, if one exists, or create a new test file could be ambiguous to the language model. A more direct conditional statement would make the instruction clearer and more robust, ensuring the model prioritizes updating existing test files.

Suggested change
- **Testing**: ALWAYS search for and update related tests after making a code change. Add new tests in a related test file, if one exists, or create a new test file. e.g.: fix a bug in set.ts, add a test in set.test.ts.
- **Testing**: ALWAYS search for and update related tests after making a code change. If a related test file exists, add new tests to it; otherwise, create a new test file. e.g.: fix a bug in set.ts, add a test in set.test.ts.

@gundermanc gundermanc changed the title Gundermanc/stop duplicating tests Fix issue where Gemini CLI creates tests in a new file Feb 6, 2026
@github-actions
Copy link

github-actions bot commented Feb 6, 2026

Size Change: +205 B (0%)

Total Size: 23.9 MB

ℹ️ View Unchanged
Filename Size Change
./bundle/gemini.js 23.9 MB +205 B (0%)
./bundle/sandbox-macos-permissive-closed.sb 1.03 kB 0 B
./bundle/sandbox-macos-permissive-open.sb 890 B 0 B
./bundle/sandbox-macos-permissive-proxied.sb 1.31 kB 0 B
./bundle/sandbox-macos-restrictive-closed.sb 3.29 kB 0 B
./bundle/sandbox-macos-restrictive-open.sb 3.36 kB 0 B
./bundle/sandbox-macos-restrictive-proxied.sb 3.56 kB 0 B

compressed-size-action

@gundermanc gundermanc marked this pull request as ready for review February 9, 2026 00:07
@gundermanc gundermanc requested a review from a team as a code owner February 9, 2026 00:07
@gemini-cli gemini-cli bot added area/agent Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality 🔒 maintainer only ⛔ Do not contribute. Internal roadmap item. priority/p2 Important but can be addressed in a future release. labels Feb 9, 2026
@alisa-alisa
Copy link
Contributor

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to fix an issue where the Gemini CLI creates a new test file instead of updating an existing one. This is addressed by updating the core system prompt to explicitly instruct the agent to search for and update existing test files. A new evaluation test is added to verify this behavior. The changes to the prompt and the addition of the test are well-aligned with the goal. I've found one issue in the new test file where an assertion could be made more specific to better reflect the test's intent.

Comment on lines +100 to +103
expect(
new Set(targetFiles).size,
'Expected only two files changed',
).greaterThanOrEqual(2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The assertion message 'Expected only two files changed' suggests an exact check, but greaterThanOrEqual(2) allows for more than two files to be changed. To make the test stricter and align with the stated expectation, you should check for an exact size of 2.

      expect(new Set(targetFiles).size, 'Expected only two files changed').toBe(2);

@gundermanc gundermanc enabled auto-merge February 10, 2026 19:08
@gundermanc gundermanc added this pull request to the merge queue Feb 10, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 10, 2026
@gundermanc gundermanc added this pull request to the merge queue Feb 10, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 10, 2026
@gundermanc gundermanc enabled auto-merge February 10, 2026 19:57
@gundermanc gundermanc added this pull request to the merge queue Feb 10, 2026
Merged via the queue into main with commit 8b76211 Feb 10, 2026
27 checks passed
@gundermanc gundermanc deleted the gundermanc/stop-duplicating-tests branch February 10, 2026 21:05
krsjenmt added a commit to krsjenmt/gemini-cli that referenced this pull request Feb 11, 2026
* Fix newline insertion bug in replace tool (google-gemini#18595)

* fix(evals): update save_memory evals and simplify tool description (google-gemini#18610)

* chore(evals): update validation_fidelity_pre_existing_errors to USUALLY_PASSES (google-gemini#18617)

* fix: shorten tool call IDs and fix duplicate tool name in truncated output filenames (google-gemini#18600)

* feat(cli): implement atomic writes and safety checks for trusted folders (google-gemini#18406)

* Remove relative docs links (google-gemini#18650)

* docs: add legacy snippets convention to GEMINI.md (google-gemini#18597)

* fix(chore): Support linting for cjs (google-gemini#18639)

Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>

* feat: move shell efficiency guidelines to tool description (google-gemini#18614)

* Added "" as default value, since getText() used to expect a string only and thus crashed when undefined...  Fixes google-gemini#18076   (google-gemini#18099)

* Allow @-includes outside of workspaces (with permission) (google-gemini#18470)

* chore: make `ask_user` header description more clear (google-gemini#18657)

* bug(core): Fix minor bug in migration logic. (google-gemini#18661)

* Harded code assist converter. (google-gemini#18656)

* refactor(core): model-dependent tool definitions (google-gemini#18563)

* feat: enable plan mode experiment in settings (google-gemini#18636)

* refactor: push isValidPath() into parsePastedPaths() (google-gemini#18664)

* fix(cli): correct 'esc to cancel' position and restore duration display (google-gemini#18534)

* feat(cli): add DevTools integration with gemini-cli-devtools (google-gemini#18648)

* chore: remove unused exports and redundant hook files (google-gemini#18681)

* Fix number of lines being reported in rewind confirmation dialog (google-gemini#18675)

* feat(cli): disable folder trust in headless mode (google-gemini#18407)

* Disallow unsafe type assertions (google-gemini#18688)

* Change event type for release (google-gemini#18693)

* feat: handle multiple dynamic context filenames in system prompt (google-gemini#18598)

* Properly parse at-commands with narrow non-breaking spaces (google-gemini#18677)

* refactor(core): centralize core tool definitions and support model-specific schemas (google-gemini#18662)

* feat(core): Render memory hierarchically in context. (google-gemini#18350)

* feat: Ctrl+O to expand paste placeholder (google-gemini#18103)

* fix(cli): Improve header spacing (google-gemini#18531)

* Feature/quota visibility 16795 (google-gemini#18203)

* docs: remove TOC marker from Plan Mode header (google-gemini#18678)

* Inline thinking bubbles with summary/full modes (google-gemini#18033)

Co-authored-by: Jacob Richman <jacob314@gmail.com>

* fix(ui): remove redundant newlines in Gemini messages (google-gemini#18538)

* test(cli): fix AppContainer act() warnings and improve waitFor resilience (google-gemini#18676)

* refactor(core): refine Security & System Integrity section in system prompt (google-gemini#18601)

* Fix layout rounding. (google-gemini#18667)

* docs(skills): enhance pr-creator safety and interactivity (google-gemini#18616)

* test(core): remove hardcoded model from TestRig (google-gemini#18710)

* feat(core): optimize sub-agents system prompt intro (google-gemini#18608)

* feat(cli): update approval mode labels and shortcuts per latest UX spec (google-gemini#18698)

* fix(plan): update persistent approval mode setting (google-gemini#18638)

Co-authored-by: Sandy Tao <sandytao520@icloud.com>

* fix: move toasts location to left side (google-gemini#18705)

* feat(routing): restrict numerical routing to Gemini 3 family (google-gemini#18478)

* fix(ide): fix ide nudge setting (google-gemini#18733)

* fix(core): standardize tool formatting in system prompts (google-gemini#18615)

* chore: consolidate to green in ask user dialog (google-gemini#18734)

* feat: add `extensionsExplore` setting to enable extensions explore UI. (google-gemini#18686)

* feat(cli): defer devtools startup and integrate with F12 (google-gemini#18695)

* ui: update & subdue footer colors and animate progress indicator (google-gemini#18570)

* test: add model-specific snapshots for coreTools (google-gemini#18707)

Co-authored-by: matt korwel <matt.korwel@gmail.com>

* ci: shard windows tests and fix event listener leaks (google-gemini#18670)

* fix: allow `ask_user` tool in yolo mode (google-gemini#18541)

* feat: redact disabled tools from system prompt (google-gemini#13597) (google-gemini#18613)

* Update Gemini.md to use the curent year on creating new files (google-gemini#18460)

* Code review cleanup for thinking display (google-gemini#18720)

* fix(cli): hide scrollbars when in alternate buffer copy mode (google-gemini#18354)

Co-authored-by: Jacob Richman <jacob314@gmail.com>

* Fix issues with rip grep (google-gemini#18756)

* fix(cli): fix history navigation regression after prompt autocomplete (google-gemini#18752)

* chore: cleanup unused and add unlisted dependencies in packages/cli (google-gemini#18749)

* Fix issue where Gemini CLI creates tests in a new file (google-gemini#18409)

* feat(telemetry): Ensure experiment IDs are included in OpenTelemetry logs (google-gemini#18747)

* feat(ux): added text wrapping capabilities to markdown tables (google-gemini#18240)

Co-authored-by: jacob314 <jacob314@gmail.com>

* Revert "fix(mcp): ensure MCP transport is closed to prevent memory leaks" (google-gemini#18771)

* chore(release): bump version to 0.30.0-nightly.20260210.a2174751d (google-gemini#18772)

* chore: cleanup unused and add unlisted dependencies in packages/core (google-gemini#18762)

* chore(core): update activate_skill prompt verbiage to be more direct (google-gemini#18605)

* Add autoconfigure memory usage setting to the dialog (google-gemini#18510)

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* fix(core): prevent race condition in policy persistence (google-gemini#18506)

Co-authored-by: Allen Hutchison <adh@google.com>

* fix(evals): prevent false positive in hierarchical memory test (google-gemini#18777)

* test(evals): mark all `save_memory` evals as `USUALLY_PASSES` due to unreliability (google-gemini#18786)

* feat(cli): add setting to hide shortcuts hint UI (google-gemini#18562)

* feat(core): formalize 5-phase sequential planning workflow (google-gemini#18759)

* Introduce limits for search results. (google-gemini#18767)

---------

Co-authored-by: Andrew Garrett <andrewgarrett@google.com>
Co-authored-by: N. Taylor Mullen <ntaylormullen@google.com>
Co-authored-by: Sandy Tao <sandytao520@icloud.com>
Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>
Co-authored-by: christine betts <chrstn@uw.edu>
Co-authored-by: Aswin Ashok <aswwwin@google.com>
Co-authored-by: Abhijith V Ashok <abhi2349jith@gmail.com>
Co-authored-by: Tommaso Sciortino <sciortino@gmail.com>
Co-authored-by: Jack Wotherspoon <jackwoth@google.com>
Co-authored-by: joshualitt <joshualitt@google.com>
Co-authored-by: Jacob Richman <jacob314@gmail.com>
Co-authored-by: Aishanee Shah <aishaneeshah@gmail.com>
Co-authored-by: Jerop Kipruto <jerop@google.com>
Co-authored-by: Adib234 <30782825+Adib234@users.noreply.github.com>
Co-authored-by: Christian Gunderman <gundermanc@gmail.com>
Co-authored-by: g-samroberts <158088236+g-samroberts@users.noreply.github.com>
Co-authored-by: Spencer <spencertang@google.com>
Co-authored-by: Dmitry Lyalin <dmitry.lyalin@lyalin.com>
Co-authored-by: matt korwel <matt.korwel@gmail.com>
Co-authored-by: Shreya Keshive <shreyakeshive@google.com>
Co-authored-by: Sri Pasumarthi <111310667+sripasg@users.noreply.github.com>
Co-authored-by: Keith Guerin <keithguerin@gmail.com>
Co-authored-by: Sehoon Shon <sshon@google.com>
Co-authored-by: Adam Weidman <65992621+adamfweidman@users.noreply.github.com>
Co-authored-by: Kevin Ramdass <ramdass.kevin@gmail.com>
Co-authored-by: Dev Randalpura <devrandalpura@google.com>
Co-authored-by: gemini-cli-robot <gemini-cli-robot@google.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Brad Dux <959674+braddux@users.noreply.github.com>
Co-authored-by: Allen Hutchison <adh@google.com>
Co-authored-by: Abhijit Balaji <abhijitbalaji@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/agent Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality 🔒 maintainer only ⛔ Do not contribute. Internal roadmap item. priority/p2 Important but can be addressed in a future release.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Gemini CLI often creates tests in a new file instead of an existing one

2 participants