feat(core): overhaul system prompt for rigor, integrity, and intent alignment by NTaylorMullen · Pull Request #17263 · google-gemini/gemini-cli

NTaylorMullen · 2026-01-22T01:37:10Z

Summary

This PR overhauls the system prompt for Gemini CLI to improve engineering rigor, technical integrity, and alignment with user intent. It introduces a structured Research -> Strategy -> Execution lifecycle while maintaining legacy compatibility for non-preview models.

Details

Refactored Lifecycle: Moves from a flat instruction set to a structured Research -> Strategy -> Execution workflow, allowing for better discovery and planning phases.
Surgical Implementation: Mandates an iterative Plan -> Act -> Validate cycle for the execution phase, prioritizing targeted code modifications, automated tests, and ecosystem tool usage (e.g., eslint --fix).
Intent Alignment: Implements a clear distinction between Inquiries (analysis/advice) and Directives (action) to prevent goal-creep and unintended modifications during research phases.
Modernized Technology Defaults: Updates "New Application" guidance to favor Vanilla CSS and modern, platform-appropriate tech stacks (React/TypeScript, FastAPI, Three.js) while emphasizing a polished, "alive" user experience.
Rigorous Validation: Establishes comprehensive verification (builds, tests, lints) as the mandatory path to finality, ensuring no regressions or structural side-effects.
Legacy Compatibility: Preserves existing prompt behavior by migrating current main logic to snippets.legacy.ts. The PromptProvider now dynamically selects between overhauled and legacy snippets based on whether the active model is a "preview model" (Gemini 3 family).
Test Integrity: Updated snapshots and unit tests in prompts.test.ts to validate the new prompt structure and ensure parity for legacy models.
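
The model-based gating described under Legacy Compatibility can be pictured with a minimal sketch. It is not the PR's actual `PromptProvider` code: `PromptSnippets`, `isPreviewModel`, `selectSnippets`, the `gemini-3` prefix check, and the rendered strings are all assumptions made up for illustration.

```typescript
// Hypothetical sketch: none of these names or strings are taken from the PR's diff.
interface PromptSnippets {
  readonly name: 'overhauled' | 'legacy';
  renderCoreMandates(): string;
}

// Stand-in for the overhauled snippets (snippets.ts in the PR).
const overhauledSnippets: PromptSnippets = {
  name: 'overhauled',
  renderCoreMandates: () => 'Workflow: Research -> Strategy -> Execution',
};

// Stand-in for the preserved prompt logic (snippets.legacy.ts in the PR).
const legacySnippets: PromptSnippets = {
  name: 'legacy',
  renderCoreMandates: () => 'Core mandates as they exist on main',
};

// Assumption: a preview model is recognised by a "gemini-3" prefix in its ID.
function isPreviewModel(model: string): boolean {
  return model.startsWith('gemini-3');
}

// Picks the snippet set for the active model.
function selectSnippets(model: string): PromptSnippets {
  return isPreviewModel(model) ? overhauledSnippets : legacySnippets;
}
```

Keeping the two snippet sets behind one interface means the rest of the prompt assembly would not need to know which generation of prompt it is rendering.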

Related Issues

Part of the ongoing effort to improve agent reliability and engineering quality.

How to Validate

Run core prompt tests: npm test -w @google/gemini-cli-core -- src/core/prompts.test.ts
Run compression service tests: npm test -w @google/gemini-cli-core -- src/services/chatCompressionService.test.ts
Perform full preflight validation: npm run preflight
Manually inspect prompt output via the CLI using a preview model (e.g., Gemini 3) and a legacy model (e.g., Gemini 2.5) to verify dynamic selection.

gemini-code-assist · 2026-01-22T01:37:42Z

Summary of Changes

Hello @NTaylorMullen, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on a major refactoring and optimization of the system prompt, which guides the agent's behavior and decision-making. The changes aim to enhance the agent's understanding of its role, improve its communication style, enforce stricter engineering standards, and refine its workflow for various tasks. By clarifying mandates, improving tool usage instructions, and adding new evaluation tests, the PR ensures the agent operates more predictably, efficiently, and in better alignment with user intent, particularly in complex software engineering scenarios.

Highlights

Comprehensive System Prompt Overhaul: The core system prompt has undergone a significant overhaul, introducing new sections for 'Communication Style', 'Security Protocols', 'Engineering Standards', and refined 'Workflow' definitions for development and new application creation. This aims to improve agent intent alignment, idiomatic completeness, and overall rigor.
Tool Renaming and Enhancement: The search_file_content tool has been renamed to grep_search across the codebase and documentation. Additionally, the grep_search tool now limits its output to a maximum of 100 matches by default to improve performance and token efficiency.
New Evaluation Tests for Agent Behavior: New evaluation tests (analysis-mode.eval.ts and delegation_strategy.eval.ts) have been added. These tests specifically validate that the agent does not automatically modify files when merely 'inspecting' for bugs, but acts when explicitly asked to 'fix' them. They also ensure correct delegation to specialized agents like codebase_investigator for architectural tasks.
Refined Agent Delegation Heuristics: The description for the CodebaseInvestigatorAgent has been updated to clarify its role in architectural analysis and dependency identification. The system prompt now provides clearer guidance on when to delegate to sub-agents versus using manual search tools.
Environment Context Refactoring: The way environment context (like workspace directories and folder structure) is passed to the agent has been refactored. It is now encapsulated within <session_context> tags in the initial user message, and dynamic environment details (date, platform, temp directory) are passed via a new PromptEnv object.
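
As a rough illustration of the environment context refactoring, the sketch below wraps workspace details in `<session_context>` tags and keeps the dynamic details in a separate object. The `PromptEnv` field names, both function names, and the text layout inside the tags are assumptions; the summary above states only which pieces of information travel where.

```typescript
// Hypothetical shape: the summary says only that date, platform, and temp
// directory are passed via a PromptEnv object; these field names are assumptions.
interface PromptEnv {
  today: string;
  platform: string;
  tempDir: string;
}

// Wraps workspace details in <session_context> tags for the initial user message.
function buildSessionContext(workspaceDirs: string[], folderStructure: string): string {
  return [
    '<session_context>',
    `Workspace directories: ${workspaceDirs.join(', ')}`,
    'Folder structure:',
    folderStructure,
    '</session_context>',
  ].join('\n');
}

// Renders the dynamic environment details separately from the session context.
function renderEnv(env: PromptEnv): string {
  return `Date: ${env.today} | Platform: ${env.platform} | Temp dir: ${env.tempDir}`;
}

const firstUserMessage = buildSessionContext(['/home/user/project'], 'src/\n  index.ts');
```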

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a comprehensive overhaul of the system prompt to improve agent behavior, along with several related refactorings and improvements. Key changes include a much more detailed and structured system prompt, renaming the search_file_content tool to grep_search, and refactoring how environment context is provided to the agent. New evaluation tests have been added to validate the new agent behaviors.

My review focuses on performance and correctness. I've identified a performance issue in how the match limit is applied in both the grep and ripgrep tools. The current implementation fetches all results and then truncates them in JavaScript, which can be inefficient. I've suggested using the native --max-count flags available in these tools to limit the results at the source.
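
A minimal sketch of the suggested approach follows, assuming ripgrep is the backend. The helper names and the constant are hypothetical and are not taken from `ripGrep.ts`. One caveat: in both GNU grep (`-m`/`--max-count`) and ripgrep (`-m`/`--max-count`) the limit applies per file searched, so a cap on the total number of matches would still need a final trim of the combined output.

```typescript
// Default cap named in the PR summary; the constant name is an assumption.
const DEFAULT_TOTAL_MATCH_LIMIT = 100;

// Builds ripgrep arguments. --line-number, --no-heading, --max-count and
// --regexp are real ripgrep flags; --max-count limits matching lines PER FILE.
function buildRipgrepArgs(pattern: string, searchPath: string, maxPerFile: number): string[] {
  return [
    '--line-number',
    '--no-heading',
    '--max-count', String(maxPerFile),
    '--regexp', pattern, // --regexp keeps patterns that start with '-' unambiguous
    searchPath,
  ];
}

// Enforces the total cap across files after the per-file limit was applied at the source.
function capTotalMatches(lines: string[], limit: number): { lines: string[]; truncated: boolean } {
  return { lines: lines.slice(0, limit), truncated: lines.length > limit };
}

const args = buildRipgrepArgs('TODO', 'src', DEFAULT_TOTAL_MATCH_LIMIT);
```

With this split, no single file can contribute more than the limit, which bounds the work done on very noisy files, while the JavaScript-side trim handles the case where many files each contribute a few matches.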

Overall, this is a significant and well-structured update that should improve the agent's capabilities. The refactoring work makes the codebase cleaner and more maintainable.

packages/core/src/tools/grep.ts

packages/core/src/tools/ripGrep.ts

github-actions · 2026-01-22T19:58:36Z

Size Change: +39.8 kB (+0.17%)

Total Size: 23.9 MB

Filename	Size	Change
`./bundle/gemini.js`	23.8 MB	+39.8 kB (+0.17%)

Unchanged

Filename	Size
`./bundle/sandbox-macos-permissive-closed.sb`	1.03 kB
`./bundle/sandbox-macos-permissive-open.sb`	890 B
`./bundle/sandbox-macos-permissive-proxied.sb`	1.31 kB
`./bundle/sandbox-macos-restrictive-closed.sb`	3.29 kB
`./bundle/sandbox-macos-restrictive-open.sb`	3.36 kB
`./bundle/sandbox-macos-restrictive-proxied.sb`	3.56 kB

compressed-size-action

evals/analysis-mode.eval.ts

evals/delegation_strategy.eval.ts

packages/core/src/prompts/snippets.ts

gundermanc · 2026-01-28T20:28:10Z

I ran the current set of behavioral evals against this branch for all models: https://github.com/google-gemini/gemini-cli/actions/runs/21453547514

It looks like some of the existing ones might not be passing 3/3 times anymore with these changes. Are any of these regressions?

Also, the new tests don't seem to pass 3/3 times, at least for Gemini 3.0.

…l updates
- Refine 'Expertise & Intent Alignment' to default to Inquiry and require explicit Directives for action.
- Update 'Technical Integrity' and 'Execution' workflows to prioritize clean abstractions within the target scope while avoiding unrelated refactoring.
- Clarify 'Proactiveness' to apply strictly when executing a Directive.
- Add 'stop and wait' instruction for resolved inquiries or pending directives to stabilize workflows.
- Update and verify prompt snapshots.
Part of #17263

- Create `snippets.legacy.ts` as a pure replica of the original system prompt logic.
- Introduce `snippets.ts` with the modern Gemini 3 prompt overhaul.
- Update `PromptProvider.ts` to select between legacy and overhauled snippets based on the active model.
- Make history compression model-aware by passing `config` to `getCompressionPrompt`.
- Update unit tests and snapshots to verify correct prompt gating for preview and non-preview models.
Part of #17263

…lignment
- Refactored system prompt structure into a Research/Strategy/Execution lifecycle
- Modernized technology recommendations (favoring Vanilla CSS and modern stacks)
- Integrated structured planning workflow and ecosystem tool checks
- Preserved legacy prompt behavior by migrating current main logic to snippets.legacy.ts
- Updated tests and snapshots for exhaustive validation

This is a follow-up to #17263 that ensures consistency between snippets.ts and snippets.legacy.ts by removing the redundant planning section from renderFinalShell.

…lignment (google-gemini#17263)

NTaylorMullen requested review from a team as code owners January 22, 2026 01:37

gemini-code-assist bot reviewed Jan 22, 2026

packages/core/src/tools/grep.ts (outdated, resolved)

packages/core/src/tools/ripGrep.ts (outdated, resolved)

gemini-cli bot added the status/need-issue label (Pull requests that need to have an associated issue.) Jan 22, 2026

SandyTao520 requested a review from a team as a code owner January 22, 2026 19:55

SandyTao520 force-pushed the ntm/sys.prompt.overhaul branch from ece0d9e to d385ff8 January 22, 2026 19:55

SandyTao520 force-pushed the ntm/sys.prompt.overhaul branch 4 times, most recently from 09d4488 to 814803b January 27, 2026 19:43