Update prompt and grep tool definition to limit context size#18780
Update prompt and grep tool definition to limit context size#18780gundermanc merged 16 commits intomainfrom
Conversation
Summary of ChangesHello @gundermanc, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Size Change: +3.87 kB (+0.02%) Total Size: 24.3 MB
ℹ️ View Unchanged
|
There was a problem hiding this comment.
Code Review
This pull request introduces several useful parameters to the grep and ripgrep tools, such as total_max_matches, max_matches_per_file, exclude_pattern, and names_only, to control output and improve efficiency. A high-severity argument injection vulnerability was identified in the grep_search tool implementation (grep.ts), where the search pattern could be misinterpreted as a command-line option. This has been addressed by using the -e flag in external command calls, with ripgrep remaining unaffected due to its existing use of the --regexp flag. Additionally, a critical issue was found where the system prompt suggests an incorrect parameter name to the LLM, leading to tool call failures.
| - **Source Control:** Do not stage or commit changes unless specifically requested by the user. | ||
|
|
||
| ## Context Efficiency: | ||
| - Minimize wasted context window by scoping and limiting all of your ${GREP_TOOL_NAME} searches. e.g.: pass max_total_matches, include, and max_matches_per_file. |
There was a problem hiding this comment.
There's a mismatch between the parameter name suggested in this prompt (max_total_matches) and the actual parameter name defined in the tool's schema (total_max_matches). The LLM will follow the prompt's instruction, leading to failed tool calls because the parameter name will be incorrect. Please update the prompt to use the correct parameter name total_max_matches.
| - Minimize wasted context window by scoping and limiting all of your ${GREP_TOOL_NAME} searches. e.g.: pass max_total_matches, include, and max_matches_per_file. | |
| - Minimize wasted context window by scoping and limiting all of your ${GREP_TOOL_NAME} searches. e.g.: pass total_max_matches, include, and max_matches_per_file. |
| '-n', | ||
| '-E', | ||
| '--ignore-case', | ||
| pattern, |
There was a problem hiding this comment.
The pattern parameter is passed as a positional argument to git grep. If the pattern starts with a hyphen (-), it can be interpreted as a command-line option rather than a search pattern. This allows for argument injection, which can be used to manipulate the command's behavior or potentially leak information (e.g., using -f to read patterns from sensitive files).
To fix this, use the -e flag to explicitly specify that the following argument is the pattern.
| pattern, | |
| '-e', pattern, |
f92afd0 to
12340dc
Compare
12340dc to
4b0c9ce
Compare
379c823 to
838ea7a
Compare
| /** | ||
| * Optional: If true, only the file paths of the matches will be returned. | ||
| */ | ||
| names_only?: boolean; |
There was a problem hiding this comment.
nit: maybe file_paths_only then?
There was a problem hiding this comment.
Won't fixing this one. The model seems to understand this one.
| /** | ||
| * Optional: If true, only the file paths of the matches will be returned. | ||
| */ | ||
| names_only?: boolean; |
There was a problem hiding this comment.
nit: same as in the other file.
| RipGrepTool.Name, | ||
| 'SearchText', | ||
| 'Searches for a regular expression pattern within file contents. Max 100 matches.', | ||
| 'Searches for a regular expression pattern within file contents.', |
There was a problem hiding this comment.
why do we remove the max?
There was a problem hiding this comment.
+1. I'm okay removing this, but should we update GREP_DEFINITION in coreTools.ts for consistency?
| import { describe, expect } from 'vitest'; | ||
| import { evalTest } from './test-helper.js'; | ||
|
|
||
| describe('Frugal Search', () => { |
There was a problem hiding this comment.
nit: for me it is not obvious what this evals do - we don't have frugal search tool or anything, so I imagine, it might be hard to remember in a year or too. I would add a comment or smth.
There was a problem hiding this comment.
Thanks for the feedback. In the interest of keeping momentum (particularly avoiding the need to re-run evals), I'm going to take these in an immediate followup.
| - **Source Control:** Do not stage or commit changes unless specifically requested by the user. | ||
|
|
||
| ## Context Efficiency: | ||
| - Always minimize wasted context window by scoping and limiting all of your ${GREP_TOOL_NAME} searches. e.g.: pass total_max_matches, include, and max_matches_per_file. |
There was a problem hiding this comment.
Consider updating the mandate with more refined language, an explicit "why," and a bit more instruction:
Search Frugality: ALWAYS scope and limit your searches to avoid context window exhaustion and ensure high-signal results. Use include to target relevant files and strictly limit results using total_max_matches and max_matches_per_file. For broad discovery, prefer names_only to identify files without retrieving their content.
…ini/gemini-cli (#37) * fix(cli): resolve double rendering in shpool and address vscode lint warnings (google-gemini#18704) * feat(plan): document and validate Plan Mode policy overrides (google-gemini#18825) * Fix pressing any key to exit select mode. (google-gemini#18421) * fix(cli): update F12 behavior to only open drawer if browser fails (google-gemini#18829) * feat(plan): allow skills to be enabled in plan mode (google-gemini#18817) Co-authored-by: Jerop Kipruto <jerop@google.com> * docs(plan): add documentation for plan mode tools (google-gemini#18827) * Remove experimental note in extension settings docs (google-gemini#18822) * Update prompt and grep tool definition to limit context size (google-gemini#18780) * docs(plan): add `ask_user` tool documentation (google-gemini#18830) * Revert unintended credentials exposure (google-gemini#18840) * feat(core): update internal utility models to Gemini 3 (google-gemini#18773) * feat(a2a): add value-resolver for auth credential resolution (google-gemini#18653) * Removed getPlainTextLength (google-gemini#18848) * More grep prompt tweaks (google-gemini#18846) * refactor(cli): Reactive useSettingsStore hook (google-gemini#14915) * fix(mcp): Ensure that stdio MCP server execution has the `GEMINI_CLI=1` env variable populated. (google-gemini#18832) * fix(core): improve headless mode detection for flags and query args (google-gemini#18855) * refactor(cli): simplify UI and remove legacy inline tool confirmation logic (google-gemini#18566) * feat(cli): deprecate --allowed-tools and excludeTools in favor of policy engine (google-gemini#18508) * fix(workflows): improve maintainer detection for automated PR actions (google-gemini#18869) * refactor(cli): consolidate useToolScheduler and delete legacy implementation (google-gemini#18567) * Update changelog for v0.28.0 and v0.29.0-preview0 (google-gemini#18819) * fix(core): ensure sub-agents are registered regardless of tools.allowed (google-gemini#18870) --------- Co-authored-by: Brad Dux <959674+braddux@users.noreply.github.com> Co-authored-by: Jerop Kipruto <jerop@google.com> Co-authored-by: Jacob Richman <jacob314@gmail.com> Co-authored-by: Sandy Tao <sandytao520@icloud.com> Co-authored-by: Adib234 <30782825+Adib234@users.noreply.github.com> Co-authored-by: christine betts <chrstn@uw.edu> Co-authored-by: Christian Gunderman <gundermanc@gmail.com> Co-authored-by: Adam Weidman <65992621+adamfweidman@users.noreply.github.com> Co-authored-by: Dev Randalpura <devrandalpura@google.com> Co-authored-by: Pyush Sinha <pyushsinha20@gmail.com> Co-authored-by: Richie Foreman <richie.foreman@gmail.com> Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com> Co-authored-by: Abhi <43648792+abhipatel12@users.noreply.github.com> Co-authored-by: Abhijit Balaji <abhijitbalaji@google.com> Co-authored-by: Bryan Morgan <bryanmorgan@google.com> Co-authored-by: g-samroberts <158088236+g-samroberts@users.noreply.github.com> Co-authored-by: matt korwel <matt.korwel@gmail.com>
Summary
Steer the agent to use bounded searches and/or advanced searches.
Adds some parameters commonly used in grep based pipelines to narrow searches.
Related Issues
#17112
How to Validate
Pre-Merge Checklist