feat: add agent and mcp config file path flag to eval cmd #49

jrangelramos · 2025-11-19T20:50:15Z

Currently, the MCP config file and the agent file are specified in the eval spec file. To run your evals against different agents, or to launch your MCP with different parameters (or environment variables), you need to create multiple eval files.

Proposed Change

Add two new optional flags to the eval command that let you specify the agent file and the MCP config file from the command line. If the values are also specified in the eval.yaml file, the command-line values override them.

$ gevals eval -h                                                                                 
Run an evaluation using the specified eval configuration file.

Usage:
  geval eval [eval-config-file] [flags]

Flags:
      --agent-file string        Path to agent file (overrides value in eval config)
  -h, --help                     help for eval
      --mcp-config-file string   Path to MCP config file (overrides value in eval config)
  -o, --output string            Output format (text, json) (default "text")
  -r, --run string               Regular expression to match task names to run (unanchored, like go test -run)
  -v, --verbose                  Verbose output
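
To illustrate the override semantics described above, here is a minimal sketch. It is not the PR's implementation: it uses the standard library `flag` package to stay self-contained, and `evalConfig` and `applyFlags` are hypothetical names standing in for the loaded eval config and the override step.

```go
package main

import (
	"flag"
	"fmt"
)

// evalConfig is a hypothetical stand-in for the values loaded from eval.yaml.
type evalConfig struct {
	McpConfigFile string
	AgentFile     string
}

// applyFlags parses args and overrides a config field only when the
// corresponding flag was given a non-empty value.
func applyFlags(cfg evalConfig, args []string) (evalConfig, error) {
	fs := flag.NewFlagSet("eval", flag.ContinueOnError)
	mcpConfigFile := fs.String("mcp-config-file", "", "Path to MCP config file (overrides value in eval config)")
	agentFile := fs.String("agent-file", "", "Path to agent file (overrides value in eval config)")
	if err := fs.Parse(args); err != nil {
		return cfg, err
	}
	if *mcpConfigFile != "" {
		cfg.McpConfigFile = *mcpConfigFile
	}
	if *agentFile != "" {
		cfg.AgentFile = *agentFile
	}
	return cfg, nil
}

func main() {
	// Values as they might come from eval.yaml (illustrative only).
	fromYAML := evalConfig{McpConfigFile: "mcp.yaml", AgentFile: "agent.yaml"}

	// Only --agent-file is passed, so the MCP config value is left untouched.
	cfg, err := applyFlags(fromYAML, []string{"--agent-file", "other-agent.yaml"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("mcp=%s agent=%s\n", cfg.McpConfigFile, cfg.AgentFile)
}
```

With both values present in the config and only one flag passed, the sketch keeps the config value for the flag that was omitted.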

Summary by CodeRabbit

New Features
- Added two CLI flags: --mcp-config-file and --agent-file to override MCP config and specify an agent file when running evaluations.
- Both flags accept file paths (relative paths are resolved to absolute), and specifying an agent file runs the evaluation using that file-backed agent.


coderabbitai · 2025-11-19T20:50:26Z

Walkthrough

Adds two new CLI flags (--mcp-config-file, --agent-file) to the run command. Provided file paths are resolved to absolute paths and applied as overrides to the loaded eval config before the runner is created. Path resolution errors are handled, and spec.Config.Agent and Agent.Type are set when an agent file is provided.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CLI flags & config overrides `pkg/cli/run.go` | Adds `--mcp-config-file` and `--agent-file` flags; introduces `overrideFile` helper to resolve relative paths to absolute and apply them to `spec.Config.McpConfigFile` and `spec.Config.Agent.Path`; ensures `spec.Config.Agent` exists and sets `Agent.Type = "file"` for agent overrides; applies overrides after config load and before runner creation with error handling. |

Sequence Diagram(s)

sequenceDiagram
  participant CLI
  participant Loader
  participant Override
  participant Runner

  CLI->>Loader: parse flags & load eval config
  Loader-->>CLI: eval config
  CLI->>Override: if --mcp-config-file set -> resolve path
  CLI->>Override: if --agent-file set -> ensure Agent, set Type=file, resolve path
  Override-->>CLI: apply resolved paths into eval config (or return error)
  CLI->>Runner: create runner with modified config
  Runner-->>CLI: runner started / error
  note over Override: Path resolution may return errors (handled before runner creation)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Single-file change with straightforward additions.
Pay attention to:
- Path resolution and error messages.
- Correct initialization of spec.Config.Agent and setting Agent.Type.
- That absolute-path conversion preserves empty/no-override behavior.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding two new CLI flags (--agent-file and --mcp-config-file) to the eval command, which aligns perfectly with the code changes. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

pkg/cli/run.go (1)
39-62: Override logic works correctly.

The implementation properly resolves relative paths to absolute paths and applies overrides only when flags are provided. Error handling is appropriate.

Consider this minor refactor for clarity in the overrideFile helper:
 overrideFile := func(specFile *string, fileName string) error {
     if fileName != "" {
         if !filepath.IsAbs(fileName) {
             absPath, err := filepath.Abs(fileName)
             if err != nil {
                 return err
             }
-            fileName = absPath
+            *specFile = absPath
+        } else {
+            *specFile = fileName
         }
-        *specFile = fileName
     }
     return nil
 }
Note: Relative paths are resolved relative to the current working directory, not the eval config file location. This is standard filepath.Abs() behavior but may be unexpected for users who assume paths are relative to the config file.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6a129cc and c90760d.

📒 Files selected for processing (1)

pkg/cli/run.go (3 hunks)

🔇 Additional comments (2)

pkg/cli/run.go (2)

22-23: LGTM!

Variable declarations for the new CLI flags are clear and follow the established pattern.

100-101: LGTM!

Flag declarations are clear, well-documented, and align perfectly with the PR objectives.

jrangelramos · 2025-11-24T15:24:43Z

@Cali0707

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

pkg/cli/run.go (1)
39-49: Consider simplifying the path assignment logic.

The local fileName variable is reassigned before being used, which works but could be clearer.

Apply this diff to make the logic more direct:
 overrideFile := func(specFile *string, fileName string) error {
   if !filepath.IsAbs(fileName) {
     absPath, err := filepath.Abs(fileName)
     if err != nil {
       return err
     }
-    fileName = absPath
+    *specFile = absPath
+  } else {
+    *specFile = fileName
   }
-  *specFile = fileName
   return nil
 }

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between c90760d and e1eb6b2.

📒 Files selected for processing (1)

pkg/cli/run.go (3 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

pkg/cli/run.go (2)

pkg/openaiagent/agent.go (1)

Agent (13-15)

pkg/eval/config.go (1)

AgentRef (41-53)

🔇 Additional comments (4)

pkg/cli/run.go (4)

22-23: LGTM!

The flag variable declarations are straightforward and consistent with the existing flag patterns in the function.

58-68: LGTM! Type override is working as intended.

The agent override logic correctly:
- Ensures the Agent reference exists before setting fields
- Sets Type to "file" unconditionally, which aligns with the PR objective of overriding the config value when the flag is provided
- Preserves other fields like Model if they were previously set
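
The three points above can be sketched as follows. This is a simplified illustration, not the code under review: `AgentRef` and `Config` here are reduced, hypothetical versions of the project's types, `overrideAgent` is an invented helper name, and the initial `Type` and `Model` values are placeholders.

```go
package main

import "fmt"

// AgentRef is a reduced, hypothetical version of the AgentRef type referenced
// in pkg/eval/config.go; the real type may have different fields.
type AgentRef struct {
	Type  string
	Path  string
	Model string
}

// Config is a hypothetical stand-in for spec.Config.
type Config struct {
	Agent *AgentRef
}

// overrideAgent applies an agent file override: it creates the Agent
// reference if it is missing, forces Type to "file", sets Path, and leaves
// every other field (such as Model) as it was.
func overrideAgent(cfg *Config, agentPath string) {
	if agentPath == "" {
		return // no flag given, keep the config untouched
	}
	if cfg.Agent == nil {
		cfg.Agent = &AgentRef{}
	}
	cfg.Agent.Type = "file"
	cfg.Agent.Path = agentPath
}

func main() {
	// Placeholder values for illustration.
	cfg := &Config{Agent: &AgentRef{Type: "builtin.claude-code", Model: "some-model"}}
	overrideAgent(cfg, "/abs/path/agent.yaml")
	fmt.Printf("%+v\n", *cfg.Agent)

	// A config with no agent section still ends up with a usable reference.
	empty := &Config{}
	overrideAgent(empty, "/abs/path/agent.yaml")
	fmt.Printf("%+v\n", *empty.Agent)
}
```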

106-107: LGTM!

The flag bindings are correct and the help text clearly communicates that these flags override values from the eval config.

50-56: File existence validation is properly handled downstream.

The MCP config file override logic correctly resolves the path via the overrideFile function. File existence validation occurs later in ParseConfigFile (pkg/mcpproxy/config.go:62), which uses os.ReadFile() and returns a clear error if the file is missing or unreadable. This is the appropriate place for validation since the file is read at that point, providing accurate error handling.

Cali0707 · 2025-11-25T19:26:53Z

Hey @jrangelramos, this looks good to me overall. However, with the addition of "builtin" agents in #38, how would you picture overriding to a built-in agent? Would I need to create an agent.yaml referencing the built-in agent, or would there be some way with the flag to specify that I want a "builtin.claude-code" or similar agent?

IMO, we should have a way to support the second if it doesn't overcomplicate the UX of this command too much. WDYT?

feat: add agent and mcp config file path flag to eval cmd

e1eb6b2

jrangelramos force-pushed the eval-path-flags branch from c90760d to e1eb6b2 on November 25, 2025 18:16
