chore: pass in mocker engine args directly in python cli + default frontend port to 8000 #2853

PeaBrane · 2025-09-03T23:59:50Z

Overview:

This PR makes the mocker CLI mirror the vLLM CLI more closely and improves ergonomics.

Usage now looks exactly like launching python -m dynamo.vllm, for example:

python -m dynamo.mocker \
  --model-path TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
  --num-gpu-blocks-override 8192 \
  --block-size 16 \
  --speedup-ratio 10.0 \
  --max-num-seqs 512 \
  --enable-prefix-caching

Tangential CLI change: make the default frontend port 8000 (k8 default).

Summary by CodeRabbit

New Features
- Introduced a vLLM-style CLI for the mocker backend with per-argument flags (e.g., --num-gpu-blocks-override, --block-size, --max-num-seqs, --max-num-batched-tokens, --enable-prefix-caching, --enable-chunked-prefill, --watermark, --speedup-ratio, --data-parallel-size).
- Supports passing options directly via CLI or via legacy JSON using --extra-engine-args (CLI flags are overridden by the JSON file).
- Improved logging and automatic cleanup of temporary config files.
Documentation
- Reorganized and expanded mocker README with required arguments, full option reference, usage examples, and legacy JSON guidance.

Signed-off-by: PeaBrane <yanrpei@gmail.com>

coderabbitai · 2025-09-04T00:08:52Z

Walkthrough

Updates introduce a vLLM-style CLI for the mocker backend, generating a temporary JSON of engine args when not provided. The worker selects between user-supplied JSON and generated args, with cleanup. Tests switch from file-based configs to in-memory dicts driving CLI flags. Documentation reflects the new interface.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Docs: mocker CLI `components/backends/mocker/README.md` | Rewrites mocker usage to a vLLM-style CLI; adds required args, detailed MockEngineArgs options, examples, legacy JSON support, and formatting updates. |
| Backend: mocker worker + CLI `components/backends/mocker/src/dynamo/mocker/main.py` | Adds create_temp_engine_args_file(args) to serialize non-None CLI options to JSON; worker prefers user --extra-engine-args, otherwise generates temp; ensures cleanup; extends CLI with vLLM-like flags; adds logging and updated descriptions. |
| Tests: router e2e with mockers `tests/router/test_router_e2e_with_mockers.py` | Replaces file-based mocker_args.json with in-memory dict passed to MockerProcess; builds CLI flags per key; updates constructor signature; refactors test setups accordingly. |
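
As a rough illustration of the "serialize non-None CLI options to JSON" step in the summary above, a minimal sketch might look like the following. Only the function name create_temp_engine_args_file comes from this PR; the body, the ENGINE_ARG_NAMES list, and the example values are assumptions made for illustration and are not the code in main.py.

```python
import argparse
import json
import tempfile
from pathlib import Path

# Hypothetical subset of engine-arg names, borrowed from the flags listed in the PR summary.
ENGINE_ARG_NAMES = [
    "num_gpu_blocks_override",
    "block_size",
    "max_num_seqs",
    "speedup_ratio",
    "enable_prefix_caching",
]


def create_temp_engine_args_file(args: argparse.Namespace) -> Path:
    """Write every engine arg that was actually set (not None) to a temp JSON file."""
    engine_args = {
        name: getattr(args, name)
        for name in ENGINE_ARG_NAMES
        if getattr(args, name, None) is not None
    }
    # delete=False keeps the file on disk after the `with` block; the caller removes it.
    with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f:
        json.dump(engine_args, f)
        return Path(f.name)


# Example: max_num_seqs is None, so it is left out of the JSON.
args = argparse.Namespace(block_size=16, speedup_ratio=10.0, max_num_seqs=None)
path = create_temp_engine_args_file(args)
contents = json.loads(path.read_text())
path.unlink()  # the caller is responsible for cleanup in this sketch
```

Skipping None values means unset flags fall back to whatever defaults the engine applies, rather than being overwritten with nulls.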

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User as User/Tests
  participant CLI as mocker.main.py (CLI)
  participant FS as Temp JSON (MockEngineArgs)
  participant Eng as Engine Entrypoint

  User->>CLI: Invoke mocker with CLI flags and/or --extra-engine-args
  alt --extra-engine-args provided
    CLI->>Eng: Build args using provided JSON path
  else Generate from CLI flags
    CLI->>FS: Serialize non-None MockEngineArgs to temp JSON
    CLI->>Eng: Build args using temp JSON path
  end
  Eng->>Eng: Start engine with MockEngineArgs
  note over CLI,FS: Finally: delete temp JSON if created
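
The flow in the diagram can be sketched in Python roughly as follows. This is an assumption-laden outline, not the worker code from the PR: run_with_engine_args, start_engine, and fake_engine are hypothetical names, and the real entrypoint takes more parameters.

```python
import json
import os
import tempfile
from typing import Callable, Optional


def run_with_engine_args(
    extra_engine_args: Optional[str],
    cli_engine_args: dict,
    start_engine: Callable[[str], None],
) -> None:
    """Prefer a user-supplied JSON path; otherwise generate a temp file and clean it up."""
    temp_path = None
    if extra_engine_args is not None:
        args_path = extra_engine_args
    else:
        fd, temp_path = tempfile.mkstemp(suffix=".json")
        with os.fdopen(fd, "w") as f:
            # Only options that were explicitly set are serialized.
            json.dump({k: v for k, v in cli_engine_args.items() if v is not None}, f)
        args_path = temp_path
    try:
        start_engine(args_path)
    finally:
        # Delete the temp JSON only if this function created it.
        if temp_path is not None and os.path.exists(temp_path):
            os.unlink(temp_path)


# Stand-in engine that just records what it was given.
seen = {}


def fake_engine(path: str) -> None:
    seen["path"] = path
    with open(path) as f:
        seen["args"] = json.load(f)


run_with_engine_args(None, {"block_size": 16, "watermark": None}, fake_engine)
```

A user-supplied file is never deleted here, because temp_path stays None on that branch.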

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

test: add router e2e test with mockers to per-merge ci #2073 — Introduced router E2E mocker test using file-based mocker_args; this PR refactors that flow to in-memory dicts and CLI flags.
feat: integrate mocker with dynamo-run and python cli #1927 — Worked on JSON/CLI plumbing for MockEngineArgs; closely related to new --extra-engine-args and serialization.
feat: enable / disable chunked prefill for mockers #2015 — Added enable_chunked_prefill option; this PR wires that through CLI and tests.

Poem

A rabbit twitches whiskers, bright,
New flags bloom softly in the night.
JSON dreams now temp and clean,
Engines hum with vLLM sheen.
Tests hop light—no files to tend,
CLI crumbs lead to the end.
Thump-thump: shipped, my furry friend! 🐇✨

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

components/backends/mocker/src/dynamo/mocker/main.py (1)
1-225: Install Black, add it to dev dependencies, and format main.py
Add black to the project’s development dependencies (e.g. in requirements-dev.txt or pyproject.toml) so it’s available in CI.
Run
black components/backends/mocker/src/dynamo/mocker/main.py
and commit the resulting changes.
Ruff fixes have been applied; verify no lint errors remain.
tests/router/test_router_e2e_with_mockers.py (1)

1-1230: Require Black installation or alternative invocation

The standalone black command wasn’t found (error: black: command not found), so formatting didn’t run. Add Black to the project’s dev dependencies or update the CI/dev instructions to invoke it via python -m black, then format tests/router/test_router_e2e_with_mockers.py.

Ruff fixes have been applied successfully (no remaining lint issues).

🧹 Nitpick comments (2)

components/backends/mocker/README.md (1)
14-14: Consider removing trailing colons from headings

The trailing colons in headings violate Markdown best practices (MD026). While this is a minor issue, removing them would improve consistency with standard Markdown formatting.
-### Required arguments:
+### Required arguments
-### MockEngineArgs parameters (vLLM-style):
+### MockEngineArgs parameters (vLLM-style)
-### Example with individual arguments (vLLM-style):
+### Example with individual arguments (vLLM-style)
-### Legacy JSON file support:
+### Legacy JSON file support
Also applies to: 17-17, 28-28, 43-43
components/backends/mocker/src/dynamo/mocker/main.py (1)

29-66: Consider using dataclasses for MockEngineArgs serialization

The current implementation manually checks each attribute and builds the dictionary. Consider using a dataclass or Pydantic model for MockEngineArgs to make the serialization more maintainable and type-safe.

Would you like me to generate a dataclass-based implementation that would make this code more maintainable and reduce the repetitive hasattr checks?
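
One possible shape for the dataclass-based approach suggested above is sketched here. MockEngineArgsSketch is a hypothetical Python stand-in whose field names are taken from the CLI flags listed in the PR summary; it is not the project's actual MockEngineArgs definition, and all defaults are deliberately left as None.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class MockEngineArgsSketch:
    """Hypothetical stand-in; every field is optional so unset values can be dropped."""

    num_gpu_blocks_override: Optional[int] = None
    block_size: Optional[int] = None
    max_num_seqs: Optional[int] = None
    max_num_batched_tokens: Optional[int] = None
    enable_prefix_caching: Optional[bool] = None
    enable_chunked_prefill: Optional[bool] = None
    watermark: Optional[float] = None
    speedup_ratio: Optional[float] = None

    def to_json(self) -> str:
        # asdict() replaces per-attribute hasattr checks with a single pass over the fields.
        return json.dumps({k: v for k, v in asdict(self).items() if v is not None})


payload = MockEngineArgsSketch(block_size=16, enable_prefix_caching=True).to_json()
```

With this layout, adding a new option means adding one field rather than another hand-written check.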

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eb317cd and 20f4ddc.

📒 Files selected for processing (3)

components/backends/mocker/README.md (1 hunks)
components/backends/mocker/src/dynamo/mocker/main.py (3 hunks)
tests/router/test_router_e2e_with_mockers.py (11 hunks)

🧰 Additional context used

🧠 Learnings (1)

📓 Common learnings

Learnt from: PeaBrane
PR: ai-dynamo/dynamo#2756
File: lib/llm/src/kv_router/subscriber.rs:36-44
Timestamp: 2025-08-29T10:03:48.330Z
Learning: PeaBrane prefers to keep PRs contained in scope and is willing to defer technical improvements to future PRs when the current implementation works for the immediate use case. They acknowledge technical debt but prioritize deliverability over completeness in individual PRs.

🧬 Code graph analysis (2)

tests/router/test_router_e2e_with_mockers.py (1)

tests/conftest.py (1)

logger (124-135)

components/backends/mocker/src/dynamo/mocker/main.py (1)

lib/llm/src/local_model.rs (6)

extra_engine_args (174-177)

model_path (93-96)

model_name (98-101)

endpoint_id (108-111)

endpoint_id (390-392)

default (67-89)

🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/2853/merge) by PeaBrane.

tests/router/test_router_e2e_with_mockers.py

[error] 1-1: Black formatting changes detected. Run 'black' to reformat this file.

[error] 1-1: Ruff linting issues auto-fixed in this file during pre-commit.

components/backends/mocker/src/dynamo/mocker/main.py

[error] 1-1: Black formatting changes detected. Run 'black' to reformat this file.

[error] 1-1: Ruff linting issues auto-fixed in this file during pre-commit.

🪛 LanguageTool

components/backends/mocker/README.md

[style] ~12-~12: This phrase is redundant (‘I’ stands for ‘interface’). Use simply “CLI”.
Context: ...mocker engine now supports a vLLM-style CLI interface with individual arguments for all confi...

(ACRONYM_TAUTOLOGY)

[grammar] ~14-~14: There might be a mistake here.
Context: ...ration options. ### Required arguments: - --model-path: Path to model directory or HuggingFace...

(QB_NEW_EN)

[grammar] ~17-~17: There might be a mistake here.
Context: ... MockEngineArgs parameters (vLLM-style): - --num-gpu-blocks-override: Number of GPU blocks for KV cache (def...

(QB_NEW_EN)

[grammar] ~18-~18: There might be a mistake here.
Context: ...GPU blocks for KV cache (default: 16384) - --block-size: Token block size for KV cache blocks (...

(QB_NEW_EN)

[grammar] ~19-~19: There might be a mistake here.
Context: ...k size for KV cache blocks (default: 64) - --max-num-seqs: Maximum number of sequences per iterat...

(QB_NEW_EN)

[grammar] ~20-~20: There might be a mistake here.
Context: ...f sequences per iteration (default: 256) - --max-num-batched-tokens: Maximum number of batched tokens per i...

(QB_NEW_EN)

[grammar] ~23-~23: There might be a mistake here.
Context: .../disable chunked prefill (default: True) - --watermark: KV cache watermark threshold as a frac...

(QB_NEW_EN)

[grammar] ~24-~24: There might be a mistake here.
Context: ... threshold as a fraction (default: 0.01) - --speedup-ratio: Speed multiplier for token generation ...

(QB_NEW_EN)

[grammar] ~43-~43: There might be a mistake here.
Context: ... 8080 ``` ### Legacy JSON file support: For backward compatibility, you can stil...

(QB_NEW_EN)

🪛 markdownlint-cli2 (0.17.2)

components/backends/mocker/README.md

14-14: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)

17-17: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)

28-28: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)

43-43: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)

🪛 Ruff (0.12.2)

components/backends/mocker/src/dynamo/mocker/main.py

80-80: f-string without any placeholders

Remove extraneous f prefix

(F541)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Build and Test - vllm
GitHub Check: Build and Test - dynamo

🔇 Additional comments (5)

components/backends/mocker/README.md (1)

10-12: Documentation is clear and well-structured

The transition to a vLLM-style CLI is well documented, with clear examples showing both the new individual-argument approach and legacy JSON file support for backward compatibility.

components/backends/mocker/src/dynamo/mocker/main.py (2)

161-173: Boolean flag handling looks correct

The implementation properly handles the vLLM-style boolean flags with both positive and negative forms (--enable-prefix-caching/--no-enable-prefix-caching), correctly setting the destination attribute.

Also applies to: 175-187
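
For readers unfamiliar with paired positive/negative flags, here is a minimal sketch using argparse.BooleanOptionalAction (Python 3.9+). It is an assumption that this matches the PR; main.py may instead register separate store_true and store_false arguments that share a dest. The parser name mocker-sketch is made up.

```python
import argparse

parser = argparse.ArgumentParser(prog="mocker-sketch")
parser.add_argument(
    "--enable-prefix-caching",
    action=argparse.BooleanOptionalAction,  # also registers --no-enable-prefix-caching
    default=None,  # None means "not specified", so the option can be skipped when serializing
    dest="enable_prefix_caching",
)
parser.add_argument("--block-size", type=int, default=None)

on = parser.parse_args(["--enable-prefix-caching"])
off = parser.parse_args(["--no-enable-prefix-caching", "--block-size", "16"])
unset = parser.parse_args([])
```

Defaulting to None rather than False keeps "explicitly disabled" distinguishable from "not passed".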

82-104: Good resource cleanup implementation

The try/finally block ensures proper cleanup of the temporary file, with appropriate error handling and logging. This prevents resource leaks.

tests/router/test_router_e2e_with_mockers.py (2)

52-101: Clean refactoring to dictionary-based configuration

The transition from file-based to in-memory dictionary configuration is well-implemented. The mapping from dictionary keys to CLI arguments is correct and maintains backward compatibility by defaulting to an empty dictionary when mocker_args is None.
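
The dictionary-to-flag mapping described above could be sketched like this. build_mocker_flags is a hypothetical helper written for illustration; the actual logic lives inside MockerProcess in the test file and may differ, particularly in how booleans are handled.

```python
from typing import Any, Dict, List, Optional


def build_mocker_flags(mocker_args: Optional[Dict[str, Any]] = None) -> List[str]:
    """Turn {"block_size": 16} into ["--block-size", "16"]; None becomes no flags."""
    flags: List[str] = []
    for key, value in (mocker_args or {}).items():
        name = key.replace("_", "-")
        if isinstance(value, bool):
            # Booleans map to a bare flag or its --no- counterpart.
            flags.append(f"--{name}" if value else f"--no-{name}")
        elif value is not None:
            flags.extend([f"--{name}", str(value)])
    return flags


command = [
    "python", "-m", "dynamo.mocker",
    "--model-path", "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
]
command += build_mocker_flags(
    {"speedup_ratio": 10.0, "block_size": 16, "enable_prefix_caching": True}
)
```

The resulting list can be handed to subprocess.Popen without shell quoting concerns.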

322-324: Good test coverage for new CLI interface

All test methods have been properly updated to use the new dictionary-based mocker_args configuration. The tests appropriately cover different scenarios including:

Standard configuration with speedup_ratio and block_size

Resource-limited configuration for overload testing

Multiple mocker instances

Also applies to: 377-379, 444-449, 609-611, 833-835

alec-flowers

Awesome!

first commit

20f4ddc

Signed-off-by: PeaBrane <yanrpei@gmail.com>

PeaBrane requested review from a team, GuanLuo, alec-flowers, biswapanda, grahamking, hhzhang16, ishandhanani, kkranen, nnshah1, paulhendricks, piotrm-nvidia, ptarasiewiczNV, rmccorm4, ryanolson, tanmayv25, tedzhouhk and tmonty12 as code owners September 3, 2025 23:59

pull-request-size bot added the size/L label Sep 3, 2025

github-actions bot added the chore label Sep 3, 2025

black

8ac10ed

Signed-off-by: PeaBrane <yanrpei@gmail.com>

copy-pr-bot bot temporarily deployed to GITLAB September 4, 2025 00:03 Inactive

coderabbitai bot reviewed Sep 4, 2025

components/backends/mocker/src/dynamo/mocker/main.py (outdated, resolved)

more concise args extraction

64a3361

Signed-off-by: PeaBrane <yanrpei@gmail.com>

copy-pr-bot bot temporarily deployed to GITLAB September 4, 2025 00:09 Inactive

black + ruff

bbb3d9d

Signed-off-by: PeaBrane <yanrpei@gmail.com>

copy-pr-bot bot temporarily deployed to GITLAB September 4, 2025 00:10 Inactive

copy-pr-bot bot temporarily deployed to GITLAB September 4, 2025 00:15 Inactive

change default frontend http port to 8000

c74772e

Signed-off-by: PeaBrane <yanrpei@gmail.com>

copy-pr-bot bot temporarily deployed to GITLAB September 4, 2025 00:42 Inactive

copy-pr-bot bot temporarily deployed to GITLAB September 4, 2025 00:43 Inactive

tedzhouhk approved these changes Sep 4, 2025

PeaBrane enabled auto-merge (squash) September 4, 2025 00:46

PeaBrane disabled auto-merge September 4, 2025 00:51

PeaBrane changed the title ~~chore: pass in mocker engine args directly in python cli~~ chore: pass in mocker engine args directly in python cli + default frontend port to 8000 Sep 4, 2025

update tests to use port 8000

964d57b

Signed-off-by: PeaBrane <yanrpei@gmail.com>

copy-pr-bot bot temporarily deployed to GITLAB September 4, 2025 00:59 Inactive

copy-pr-bot bot temporarily deployed to GITLAB September 4, 2025 01:00 Inactive

fix mocker args extraction

09591ea

Signed-off-by: PeaBrane <yanrpei@gmail.com>

copy-pr-bot bot temporarily deployed to GITLAB September 4, 2025 02:36 Inactive

PeaBrane enabled auto-merge (squash) September 4, 2025 02:37

copy-pr-bot bot temporarily deployed to GITLAB September 4, 2025 02:38 Inactive

alec-flowers approved these changes Sep 4, 2025

PeaBrane disabled auto-merge September 4, 2025 03:07

add mock startup time

f9c6d5f

Signed-off-by: PeaBrane <yanrpei@gmail.com>

PeaBrane requested a review from a team as a code owner September 4, 2025 03:52

copy-pr-bot bot temporarily deployed to GITLAB September 4, 2025 03:52 Inactive

PeaBrane enabled auto-merge (squash) September 4, 2025 03:52

copy-pr-bot bot temporarily deployed to GITLAB September 4, 2025 03:52 Inactive

PeaBrane merged commit 3c7c1d6 into main Sep 4, 2025
14 of 15 checks passed

PeaBrane deleted the rupei/clean-mocker-cli branch September 4, 2025 04:26

PeaBrane mentioned this pull request Sep 4, 2025

docs: change docs to default port 8000 #2876

Merged

dillon-cullinan pushed a commit that referenced this pull request Sep 5, 2025

chore: pass in mocker engine args directly in python cli + default fr…

e728ab8

…ontend port to 8000 (#2853) Signed-off-by: PeaBrane <yanrpei@gmail.com>

nnshah1 pushed a commit that referenced this pull request Sep 8, 2025

chore: pass in mocker engine args directly in python cli + default fr…

2d796bf

…ontend port to 8000 (#2853) Signed-off-by: PeaBrane <yanrpei@gmail.com> Signed-off-by: nnshah1 <neelays@nvidia.com>
