chore(dynamo-run): Refactor to library #1687

grahamking · 2025-06-27T21:07:24Z

Move much of what was in the dynamo-run crate into dynamo-llm so that everyone can use it.

Example usage:

Create a LocalModel:

    let local_model = LocalModelBuilder::default()
        .model_path("Qwen/Qwen3-0.6B")
        .http_port(8080)
        .build().await?;

Make an engine:

    let engine_config = EngineConfig::StaticFull {
        engine: dynamo_engine_mistralrs::make_engine(&local_model).await?,
        model: Box::new(local_model),
    };

Connect it to an input and run it:

    dynamo_llm::entrypoint::input::run_input(Input::Http, runtime, engine_config).await?;

For #1647

Summary by CodeRabbit

New Features
- Introduced a flexible builder pattern for local model configuration, allowing advanced customization and easier initialization.
- Added new input modes and unified input handling, supporting interactive chat, HTTP server, batch file, and distributed endpoint modes.
- Centralized engine configuration and routing, enabling more extensible and maintainable engine management.

Refactor
- Simplified and modularized the codebase by moving input and engine logic into dedicated modules.
- Replaced direct model construction with an asynchronous builder for improved clarity and extensibility.
- Streamlined configuration and validation for flags and router settings.

Bug Fixes
- Added validation to prevent incompatible input and output combinations in endpoint and dynamic modes.

Chores
- Updated and reorganized dependencies to reflect new input and configuration features.

coderabbitai · 2025-06-27T21:11:54Z

Walkthrough

This change refactors the Dynamo runner and input handling architecture. It introduces a builder pattern for local model construction, centralizes engine configuration, and migrates input mode management and routing into the core library. Several modules and methods are removed or relocated, and subprocess management is streamlined for clarity and modularity.

Changes

| Files/Groups | Change Summary |
|---|---|
| `launch/dynamo-run/Cargo.toml`, `lib/llm/Cargo.toml` | Moved `humantime` and `dialoguer` dependencies from `dynamo-run` to `llm` crate. |
| `launch/dynamo-run/src/flags.rs` | Removed `kv_router_config` and `as_vec` methods; added `validate` and `router_config` methods; updated imports. |
| `launch/dynamo-run/src/input.rs`, `launch/dynamo-run/src/opt.rs` | Removed the entire `input` module and the `Input` enum/impls from CLI. |
| `launch/dynamo-run/src/lib.rs` | Refactored `run` to use new engine creation and input handling; added helper and async functions for engine/subprocess management. |
| `launch/dynamo-run/src/main.rs` | Changed `Input` import; added validation to prevent dynamic endpoint for both input and output. |
| `launch/dynamo-run/src/subprocess.rs` | Updated subprocess start function to remove endpoint parameter and use model endpoint. |
| `launch/llmctl/src/main.rs` | Switched to `LocalModelBuilder` for model creation. |
| `lib/llm/src/lib.rs` | Added new public `entrypoint` module. |
| `lib/llm/src/entrypoint.rs`, `lib/llm/src/entrypoint/input.rs` | Added new modules for engine configuration and input routing; defined `EngineConfig`, `RouterConfig`, and `Input` enum. |
| `lib/llm/src/entrypoint/input/batch.rs`, `lib/llm/src/entrypoint/input/http.rs`, `lib/llm/src/entrypoint/input/text.rs` | Refactored input handlers to use new configuration and prepared engine structures; simplified signatures. |
| `lib/llm/src/entrypoint/input/common.rs` | Added `card` and `request_template` to `PreparedEngine`; added `has_tokenizer` method. |
| `lib/llm/src/entrypoint/input/endpoint.rs` | Updated imports and pattern matching for new engine config structure. |
| `lib/llm/src/local_model.rs` | Introduced `LocalModelBuilder` for constructing `LocalModel`; added new fields and methods; refactored model preparation. |

Sequence Diagram(s)

sequenceDiagram
    participant CLI
    participant LocalModelBuilder
    participant EngineConfig
    participant InputRouter
    participant Runtime

    CLI->>LocalModelBuilder: Configure and build LocalModel
    LocalModelBuilder-->>CLI: Returns LocalModel
    CLI->>EngineConfig: Create engine using LocalModel & flags
    EngineConfig-->>CLI: Returns EngineConfig
    CLI->>InputRouter: Run input handling (with Input, EngineConfig, Runtime)
    InputRouter->>Runtime: Dispatches to specific input handler (http, text, batch, endpoint)
    Runtime-->>InputRouter: Handles requests/responses
    InputRouter-->>CLI: Completes input handling
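
The dispatch step in the diagram can be pictured as a single `match` over the input mode. The following is only a rough sketch: the `Input` enum here is a hypothetical stand-in whose variant payloads are assumptions, and `handler_for` is an illustrative helper, not a function from `dynamo-llm`.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for the library's `Input` enum.
/// The variant payloads are assumptions made for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Input {
    Http,
    Text,
    Batch(PathBuf),
    Endpoint(String),
}

/// Returns the name of the handler that would serve the given input mode.
/// A real router would call into the handler instead of naming it.
fn handler_for(input: &Input) -> &'static str {
    match input {
        Input::Http => "http",
        Input::Text => "text",
        Input::Batch(_) => "batch",
        Input::Endpoint(_) => "endpoint",
    }
}

fn main() {
    let inputs = [
        Input::Http,
        Input::Text,
        Input::Batch(PathBuf::from("prompts.jsonl")),
        Input::Endpoint("ns.comp.ep".to_string()),
    ];
    for input in &inputs {
        println!("{:?} -> {} handler", input, handler_for(input));
    }
}
```

Keeping the dispatch in one exhaustive `match` means the compiler flags every call site when a new input mode is added.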

Possibly related PRs

ai-dynamo/dynamo#1259: Adds and modifies router configuration methods in Flags, directly related to the router config changes in this PR.
ai-dynamo/dynamo#1335: Introduces random internal endpoint generation, which is further extended and refactored in this PR.
ai-dynamo/dynamo#1328: Adjusts default engine selection logic, which is refactored and expanded upon in this PR.

Poem

A builder appears with a hop and a bound,
Models and engines now neatly unbound.
Inputs rerouted, subprocesses clear,
The codebase is lighter—let’s all give a cheer!
With rabbits and runners and endpoints anew,
Dynamo’s more nimble, and faster too!
🐇✨


coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (1)

launch/dynamo-run/src/lib.rs (1)
48-51: Improve error handling for endpoint parsing.

The endpoint parsing on line 50 uses parse()? which could fail with a generic error. Consider providing a more specific error message.
-        builder.endpoint_id(path.parse()?);
+        builder.endpoint_id(path.parse()
+            .map_err(|e| anyhow::anyhow!("Failed to parse endpoint path '{}': {}", path, e))?);
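
The same idea can be tried without any dependencies. This is a minimal sketch, assuming a hypothetical `EndpointId` with a three-part dotted format and using `String` errors in place of `anyhow`; neither the type nor its format is taken from the PR.

```rust
use std::str::FromStr;

/// Hypothetical endpoint identifier; the dotted three-part format is an
/// assumption for this sketch, not the real `EndpointId` grammar.
#[derive(Debug, PartialEq)]
struct EndpointId {
    namespace: String,
    component: String,
    name: String,
}

impl FromStr for EndpointId {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let parts: Vec<&str> = s.split('.').collect();
        match parts.as_slice() {
            [ns, comp, name] if !ns.is_empty() && !comp.is_empty() && !name.is_empty() => {
                Ok(EndpointId {
                    namespace: ns.to_string(),
                    component: comp.to_string(),
                    name: name.to_string(),
                })
            }
            _ => Err("expected three non-empty dot-separated parts".to_string()),
        }
    }
}

/// Parses `path` and, on failure, wraps the error with the offending input.
fn parse_endpoint(path: &str) -> Result<EndpointId, String> {
    path.parse::<EndpointId>()
        .map_err(|e| format!("Failed to parse endpoint path '{}': {}", path, e))
}

fn main() {
    println!("{:?}", parse_endpoint("ns.comp.ep"));
    println!("{:?}", parse_endpoint("not-an-endpoint"));
}
```

The wrapped message carries the input that failed, which a bare `?` on the parse result would not.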

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 68d7461 and 8b2c336.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (18)

launch/dynamo-run/Cargo.toml (0 hunks)
launch/dynamo-run/src/flags.rs (2 hunks)
launch/dynamo-run/src/input.rs (0 hunks)
launch/dynamo-run/src/lib.rs (3 hunks)
launch/dynamo-run/src/main.rs (2 hunks)
launch/dynamo-run/src/opt.rs (1 hunks)
launch/dynamo-run/src/subprocess.rs (1 hunks)
launch/llmctl/src/main.rs (2 hunks)
lib/llm/Cargo.toml (2 hunks)
lib/llm/src/entrypoint.rs (1 hunks)
lib/llm/src/entrypoint/input.rs (1 hunks)
lib/llm/src/entrypoint/input/batch.rs (5 hunks)
lib/llm/src/entrypoint/input/common.rs (7 hunks)
lib/llm/src/entrypoint/input/endpoint.rs (3 hunks)
lib/llm/src/entrypoint/input/http.rs (2 hunks)
lib/llm/src/entrypoint/input/text.rs (1 hunks)
lib/llm/src/lib.rs (1 hunks)
lib/llm/src/local_model.rs (5 hunks)

💤 Files with no reviewable changes (2)

launch/dynamo-run/Cargo.toml
launch/dynamo-run/src/input.rs

🧰 Additional context used

🧠 Learnings (11)

📓 Common learnings

Learnt from: biswapanda
PR: ai-dynamo/dynamo#1412
File: lib/bindings/python/src/dynamo/runtime/logging.py:100-100
Timestamp: 2025-06-06T21:48:35.214Z
Learning: In the Dynamo codebase, BentoML has been completely removed from all executable code, with only documentation and attribution references remaining. The error_loggers configuration in lib/bindings/python/src/dynamo/runtime/logging.py should not include "bentoml" since those modules no longer exist.

lib/llm/src/entrypoint/input/endpoint.rs (1)

Learnt from: ryanolson
PR: ai-dynamo/dynamo#1093
File: lib/llm/src/block_manager/block/registry.rs:98-122
Timestamp: 2025-05-29T06:20:12.901Z
Learning: In lib/llm/src/block_manager/block/registry.rs, the background task spawned for handling unregister notifications uses detached concurrency by design. The JoinHandle is intentionally not stored as this represents a reasonable architectural tradeoff for a long-running cleanup task.

lib/llm/src/entrypoint/input/text.rs (1)

Learnt from: ishandhanani
PR: ai-dynamo/dynamo#1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.

launch/dynamo-run/src/opt.rs (1)

Learnt from: ishandhanani
PR: ai-dynamo/dynamo#1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.

lib/llm/src/entrypoint/input/batch.rs (1)

Learnt from: ishandhanani
PR: ai-dynamo/dynamo#1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.

lib/llm/src/entrypoint/input/common.rs (1)

Learnt from: ishandhanani
PR: ai-dynamo/dynamo#1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.

launch/dynamo-run/src/subprocess.rs (1)

Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1285
File: lib/llm/src/kv_router/scoring.rs:58-63
Timestamp: 2025-05-30T06:38:09.630Z
Learning: In lib/llm/src/kv_router/scoring.rs, the user prefers to keep the panic behavior when calculating load_avg and variance with empty endpoints rather than adding guards for division by zero. They want the code to fail fast on this error condition.

lib/llm/src/entrypoint/input.rs (1)

Learnt from: ishandhanani
PR: ai-dynamo/dynamo#1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.

launch/dynamo-run/src/flags.rs (1)

Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1285
File: lib/llm/src/kv_router/scoring.rs:58-63
Timestamp: 2025-05-30T06:38:09.630Z
Learning: In lib/llm/src/kv_router/scoring.rs, the user prefers to keep the panic behavior when calculating load_avg and variance with empty endpoints rather than adding guards for division by zero. They want the code to fail fast on this error condition.

launch/dynamo-run/src/lib.rs (4)

Learnt from: oandreeva-nv
PR: ai-dynamo/dynamo#1195
File: lib/llm/tests/block_manager.rs:150-152
Timestamp: 2025-06-02T19:37:27.666Z
Learning: In Rust/Tokio applications, when background tasks use channels for communication, dropping the sender automatically signals task termination when the receiver gets `None`. The `start_batching_publisher` function in `lib/llm/tests/block_manager.rs` demonstrates this pattern: when the `KVBMDynamoRuntimeComponent` is dropped, its `batch_tx` sender is dropped, causing `rx.recv()` to return `None`, which triggers cleanup and task termination.
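
The drop-to-terminate pattern described in this learning can be illustrated with the standard library alone. Note the substitution: this sketch uses `std::sync::mpsc` and a thread rather than Tokio, so `recv()` returns `Err` where a Tokio receiver would return `None`; `run_until_sender_dropped` is a hypothetical helper.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends `items` to a background worker, then drops the sender so the
/// worker's receive loop ends. Returns the sum computed by the worker.
fn run_until_sender_dropped(items: Vec<u32>) -> u32 {
    let (tx, rx) = mpsc::channel::<u32>();

    let worker = thread::spawn(move || {
        let mut sum = 0;
        // recv() fails once every sender is dropped and the queue is empty,
        // which is the worker's signal to clean up and exit.
        while let Ok(value) = rx.recv() {
            sum += value;
        }
        sum
    });

    for item in items {
        tx.send(item).expect("worker should still be receiving");
    }
    // Dropping the only sender is the termination signal.
    drop(tx);

    worker.join().expect("worker thread panicked")
}

fn main() {
    println!("sum = {}", run_until_sender_dropped(vec![1, 2, 3]));
}
```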

Learnt from: ryanolson
PR: ai-dynamo/dynamo#1093
File: lib/llm/src/block_manager/block/registry.rs:98-122
Timestamp: 2025-05-29T06:20:12.901Z
Learning: In lib/llm/src/block_manager/block/registry.rs, the background task spawned for handling unregister notifications uses detached concurrency by design. The JoinHandle is intentionally not stored as this represents a reasonable architectural tradeoff for a long-running cleanup task.

Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1392
File: lib/llm/src/kv_router/scoring.rs:35-46
Timestamp: 2025-06-05T01:02:15.318Z
Learning: In lib/llm/src/kv_router/scoring.rs, PeaBrane prefers panic-based early failure over Result-based error handling for the worker_id() method to catch invalid data early during development.

Learnt from: jthomson04
PR: ai-dynamo/dynamo#1429
File: lib/runtime/src/utils/leader_worker_barrier.rs:69-72
Timestamp: 2025-06-08T03:12:03.985Z
Learning: In the leader-worker barrier implementation in lib/runtime/src/utils/leader_worker_barrier.rs, the `wait_for_key_count` function correctly uses exact equality (`==`) instead of greater-than-or-equal (`>=`) because worker IDs must be unique (enforced by etcd create-only operations), ensuring exactly the expected number of workers can register.

lib/llm/src/local_model.rs (1)

Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1236
File: lib/llm/src/mocker/protocols.rs:85-112
Timestamp: 2025-06-16T20:02:54.935Z
Learning: When using derive_builder::Builder macro, the macro generates the builder struct and its methods, but does NOT generate a `builder()` method on the original struct. A manual `impl StructName { pub fn builder() -> StructNameBuilder { StructNameBuilder::default() } }` is required to provide the convenient `StructName::builder()` API pattern.

🧬 Code Graph Analysis (5)

launch/llmctl/src/main.rs (1)

lib/llm/src/local_model.rs (2)

default (52-64)

model_name (73-76)

lib/llm/src/entrypoint/input/text.rs (2)

lib/llm/src/local_model.rs (2)

request_template (115-120)

request_template (244-246)

lib/llm/src/entrypoint/input/common.rs (1)

prepare_engine (49-123)

lib/llm/src/entrypoint/input/batch.rs (1)

lib/llm/src/entrypoint/input/common.rs (1)

prepare_engine (49-123)

lib/llm/src/entrypoint/input/common.rs (4)

lib/llm/src/entrypoint/input/http.rs (2)

common (66-69)

common (73-76)

lib/llm/src/local_model.rs (4)

request_template (115-120)

request_template (244-246)

card (228-230)

service_name (240-242)

lib/llm/src/entrypoint.rs (2)

local_model (52-59)

new (26-31)

lib/llm/src/engines.rs (1)

make_engine_core (84-86)

launch/dynamo-run/src/subprocess.rs (2)

lib/llm/src/local_model.rs (2)

endpoint_id (83-86)

endpoint_id (263-265)

lib/runtime/src/protocols.rs (1)

as_url (178-183)

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: Mirror Repository to GitLab
GitHub Check: Build and Test - vllm

🔇 Additional comments (30)

lib/llm/src/lib.rs (1)

18-18: LGTM: Clean module declaration

The addition of the entrypoint module declaration properly exposes the new functionality as part of the public API and maintains alphabetical ordering.

lib/llm/src/entrypoint/input/endpoint.rs (3)

6-17: LGTM: Proper relative imports

The change from dynamo_llm to crate imports is correct now that this code resides within the dynamo-llm crate.

24-24: LGTM: Consistent relative import

The update to use crate::entrypoint::EngineConfig maintains consistency with the other relative imports.

82-82: LGTM: Updated pattern matching for new enum structure

The change from EngineConfig::Dynamic to EngineConfig::Dynamic(_) correctly handles the updated enum variant that now contains data.
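
To illustrate why the pattern needs `(_)` once a variant carries data, here is a small sketch. The types are simplified stand-ins: the payload of `Dynamic` is an assumption (the review only says the variant now contains data), and `LocalModel` is reduced to a single hypothetical field.

```rust
/// Hypothetical, trimmed-down model type for this sketch.
#[derive(Debug)]
struct LocalModel {
    name: String,
}

/// Simplified stand-in for `EngineConfig`; the `Dynamic` payload is assumed.
#[derive(Debug)]
enum EngineConfig {
    Dynamic(Box<LocalModel>),
    StaticFull { model: Box<LocalModel> },
}

impl EngineConfig {
    /// A tuple variant must be matched with its payload, here ignored via `_`.
    fn is_dynamic(&self) -> bool {
        matches!(self, EngineConfig::Dynamic(_))
    }

    /// Every variant carries a model, so the accessor can be total.
    fn local_model(&self) -> &LocalModel {
        match self {
            EngineConfig::Dynamic(model) => model.as_ref(),
            EngineConfig::StaticFull { model } => model.as_ref(),
        }
    }
}

fn main() {
    let config = EngineConfig::Dynamic(Box::new(LocalModel {
        name: "Qwen/Qwen3-0.6B".to_string(),
    }));
    println!("dynamic: {}, model: {}", config.is_dynamic(), config.local_model().name);
}
```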

launch/llmctl/src/main.rs (2)

9-9: LGTM: Updated import for builder pattern

The import change from LocalModel to LocalModelBuilder aligns with the new builder pattern approach.

230-233: LGTM: Proper builder pattern implementation

The refactoring from LocalModel::with_name_only(&model_name) to the builder pattern is correctly implemented:

Uses LocalModelBuilder::default() for initialization

Properly sets the model name with model_name(Some(model_name))

Calls .build().await? for asynchronous construction with proper error handling

lib/llm/Cargo.toml (2)

56-56: LGTM: Added workspace dependency for batch functionality

The addition of humantime as a workspace dependency with clear commenting supports the new input/batch processing features.

84-85: LGTM: Well-configured dependency for interactive input

The dialoguer dependency is properly configured with:

Disabled default features to minimize bloat

Enabled "editor" and "history" features for rich interactive text input

Clear section commenting indicating its purpose

launch/dynamo-run/src/subprocess.rs (2)

21-30: LGTM: Simplified function signature with better encapsulation

The removal of the endpoint: &EndpointId parameter simplifies the function interface by leveraging the endpoint ID now encapsulated within the LocalModel.

40-40: LGTM: Consistent use of encapsulated endpoint

The change from endpoint.as_url() to local_model.endpoint_id().as_url() correctly uses the endpoint ID from the LocalModel, maintaining the same functionality while improving encapsulation.

launch/dynamo-run/src/opt.rs (1)

1-113: Clean refactoring: Input enum successfully migrated to library crate.

The removal of the Input enum and its associated implementations is correctly executed as part of the migration to dynamo_llm::entrypoint::input. The remaining Output enum and its implementations are intact and functional.

launch/dynamo-run/src/main.rs (3)

20-21: Import update correctly reflects the refactoring.

The import change from local dynamo_run::{Input, Output} to dynamo_llm::entrypoint::input::Input and dynamo_run::Output properly reflects the migration of Input to the library crate.

131-133: Validation logic prevents incompatible input/output combinations.

The validation correctly prevents using endpoint input (Input::Endpoint) with dynamic output (Output::Dynamic) simultaneously, which could lead to conflicting configurations.

138-144: Helper functions are correctly implemented.

Both is_in_dynamic and is_out_dynamic functions use appropriate pattern matching to identify the respective dynamic configurations. The logic is sound and the functions are well-named.
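
A check of this shape might look like the sketch below. The enums are hypothetical reductions (`Output::Local` is invented for contrast), and the `validate` function and its error text are assumptions rather than the code in `main.rs`.

```rust
/// Hypothetical, reduced input modes.
#[derive(Debug)]
enum Input {
    Http,
    Endpoint(String),
}

/// Hypothetical, reduced output modes; `Local` is invented for this sketch.
#[derive(Debug)]
enum Output {
    Dynamic,
    Local,
}

fn is_in_dynamic(input: &Input) -> bool {
    matches!(input, Input::Endpoint(_))
}

fn is_out_dynamic(output: &Output) -> bool {
    matches!(output, Output::Dynamic)
}

/// Rejects the one combination where both sides are dynamic.
fn validate(input: &Input, output: &Output) -> Result<(), String> {
    if is_in_dynamic(input) && is_out_dynamic(output) {
        return Err("cannot use an endpoint for both input and output".to_string());
    }
    Ok(())
}

fn main() {
    let endpoint = Input::Endpoint("ns.comp.ep".to_string());
    println!("{:?}", validate(&Input::Http, &Output::Dynamic));
    println!("{:?}", validate(&endpoint, &Output::Local));
    println!("{:?}", validate(&endpoint, &Output::Dynamic));
}
```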

lib/llm/src/entrypoint/input/text.rs (2)

4-14: Import reorganization improves code organization.

The imports have been correctly updated to use crate-relative paths (crate::) and are well-organized with related imports grouped together.

20-37: Function refactoring simplifies parameter passing.

The removal of unused _flags and template parameters and the direct use of prepared_engine.request_template eliminates redundant parameter passing. The TODO comment appropriately identifies a potential future improvement to pass the entire prepared_engine to main_loop.

lib/llm/src/entrypoint/input/batch.rs (3)

4-22: Import reorganization follows consistent pattern.

The imports are properly reorganized with crate-local imports grouped before external dependencies, consistent with the pattern used in other refactored files.

67-73: Correct ownership handling for card consumption.

Using prepared_engine.card.take() is the correct approach here since the OpenAIPreprocessor::new() method likely takes ownership of the card. The has_tokenizer() method provides a clean way to check if preprocessing is available.
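
The ownership move that `Option::take()` performs can be shown in isolation. In this sketch `Card`, `PreparedEngine`, and `Preprocessor` are hypothetical stand-ins, and treating `has_tokenizer()` as "a card is present" is an assumption made only to keep the example short.

```rust
/// Hypothetical stand-in for the model deployment card.
#[derive(Debug, PartialEq)]
struct Card {
    model_name: String,
}

struct PreparedEngine {
    card: Option<Card>,
}

impl PreparedEngine {
    /// Assumption for this sketch: a tokenizer exists exactly when a card does.
    fn has_tokenizer(&self) -> bool {
        self.card.is_some()
    }
}

/// Hypothetical preprocessor that takes the card by value.
struct Preprocessor {
    card: Card,
}

impl Preprocessor {
    fn new(card: Card) -> Self {
        Preprocessor { card }
    }
}

/// Moves the card out of the engine (leaving `None` behind) and hands
/// ownership to the preprocessor, without cloning.
fn make_preprocessor(engine: &mut PreparedEngine) -> Option<Preprocessor> {
    if !engine.has_tokenizer() {
        return None;
    }
    engine.card.take().map(Preprocessor::new)
}

fn main() {
    let mut engine = PreparedEngine {
        card: Some(Card { model_name: "Qwen/Qwen3-0.6B".to_string() }),
    };
    let pre = make_preprocessor(&mut engine).expect("card was present");
    println!("preprocessor owns card for {}", pre.card.model_name);
    println!("engine still has a card: {}", engine.card.is_some());
}
```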

97-97: Template handling simplified through prepared engine.

Converting prepared_engine.request_template to Option<Arc<RequestTemplate>> correctly handles the optional template from the prepared engine and makes it shareable across async tasks.

lib/llm/src/entrypoint/input/http.rs (3)

6-21: Import reorganization and grouping improves readability.

The imports are well-organized with related crate imports grouped together and external dependencies clearly separated.

24-31: Configuration access centralized through engine_config.

The HTTP service configuration now correctly accesses port and request template directly from engine_config.local_model(), eliminating the need for separate parameters and centralizing configuration access.

33-46: Router configuration properly extracted from engine config.

The match arm correctly handles the EngineConfig::Dynamic(_) variant and extracts router configuration from engine_config.local_model().router_config(). The router mode and KV router config are properly passed to run_watcher.
lib/llm/src/entrypoint.rs (1)

52-52: Consider making local_model() public if needed by external modules.

The local_model() method is currently private. If this method needs to be accessed from outside the module (which seems likely given its utility), consider making it public.

    #!/bin/bash
    # Check if local_model() is used outside the entrypoint module
    rg -A 2 "\.local_model\(\)" --type rust

launch/dynamo-run/src/lib.rs (1)

21-82: Well-structured refactoring of the run function.

The refactoring significantly improves code organization by:

Using the builder pattern for LocalModel construction

Extracting engine creation logic into a dedicated function

Centralizing input handling through the dynamo_llm library

Properly handling subprocess cleanup with the extra future

This makes the code more maintainable and follows the single responsibility principle.
lib/llm/src/local_model.rs (6)

8-10: LGTM!

The new imports are appropriate for the builder pattern implementation and new model configuration features.

Also applies to: 18-18, 21-21

33-38: LGTM!

Good choice of default values with clear explanatory comments.

39-65: LGTM!

Well-structured builder pattern implementation with appropriate defaults.

217-265: LGTM!

Clean struct definition with appropriate accessor methods following Rust conventions.

335-343: LGTM!

Good approach using UUID for generating unique internal endpoints. The comment clearly explains the rationale.

109-112: Router config should not be optional if always required

The router_config field is an Option in the builder, but the build() method expects it to always be present. This design is inconsistent and could lead to runtime panics.

Either:

Make router_config a required parameter in the builder constructor, or

Provide a sensible default in the build() method instead of panicking
-                router_config: self
-                    .router_config
-                    .take()
-                    .expect("unreachable, RouterConfig missing"),
+                router_config: self
+                    .router_config
+                    .take()
+                    .unwrap_or_default(),
Also applies to: 150-155, 209-213
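
For comparison, the suggested alternative can be sketched on a toy builder. Everything here is a hypothetical reduction: the `RouterMode` variants, the single-field `RouterConfig`, and the synchronous `build()` are assumptions, and the sketch shows what the suggestion would mean, not what the PR adopted.

```rust
/// Hypothetical routing modes; the default variant is an assumption.
#[derive(Debug, Default, Clone, PartialEq)]
enum RouterMode {
    #[default]
    RoundRobin,
    Random,
}

/// Hypothetical, reduced router configuration.
#[derive(Debug, Default, Clone, PartialEq)]
struct RouterConfig {
    mode: RouterMode,
}

#[derive(Debug)]
struct LocalModel {
    router_config: RouterConfig,
}

#[derive(Default)]
struct LocalModelBuilder {
    router_config: Option<RouterConfig>,
}

impl LocalModelBuilder {
    fn router_config(&mut self, config: RouterConfig) -> &mut Self {
        self.router_config = Some(config);
        self
    }

    /// Falls back to `RouterConfig::default()` instead of panicking when the
    /// caller never set a router config.
    fn build(&mut self) -> LocalModel {
        LocalModel {
            router_config: self.router_config.take().unwrap_or_default(),
        }
    }
}

fn main() {
    let implicit = LocalModelBuilder::default().build();
    let explicit = LocalModelBuilder::default()
        .router_config(RouterConfig { mode: RouterMode::Random })
        .build();
    println!("{:?}\n{:?}", implicit, explicit);
}
```

The trade-off is the one the learnings below describe: a default keeps `build()` total, while `expect` fails fast if an invariant is broken.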
⛔ Skipped due to learnings
Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1285
File: lib/llm/src/kv_router/scoring.rs:58-63
Timestamp: 2025-05-30T06:38:09.630Z
Learning: In lib/llm/src/kv_router/scoring.rs, the user prefers to keep the panic behavior when calculating load_avg and variance with empty endpoints rather than adding guards for division by zero. They want the code to fail fast on this error condition.
Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1392
File: lib/llm/src/kv_router/scoring.rs:35-46
Timestamp: 2025-06-05T01:02:15.318Z
Learning: In lib/llm/src/kv_router/scoring.rs, PeaBrane prefers panic-based early failure over Result-based error handling for the worker_id() method to catch invalid data early during development.
Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1236
File: lib/llm/src/mocker/protocols.rs:85-112
Timestamp: 2025-06-16T20:02:54.935Z
Learning: When using derive_builder::Builder macro, the macro generates the builder struct and its methods, but does NOT generate a `builder()` method on the original struct. A manual `impl StructName { pub fn builder() -> StructNameBuilder { StructNameBuilder::default() } }` is required to provide the convenient `StructName::builder()` API pattern.
Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1285
File: lib/llm/src/kv_router/scheduler.rs:260-266
Timestamp: 2025-05-30T06:34:12.785Z
Learning: In the KV router scheduler code, PeaBrane prefers fail-fast behavior over silent failure handling. When accessing worker metrics data that could be out-of-bounds (like dp_rank indexing), explicit panics are preferred over graceful degradation with continue statements to ensure data integrity issues are caught early.
Learnt from: alec-flowers
PR: ai-dynamo/dynamo#1181
File: lib/llm/src/kv_router/publisher.rs:379-425
Timestamp: 2025-05-29T00:02:35.018Z
Learning: In lib/llm/src/kv_router/publisher.rs, the functions `create_stored_blocks` and `create_stored_block_from_parts` are correctly implemented and not problematic duplications of existing functionality elsewhere in the codebase.

lib/llm/src/entrypoint/input/common.rs

lib/llm/src/entrypoint/input.rs

launch/dynamo-run/src/flags.rs

lib/llm/src/local_model.rs

paulhendricks

Great work! Left a few little comments.

lib/llm/src/entrypoint.rs

lib/llm/src/local_model.rs

launch/dynamo-run/src/lib.rs


And fix bindings, update lock files.

They were usize because that's easiest in Rust, but naturally they are u32, so put in the extra work to do it right. The nice part is at the boundary (mistralrs, gguf, etc) they were already u32, suggesting this is indeed the correct type.
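
A checked narrowing from `usize` to `u32` at such a boundary could look like the following. This is a minimal sketch: `to_u32` is a hypothetical helper, and the field names are only borrowed from the commit title about `kv_cache_block_size` and `context_length`.

```rust
use std::convert::TryFrom;

/// Converts a `usize` to `u32`, reporting which field overflowed instead of
/// silently truncating the way an `as` cast would.
fn to_u32(value: usize, field: &str) -> Result<u32, String> {
    u32::try_from(value).map_err(|_| format!("{} = {} does not fit in u32", field, value))
}

fn main() {
    println!("{:?}", to_u32(16, "kv_cache_block_size"));
    println!("{:?}", to_u32(32_768, "context_length"));
    // On 64-bit targets this reports an error rather than wrapping.
    println!("{:?}", to_u32(usize::MAX, "context_length"));
}
```

Storing the fields as `u32` from the start removes the need for this conversion at every boundary that already speaks `u32`.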


grahamking requested review from a team, GuanLuo, PeaBrane, alec-flowers, biswapanda, jthomson04, kkranen, oandreeva-nv, paulhendricks, rmccorm4, ryanolson and tmonty12 as code owners June 27, 2025 21:07

pull-request-size bot added the size/XXL label Jun 27, 2025

copy-pr-bot bot temporarily deployed to GITLAB June 27, 2025 21:07 Inactive

grahamking changed the title ~~dynamo-run: Refactor to library~~ chore(dynamo-run): Refactor to library Jun 27, 2025

github-actions bot added the chore label Jun 27, 2025

grahamking marked this pull request as draft June 27, 2025 21:08

grahamking self-assigned this Jun 27, 2025

coderabbitai bot reviewed Jun 27, 2025


copy-pr-bot bot temporarily deployed to GITLAB June 27, 2025 21:13 Inactive

copy-pr-bot bot temporarily deployed to GITLAB June 27, 2025 21:45 Inactive

copy-pr-bot bot temporarily deployed to GITLAB June 27, 2025 21:46 Inactive

grahamking marked this pull request as ready for review June 27, 2025 22:13

grahamking force-pushed the gk-dr-split-1 branch from 9b7df16 to 03eb709 Compare June 30, 2025 13:30

copy-pr-bot bot temporarily deployed to GITLAB June 30, 2025 13:30 Inactive

copy-pr-bot bot temporarily deployed to GITLAB June 30, 2025 13:31 Inactive

paulhendricks approved these changes Jun 30, 2025

lib/llm/src/entrypoint.rs (resolved)

lib/llm/src/local_model.rs (resolved)

launch/dynamo-run/src/lib.rs (resolved)

copy-pr-bot bot temporarily deployed to GITLAB June 30, 2025 15:57 Inactive

grahamking force-pushed the gk-dr-split-1 branch from 479c38a to b69fd0d Compare June 30, 2025 17:58

copy-pr-bot bot temporarily deployed to GITLAB June 30, 2025 17:58 Inactive

copy-pr-bot bot temporarily deployed to GITLAB June 30, 2025 18:03 Inactive

grahamking force-pushed the gk-dr-split-1 branch from b69fd0d to 3a1ef41 Compare June 30, 2025 18:28

copy-pr-bot bot temporarily deployed to GITLAB June 30, 2025 18:28 Inactive

grahamking added 3 commits June 30, 2025 15:03

fix: Code Rabbit is wonderful

7d09546

And fix bindings, update lock files.

grahamking force-pushed the gk-dr-split-1 branch from 3a1ef41 to 6f7fd3e Compare June 30, 2025 19:03

copy-pr-bot bot temporarily deployed to GITLAB June 30, 2025 19:03 Inactive

grahamking force-pushed the gk-dr-split-1 branch from 6f7fd3e to ae407e1 Compare June 30, 2025 19:33

copy-pr-bot bot temporarily deployed to GITLAB June 30, 2025 19:33 Inactive

Fix types in Python bindings for kv_cache_block_size and context_length

0373980

grahamking force-pushed the gk-dr-split-1 branch from ae407e1 to 0373980 Compare June 30, 2025 20:29

copy-pr-bot bot temporarily deployed to GITLAB June 30, 2025 20:29 Inactive

copy-pr-bot bot temporarily deployed to GITLAB June 30, 2025 20:30 Inactive

grahamking merged commit 92f06b0 into main Jun 30, 2025
11 of 12 checks passed

grahamking deleted the gk-dr-split-1 branch June 30, 2025 21:06

alec-flowers mentioned this pull request Jul 1, 2025

fix: default to None initialization of routing config #1713

Merged

coderabbitai bot mentioned this pull request Jul 7, 2025

feat(python): Python bindings for the Dynamo CLI tools #1799

Merged

coderabbitai bot mentioned this pull request Jul 21, 2025

feat: updated devcontainer #2024

Closed

This was referenced Aug 6, 2025

chore: Remove service_name from ModelDeploymentCard #2349

Merged

fix(preprocessor): Populate model ID in PreprocessedRequest (for LoRA support) #2397

Merged

This was referenced Aug 18, 2025

feat(http): TLS support #2492

Merged

feat: KServe gRPC support #2638

Merged

coderabbitai bot mentioned this pull request Sep 2, 2025

refactor: Split ModelType to ModelInput for request and response type; ModelType for the supported workloads #2714

Merged
