fix(preprocessor): Populate model ID in PreprocessedRequest (for LoRA support) #2397
Conversation
Walkthrough
Propagates a model identifier and configuration through the request/preprocessing pipeline and the local model builder. Adds model to PreprocessedRequest, requires model() on OAIChatLikeRequest, implements it for the OAI templates, sets the model during preprocessing, and wires a new Flags.model_config into LocalModelBuilder.
Sequence Diagram(s)
sequenceDiagram
  participant Client
  participant OAIRequest as OAIChatLikeRequest
  participant Preprocessor
  participant Builder as PreprocessedRequestBuilder
  participant Engine
  Client->>OAIRequest: Submit request (includes model)
  OAIRequest->>Preprocessor: preprocess_request(request)
  Preprocessor->>Builder: builder.model(request.model())
  Preprocessor->>Builder: set token_ids, stops, annotations, ...
  Builder->>Engine: PreprocessedRequest { model, ... }

sequenceDiagram
  participant CLI as Flags
  participant LMB as LocalModelBuilder
  participant Runtime
  CLI->>LMB: model_name(...), model_config(flags.model_config)
  LMB->>Runtime: build()
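Read alongside the first diagram, the core preprocessor change is small: the model is copied from the request onto the builder before any input-type-specific handling. Below is a minimal sketch with simplified stand-in types (the real builder carries many more fields):

```rust
// Simplified stand-ins for the real types in lib/llm/src/preprocessor.rs.
struct Request {
    model: String,
}

#[derive(Default)]
struct PreprocessedRequestBuilder {
    model: Option<String>,
    token_ids: Option<Vec<u32>>,
}

impl PreprocessedRequestBuilder {
    fn model(&mut self, m: String) -> &mut Self {
        self.model = Some(m);
        self
    }
    fn token_ids(&mut self, t: Vec<u32>) -> &mut Self {
        self.token_ids = Some(t);
        self
    }
}

fn preprocess_request(request: &Request) -> PreprocessedRequestBuilder {
    let mut builder = PreprocessedRequestBuilder::default();
    // Set the model first, so both the token and text branches carry it.
    builder.model(request.model.clone());
    // ...tokenization, stop conditions, annotations, etc. would follow here...
    builder.token_ids(Vec::new());
    builder
}
```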
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 0
🧹 Nitpick comments (1)
launch/dynamo-run/src/lib.rs (1)
38-38: LGTM: Wire model_config into LocalModelBuilder
Passing flags.model_config.clone() keeps build configuration cohesive with the new model propagation. If ModelConfig is large, consider wrapping it in Arc to reduce cloning, but as an Option this is typically fine.
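If ModelConfig ever does get heavy, the Arc suggestion above could look roughly like the sketch below; the type and builder method names are placeholders modeled on the review comment, not the actual API:

```rust
use std::sync::Arc;

// Placeholder for the real configuration type.
struct ModelConfig;

struct Flags {
    // Wrapping in Arc makes the later clone a cheap pointer copy.
    model_config: Option<Arc<ModelConfig>>,
}

#[derive(Default)]
struct LocalModelBuilder {
    model_config: Option<Arc<ModelConfig>>,
}

impl LocalModelBuilder {
    fn model_config(&mut self, cfg: Option<Arc<ModelConfig>>) -> &mut Self {
        self.model_config = cfg;
        self
    }
}

fn wire(flags: &Flags, builder: &mut LocalModelBuilder) {
    // Same call shape as the reviewed line, but cloning an Arc instead of the value.
    builder.model_config(flags.model_config.clone());
}
```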
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
launch/dynamo-run/src/lib.rs (1 hunks)
lib/llm/src/preprocessor.rs (1 hunks)
lib/llm/src/preprocessor/prompt.rs (1 hunks)
lib/llm/src/preprocessor/prompt/template/oai.rs (2 hunks)
lib/llm/src/protocols/common/preprocessor.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-08-09T06:10:00.214Z
Learnt from: tzulingk
PR: ai-dynamo/dynamo#2389
File: components/metrics/src/main.rs:63-66
Timestamp: 2025-08-09T06:10:00.214Z
Learning: In components/metrics/src/main.rs, the Args struct's model_name field should be a required String (not Option<String>) because the metrics aggregator must be tied to a specific model when collecting metrics for a component/endpoint.
Applied to files:
lib/llm/src/preprocessor/prompt.rs
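As a minimal illustration of that learning (the clap derive and field layout are assumptions based on the description, not the actual components/metrics/src/main.rs):

```rust
use clap::Parser;

/// Sketch: the model name is a required String rather than Option<String>,
/// so the metrics aggregator is always tied to a concrete model.
#[derive(Parser, Debug)]
struct Args {
    /// Model whose component/endpoint metrics are collected.
    #[arg(long)]
    model_name: String,
}

fn main() {
    let args = Args::parse();
    println!("collecting metrics for {}", args.model_name);
}
```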
📚 Learning: 2025-06-24T20:59:35.725Z
Learnt from: ishandhanani
PR: ai-dynamo/dynamo#1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.
Applied to files:
lib/llm/src/preprocessor.rs
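A small sketch of the placement described above, with placeholder types: the match handles the two prompt input kinds, and sampling options are applied in the shared tail so both paths receive them:

```rust
// Placeholder types; the real code works on OpenAI-style requests and a
// derive_builder-generated PreprocessedRequestBuilder.
enum PromptInput {
    Tokens(Vec<u32>),
    Text(String),
}

#[derive(Default)]
struct Builder {
    token_ids: Vec<u32>,
    sampling_options: Option<String>,
}

fn preprocess(input: PromptInput, sampling_options: Option<String>) -> Builder {
    let mut b = Builder::default();
    match input {
        PromptInput::Tokens(t) => b.token_ids = t,
        // Stand-in for real tokenization of the text path.
        PromptInput::Text(s) => b.token_ids = s.bytes().map(u32::from).collect(),
    }
    // Common section after the match: applies to both Tokens and Text.
    b.sampling_options = sampling_options;
    b
}
```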
📚 Learning: 2025-06-16T20:02:54.935Z
Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1236
File: lib/llm/src/mocker/protocols.rs:85-112
Timestamp: 2025-06-16T20:02:54.935Z
Learning: When using derive_builder::Builder macro, the macro generates the builder struct and its methods, but does NOT generate a `builder()` method on the original struct. A manual `impl StructName { pub fn builder() -> StructNameBuilder { StructNameBuilder::default() } }` is required to provide the convenient `StructName::builder()` API pattern.
Applied to files:
lib/llm/src/protocols/common/preprocessor.rs
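A minimal illustration of that learning with a placeholder struct: derive_builder generates FooBuilder and its setters, but the Foo::builder() entry point has to be written by hand:

```rust
use derive_builder::Builder;

#[derive(Debug, Clone, Builder)]
pub struct Foo {
    pub model: String,
}

// derive_builder does NOT generate this; without it there is no Foo::builder().
impl Foo {
    pub fn builder() -> FooBuilder {
        FooBuilder::default()
    }
}

fn main() {
    let foo = Foo::builder()
        .model("example-model".to_string())
        .build()
        .expect("model is set");
    println!("{foo:?}");
}
```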
🔇 Additional comments (6)
lib/llm/src/preprocessor/prompt.rs (1)
61-61: Make the model accessor ergonomic; future ergonomic improvements are optional
All current implementors of OAIChatLikeRequest (in lib/llm/src/preprocessor/prompt/template/oai.rs) define fn model(&self) -> String { … }, so the API is consistent and correct. Returning a String is fine given its usage today. If this method ever becomes a performance hotspot, a non-breaking follow-up could return &str or Cow<'_, str>, at the cost of adding lifetimes to the trait.
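For reference, that follow-up could look like the sketch below; this is not the current trait, just an assumed shape showing how Cow<'_, str> would avoid the clone:

```rust
use std::borrow::Cow;

// Hypothetical future shape of the accessor; the current trait returns String.
trait ModelAccessor {
    fn model(&self) -> Cow<'_, str>;
}

struct ExampleRequest {
    model: String, // assumed field holding the model identifier
}

impl ModelAccessor for ExampleRequest {
    fn model(&self) -> Cow<'_, str> {
        Cow::Borrowed(&self.model) // no allocation on the common path
    }
}
```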
lib/llm/src/preprocessor.rs (2)
156-157: Propagate model early — good placement
Setting builder.model(request.model()) before branching on input type ensures the model is always present. This aligns with the PR objective and avoids accidental omissions for the token/text paths.
156-157: All PreprocessedRequest builders include .model(...)
I searched for every PreprocessedRequest::builder() and PreprocessedRequestBuilder::default() invocation and confirmed:
- The only PreprocessedRequest::builder() call lives in lib/llm/src/preprocessor.rs and immediately invokes builder.model(request.model()) (line 156).
- The other builder site at line 308 also calls builder.model(request.inner.model.clone()).
- No other construction paths of PreprocessedRequest were found without a .model(…) call.
Any missing .model(...) would fail to compile, so all required paths are covered. Consider adding unit tests to lock this in (a sketch follows the list):
• For chat/completions, assert preprocess_request(...).model == request.inner.model.
• For token/text inputs (both branches), assert model is populated.
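A hedged sketch of such a test; the real preprocessing path needs a tokenizer and model card, so the types and helper below are placeholders that only illustrate the assertion being asked for:

```rust
#[cfg(test)]
mod tests {
    // Placeholder types standing in for the crate's real request/response types.
    struct FakeRequest {
        model: String,
    }
    struct PreprocessedRequest {
        model: String,
    }

    // Mirrors the real code path: the model is copied onto the builder before
    // any branching on input type.
    fn preprocess_request(req: &FakeRequest) -> PreprocessedRequest {
        PreprocessedRequest { model: req.model.clone() }
    }

    #[test]
    fn model_is_propagated() {
        let req = FakeRequest { model: "base-model-lora-a".to_string() };
        let pre = preprocess_request(&req);
        assert_eq!(pre.model, req.model);
    }
}
```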
lib/llm/src/protocols/common/preprocessor.rs (1)
14-16: Required model on PreprocessedRequest — good and consistent
Adding pub model: String enforces that every backend-bound request carries the model identifier. This aligns with prior learning that model fields should be required Strings to avoid ambiguity.
Note: since derive_builder won’t supply a default for this field, all call sites must set it (compile-time enforced). Good choice.
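Sketched below is roughly what a required model field on a derive_builder struct looks like; the field set and derives are assumptions, only the required String is the point being illustrated:

```rust
use derive_builder::Builder;

// Illustrative only; the real PreprocessedRequest has many more fields.
#[derive(Debug, Clone, Builder)]
pub struct PreprocessedRequest {
    /// Model identifier the backend should serve; there is no default, so
    /// every builder call site must set it or build() returns an error.
    pub model: String,
    /// Tokenized prompt.
    pub token_ids: Vec<u32>,
}
```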
lib/llm/src/preprocessor/prompt/template/oai.rs (2)
28-31: LGTM: model() for NvCreateChatCompletionRequest
Simple, correct pass-through via self.inner.model.clone(). Matches the trait expectation.
69-72: LGTM: model() for NvCreateCompletionRequest
Same correct pass-through, consistent with the chat implementation.
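For context, the two pass-throughs likely look similar to the sketch below; the wrapper/inner layout follows the comments above, but the exact type paths and trait contents are assumptions:

```rust
// Illustrative shapes only: the real NvCreate* types wrap OpenAI-style
// request types, and `inner.model` is the field referenced above.
struct InnerChatRequest { model: String }
struct NvCreateChatCompletionRequest { inner: InnerChatRequest }

struct InnerCompletionRequest { model: String }
struct NvCreateCompletionRequest { inner: InnerCompletionRequest }

trait OAIChatLikeRequest {
    fn model(&self) -> String;
}

impl OAIChatLikeRequest for NvCreateChatCompletionRequest {
    fn model(&self) -> String {
        self.inner.model.clone()
    }
}

impl OAIChatLikeRequest for NvCreateCompletionRequest {
    fn model(&self) -> String {
        self.inner.model.clone()
    }
}
```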
The backend worker will need to know which model it is serving, at least for LoRA. Add it.
f1d386c to 4c1bc0a (Compare)
LGTM. Will the LoRA model name variants be visible under /v1/models? Will backends call register_llm multiple times with each model name variant for each LoRA?
Yes, they will appear as multiple models. From the front-end's point of view, they are different models served by the same endpoint. I'm thinking of implementing it to allow , or Dynamo will turn that into multiple model registrations in etcd and multiple model cards. There will be an optimization so that the model cards share the NATS URL to the tokenizer, because the tokenizer can be quite big and we don't want to upload it to NATS multiple times.