@paulhendricks
Member

@paulhendricks paulhendricks commented Aug 25, 2025

Overview:

This PR replaces all usage of std::sync::Mutex within ModelManager with parking_lot::Mutex.

Details:

This approach is used in favor of the explicit error handling in #2678.

Motivation

  • Performance: parking_lot::Mutex offers significantly faster lock/unlock performance with lower contention overhead compared to std::sync::Mutex.
  • Ergonomics: Unlike std::sync::Mutex, parking_lot::Mutex::lock() returns the guard directly rather than a LockResult, so no .unwrap() is needed when acquiring a lock. This removes the possibility of poisoned-lock panics and simplifies code paths.
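
As a hedged illustration of the ergonomics point, the sketch below uses only the standard library to show why std::sync::Mutex::lock() returns a Result: if a thread panics while holding the guard, the mutex is marked poisoned and later lock() calls return Err. The function name is hypothetical and not taken from ModelManager.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: returns true if a std mutex reports poisoning after
/// a thread panics while holding its guard.
fn std_mutex_is_poisoned_after_panic() -> bool {
    let shared = Arc::new(Mutex::new(0_u32));

    let worker = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            let _guard = shared.lock().unwrap();
            panic!("simulated failure while holding the lock");
        })
    };
    // join() returns Err because the worker panicked; that is expected here.
    let _ = worker.join();

    // The next acquisition yields Err(PoisonError), which is the case that
    // `.lock().unwrap()` would turn into a second panic.
    let poisoned = shared.lock().is_err();
    poisoned
}
```

With parking_lot::Mutex there is no poisoned state, so the equivalent lock() call hands back the guard itself and the `.unwrap()` disappears from every call site.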

Where should the reviewer start?

  • lib/llm/src/discovery/model_manager.rs

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Relates to:

Summary by CodeRabbit

  • Refactor
    • Replaced internal locking with a more efficient, non-poisoning mutex to improve concurrency in model management.
    • Simplified lock handling paths for faster, more reliable access under load.
  • Chores
    • Added a new lightweight synchronization dependency to support improved locking behavior.

No changes to public APIs or user workflows. Users may notice smoother performance and improved stability during concurrent operations.

@paulhendricks paulhendricks requested a review from a team as a code owner August 25, 2025 19:10
@copy-pr-bot

copy-pr-bot bot commented Aug 25, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai bot commented Aug 25, 2025

Walkthrough

Replaced std::sync::Mutex with parking_lot::Mutex in ModelManager for entries and kv_choosers, updating lock calls accordingly. Added parking_lot = 0.12.4 to lib/llm/Cargo.toml. No new features; synchronization primitive and imports adjusted. No method signatures changed; internal field types updated.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Dependency update: lib/llm/Cargo.toml | Added direct dependency parking_lot = "0.12.4". |
| Synchronization refactor (ModelManager): lib/llm/src/discovery/model_manager.rs | Switched std::sync::Mutex to parking_lot::Mutex for entries and kv_choosers; updated imports; simplified .lock() usage (no unwrap); updated internal access patterns accordingly. Field types for the mutex-wrapped maps reflect the new mutex. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

In burrows of code I hop with delight,
Swapping old locks for new, snug and tight.
A parking_lot gleams, threads queue just so,
No poison, no panic—onward we go! 🐇🔒
Tiny change, tidy lanes—performance in tow.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
lib/llm/Cargo.toml (1)

67-67: Confirmed parking_lot v0.12.4 — no known advisories; optional workspace‐management & deadlock detection

Verified that parking_lot 0.12.4 is the latest published version on crates.io and there are no OSV advisories against it.

Optional refinements to consider:

  • File lib/llm/Cargo.toml (line 67):
    • Move parking_lot = "0.12.4" into the workspace’s [workspace.dependencies] block, then reference it here as
    parking_lot = { workspace = true }
    for consistent versioning across crates.
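  • In this crate’s Cargo.toml, expose parking_lot’s deadlock_detection feature behind an opt-in crate feature (the crate feature name here is illustrative), e.g.,
    [features]
    deadlock_detection = ["parking_lot/deadlock_detection"]
    and enable it only for development builds. Cargo profile sections such as [profile.dev.package.parking_lot] cannot set features, so the feature has to be selected this way or on the command line. This catches locking issues during development without impacting release‐build performance.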
lib/llm/src/discovery/model_manager.rs (2)

4-10: Mutex swap to parking_lot applied correctly; consider consistency with RwLock

Good move to parking_lot::Mutex and the simplified .lock() paths. For consistency and to avoid poisoning panics still present on std::sync::RwLock, consider migrating the three RwLocks to parking_lot::RwLock in a follow-up. This would also remove the remaining .unwrap() calls on .read()/.write().


250-257: Remove extra String clone in get_model_tool_call_parser

clone() on the Option<String> followed by to_string() performs two allocations. Use an as_ref().cloned() chain to clone only once.

Apply this diff:

-        self.entries
-            .lock()
-            .values()
-            .find(|entry| entry.name == model)
-            .and_then(|entry| entry.runtime_config.as_ref())
-            .and_then(|config| config.tool_call_parser.clone())
-            .map(|parser| parser.to_string())
+        self.entries
+            .lock()
+            .values()
+            .find(|entry| entry.name == model)
+            .and_then(|entry| entry.runtime_config.as_ref())
+            .and_then(|config| config.tool_call_parser.as_ref())
+            .cloned()
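
A minimal, self-contained sketch of the suggested chain follows. The struct definitions, the helper `sample_entries`, and the parser name "hermes" are hypothetical stand-ins, and the sketch assumes, as the comment above does, that tool_call_parser is an Option<String>; the real ModelManager types may differ.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real entry and config types.
struct RuntimeConfig {
    tool_call_parser: Option<String>,
}

struct ModelEntry {
    name: String,
    runtime_config: Option<RuntimeConfig>,
}

/// Looks up the parser for `model`, borrowing all the way down and cloning
/// the String exactly once at the end of the chain.
fn tool_call_parser(entries: &HashMap<String, ModelEntry>, model: &str) -> Option<String> {
    entries
        .values()
        .find(|entry| entry.name == model)
        .and_then(|entry| entry.runtime_config.as_ref())
        .and_then(|config| config.tool_call_parser.as_ref())
        .cloned()
}

/// Hypothetical sample data used only to exercise the lookup.
fn sample_entries() -> HashMap<String, ModelEntry> {
    let mut entries = HashMap::new();
    entries.insert(
        "key-a".to_string(),
        ModelEntry {
            name: "model-a".to_string(),
            runtime_config: Some(RuntimeConfig {
                tool_call_parser: Some("hermes".to_string()),
            }),
        },
    );
    entries.insert(
        "key-b".to_string(),
        ModelEntry {
            name: "model-b".to_string(),
            runtime_config: None,
        },
    );
    entries
}
```

Because every intermediate step yields a reference, a lookup that misses (unknown model, no runtime config, or no parser set) returns None without allocating.
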
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a24221d and e2049ec.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • lib/llm/Cargo.toml (1 hunks)
  • lib/llm/src/discovery/model_manager.rs (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: pre-merge-rust (lib/bindings/python)
  • GitHub Check: pre-merge-rust (lib/runtime/examples)
  • GitHub Check: pre-merge-rust (.)
🔇 Additional comments (5)
lib/llm/src/discovery/model_manager.rs (5)

41-44: Behavioral change: no lock poisoning with parking_lot::Mutex

With parking_lot::Mutex, locks won’t be poisoned on panic. If a panic can occur while mutating entries/kv_choosers, state invariants won’t be guarded by poisoning. If you rely on that signal, document the trade-off or add targeted validation on next lock acquisition.
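
One way the "targeted validation on next lock acquisition" idea could look is sketched below. This is an assumption-laden illustration, not the project's code: `Ledger`, its invariant, and the helper names are hypothetical, and the non-poisoning behavior is emulated with the standard library by unwrapping PoisonError via into_inner().

```rust
use std::sync::{Arc, Mutex, MutexGuard};
use std::thread;

/// Hypothetical state with an invariant: `total` equals the sum of `items`.
struct Ledger {
    items: Vec<u32>,
    total: u32,
}

impl Ledger {
    fn is_consistent(&self) -> bool {
        self.items.iter().sum::<u32>() == self.total
    }
}

/// Emulates a non-poisoning lock with std only by discarding the poison flag.
fn lock_unpoisoned(m: &Mutex<Ledger>) -> MutexGuard<'_, Ledger> {
    m.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Acquires the lock, then checks the invariant before handing out the guard.
fn lock_validated(m: &Mutex<Ledger>) -> Result<MutexGuard<'_, Ledger>, &'static str> {
    let guard = lock_unpoisoned(m);
    if guard.is_consistent() {
        Ok(guard)
    } else {
        Err("ledger invariant violated")
    }
}

/// A worker panics halfway through a two-step update; the next validated
/// acquisition notices the torn state even though poisoning is ignored.
fn detects_torn_update() -> bool {
    let ledger = Arc::new(Mutex::new(Ledger { items: vec![1, 2], total: 3 }));

    let worker = {
        let ledger = Arc::clone(&ledger);
        thread::spawn(move || {
            let mut guard = lock_unpoisoned(&ledger);
            guard.items.push(4);
            panic!("simulated failure before `total` is updated");
        })
    };
    let _ = worker.join();

    let detected = lock_validated(&ledger).is_err();
    detected
}
```

Whether such a check is worth its cost depends on how expensive the invariant is to verify and on whether a panic inside the critical section is realistic for the code in question.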


63-65: LGTM — concise, lock held for minimal scope

get_model_entries clones while holding the mutex only for the iteration. Reasonable for the expected access pattern.


173-174: LGTM — lock usage updated cleanly

The switch to parking_lot::Mutex on save/remove/get paths is applied correctly and keeps the critical section tight.

Also applies to: 178-179, 205-207


4-44: No std::sync::Mutex usage detected; migration complete
The scan confirms there are no lingering std::sync::Mutex references and the use parking_lot::Mutex; import is present in lib/llm/src/discovery/model_manager.rs. Everything looks correctly migrated.


181-203: Prevent duplicate KV routers under concurrent calls

I noticed a race condition: two or more tasks can call kv_chooser_for(…) at the same time, both observe that no router exists for model_name, and each invoke create_kv_chooser, resulting in multiple KvRouter instances (and etcd keys) for the same model. The current insertion in create_kv_chooser happens unconditionally after creation, so it doesn’t guard against races.

Minimal double-checked locking fix in create_kv_chooser:

--- a/lib/llm/src/discovery/model_manager.rs
+++ b/lib/llm/src/discovery/model_manager.rs
@@ pub async fn create_kv_chooser(
-        let new_kv_chooser = Arc::new(chooser);
-        self.kv_choosers
-            .lock()
-            .insert(model_name.to_string(), new_kv_chooser.clone());
-        Ok(new_kv_chooser)
+        let new_kv_chooser = Arc::new(chooser);
+        // Double-check under lock to avoid racing inserts
+        let mut map = self.kv_choosers.lock();
+        if let Some(existing) = map.get(model_name) {
+            // Another task won the race – use that instance
+            return Ok(existing.clone());
+        }
+        map.insert(model_name.to_string(), new_kv_chooser.clone());
+        Ok(new_kv_chooser)
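
The check-then-insert pattern in the diff above can be sketched in isolation as follows. `Router`, `Registry`, and `race` are hypothetical stand-ins for illustration, and std::sync::Mutex is used so the sketch needs no external crates; with parking_lot the `.unwrap()` after lock() would simply be dropped.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the router type.
struct Router {
    id: usize,
}

struct Registry {
    routers: Mutex<HashMap<String, Arc<Router>>>,
}

impl Registry {
    fn new() -> Self {
        Registry { routers: Mutex::new(HashMap::new()) }
    }

    /// Registers `candidate` unless another caller got there first, and
    /// returns whichever instance ended up in the map.
    fn insert_if_absent(&self, model_name: &str, candidate: Arc<Router>) -> Arc<Router> {
        let mut map = self.routers.lock().unwrap();
        if let Some(existing) = map.get(model_name) {
            // Another caller won the race; hand back its instance.
            return Arc::clone(existing);
        }
        map.insert(model_name.to_string(), Arc::clone(&candidate));
        candidate
    }
}

/// Spawns `workers` threads that all try to register the same model and
/// reports (number of map entries, whether every caller got the same router).
fn race(workers: usize) -> (usize, bool) {
    let registry = Arc::new(Registry::new());
    let handles: Vec<_> = (0..workers)
        .map(|id| {
            let registry = Arc::clone(&registry);
            thread::spawn(move || registry.insert_if_absent("model-a", Arc::new(Router { id })))
        })
        .collect();
    let winners: Vec<Arc<Router>> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    let all_same = winners.iter().all(|w| w.id == winners[0].id);
    let entries = registry.routers.lock().unwrap().len();
    (entries, all_same)
}
```

Note that this deduplicates only the map entry. Each racing caller still constructs its own candidate before taking the lock, so any side effects of construction would still occur for the callers that lose the race.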

Longer-term, for stronger once-only initialization, consider switching from Mutex<HashMap<…>> to DashMap<String, OnceCell<Arc<KvRouter>>> and using get_or_try_init for each key.
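
As a rough sketch of the once-only idea, the example below swaps in standard-library pieces: a Mutex<HashMap<…>> of per-key std::sync::OnceLock cells with the synchronous, infallible get_or_init, rather than DashMap with a fallible get_or_try_init. `Router`, `Registry`, and the `creations` counter are hypothetical names used only to make the behavior observable.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, OnceLock};
use std::thread;

/// Hypothetical stand-in for the router type.
struct Router {
    model: String,
}

#[derive(Default)]
struct Registry {
    cells: Mutex<HashMap<String, Arc<OnceLock<Arc<Router>>>>>,
    /// Counts how many times the construction closure actually ran.
    creations: AtomicUsize,
}

impl Registry {
    fn get_or_create(&self, model_name: &str) -> Arc<Router> {
        // Hold the map lock only long enough to fetch or insert the per-key cell.
        let mut map = self.cells.lock().unwrap();
        let cell: Arc<OnceLock<Arc<Router>>> =
            map.entry(model_name.to_string()).or_default().clone();
        drop(map);

        // OnceLock runs the closure at most once per cell; concurrent callers
        // block until the value is ready and then share it.
        let router = Arc::clone(cell.get_or_init(|| {
            self.creations.fetch_add(1, Ordering::SeqCst);
            Arc::new(Router { model: model_name.to_string() })
        }));
        router
    }
}

/// Races `workers` threads on one key and reports
/// (number of constructions, whether all callers share one instance).
fn race_once(workers: usize) -> (usize, bool) {
    let registry = Arc::new(Registry::default());
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let registry = Arc::clone(&registry);
            thread::spawn(move || registry.get_or_create("model-a"))
        })
        .collect();
    let routers: Vec<Arc<Router>> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    let all_same = routers
        .iter()
        .all(|r| Arc::ptr_eq(r, &routers[0]) && r.model == "model-a");
    (registry.creations.load(Ordering::SeqCst), all_same)
}
```

Unlike a check under the lock after construction, this shape prevents the construction itself from running more than once per key. An async or fallible constructor would need a different cell type than the one assumed here.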

To detect any duplicates in a live system, grep your logs for multiple kv_create (or similar identifiers) for the same model_name within a short window.

@paulhendricks paulhendricks merged commit 8e4d81f into main Aug 25, 2025
11 checks passed
@paulhendricks paulhendricks deleted the phendricks/model-manager-parking-lot-mutex branch August 25, 2025 19:56
hhzhang16 pushed a commit that referenced this pull request Aug 27, 2025
…ng_lot::Mutex` (#2696)

Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
nv-anants pushed a commit that referenced this pull request Aug 28, 2025
jasonqinzhou pushed a commit that referenced this pull request Aug 30, 2025
…ng_lot::Mutex` (#2696)

Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
KrishnanPrash pushed a commit that referenced this pull request Sep 2, 2025
…ng_lot::Mutex` (#2696)

Signed-off-by: Krishnan Prashanth <kprashanth@nvidia.com>
nnshah1 pushed a commit that referenced this pull request Sep 8, 2025
…ng_lot::Mutex` (#2696)

Signed-off-by: nnshah1 <neelays@nvidia.com>