- Set gpu-memory-utilization to 0.85
- Enable chunked-prefill
- Disable prefix-caching (chunked-prefill performs better when used alone)
- Validated via grid search (Mission 14k)
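
As a rough illustration of the settings listed above, the sketch below builds a launch command. It assumes the container runs a vLLM-style server started with `vllm serve`; the commit does not name the server. The function name `build_serve_command` and the model name `my-org/my-model` are hypothetical placeholders. Prefix caching is left out by omitting `--enable-prefix-caching`. Whether that is enough to disable it depends on the defaults of the installed version.

```python
import shlex


def build_serve_command(model: str, gpu_memory_utilization: float = 0.85) -> list[str]:
    """Return an argv list reflecting the settings in this commit.

    Assumption: a vLLM-style CLI. The flag names mirror the bullet points
    above and are not taken from the actual deployment files.
    """
    if not 0.0 < gpu_memory_utilization <= 1.0:
        raise ValueError("gpu_memory_utilization must be in (0, 1]")
    return [
        "vllm", "serve", model,
        # Fraction of GPU memory the server may claim.
        "--gpu-memory-utilization", str(gpu_memory_utilization),
        # Split long prefills into chunks.
        "--enable-chunked-prefill",
        # --enable-prefix-caching is deliberately not passed here; check the
        # installed version's default before relying on omission alone.
    ]


def render(model: str) -> str:
    """Render the command as a single shell-safe string."""
    return shlex.join(build_serve_command(model))
```

For example, `render("my-org/my-model")` returns one line that can be pasted into a container entrypoint or compose file.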
Performance gain: +222% vs baseline (x3.22 vs x1.59)
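
The percentage and the multiplier in the line above are two views of the same number: a speedup of m times baseline is a gain of (m - 1) * 100 percent. A minimal sketch of that conversion follows. The helper name `multiplier_to_percent_gain` is hypothetical, and what the x1.59 figure was measured on is not stated in this commit.

```python
def multiplier_to_percent_gain(multiplier: float) -> float:
    """Convert a speedup multiplier over baseline into a percent gain."""
    return (multiplier - 1.0) * 100.0


# Rounded to whole percent to avoid floating-point noise.
new_gain = round(multiplier_to_percent_gain(3.22))    # x3.22 quoted above
other_gain = round(multiplier_to_percent_gain(1.59))  # x1.59 quoted above
```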
Container healthy in 324s
Configuration stable and production-ready
Refs: Mission 15