Commit fcea47f

Fix Bug #5: Add debug logging for smoothing check
Add diagnostic logging to verify the draft_mix_lambda_max value and whether smoothing will execute. This will help diagnose whether smoothing is running (smoothing prevents q from becoming exactly 1.0 in corner cases).

Expected log output:
[SMOOTH_DEBUG] lambda_max from config: 0.02, will run smoothing: True

If we see 'will run smoothing: False', smoothing is not being applied and q can still collapse to 1.0 in ultracold regimes.
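To make the two possible log lines concrete, here is a minimal standalone sketch of the lookup the commit logs. The `smoothing_debug_line` helper and the `SimpleNamespace` configs are hypothetical stand-ins for illustration; the real `opt_config` type is not shown in this commit.

```python
import sys
from types import SimpleNamespace


def smoothing_debug_line(opt_config) -> str:
    """Build the diagnostic line that the commit prints to stderr."""
    # Same lookup as in the diff: a missing attribute falls back to 0.0,
    # which disables smoothing.
    lam = float(getattr(opt_config, "draft_mix_lambda_max", 0.0))
    return (f"[SMOOTH_DEBUG] lambda_max from config: {lam}, "
            f"will run smoothing: {lam > 0.0}")


# Hypothetical configs, for illustration only.
configured = SimpleNamespace(draft_mix_lambda_max=0.02)
unconfigured = SimpleNamespace()  # attribute never set

print(smoothing_debug_line(configured), file=sys.stderr, flush=True)
print(smoothing_debug_line(unconfigured), file=sys.stderr, flush=True)
```

Under these assumptions, a config that never sets `draft_mix_lambda_max` silently takes the 0.0 default and produces the 'will run smoothing: False' line, which is the case the debug output is meant to expose.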
1 parent a38f70d commit fcea47f

1 file changed: +2 -0 lines changed
vllm/v1/spec_decode/eagle.py

Lines changed: 2 additions & 0 deletions
@@ -284,6 +284,8 @@ def _sample_draft_tokens(
 
         # --- tiny smoothing over kept set (prevents q==1.0 in ultracold corners) ---
         lam = float(getattr(self.opt_config, "draft_mix_lambda_max", 0.0))
+        print(f"[SMOOTH_DEBUG] lambda_max from config: {lam}, will run smoothing: {lam > 0.0}",
+              file=sys.stderr, flush=True)
         if lam > 0.0:
             K = keep.sum(dim=-1, keepdim=True).clamp_min(1)
             uniform = keep.to(x.dtype) / K
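
The hunk ends before the line that actually applies the smoothing, so the following is only a sketch of what such a step might look like. It uses plain Python lists instead of tensors, and the convex mix `(1 - lam) * q + lam * uniform` is an assumption, not something shown in this diff. `smooth_over_kept` is a hypothetical name.

```python
def smooth_over_kept(q, keep, lam):
    """Mix q with a uniform distribution over the kept tokens (sketch)."""
    k = max(sum(keep), 1)                   # mirrors keep.sum(...).clamp_min(1)
    uniform = [float(m) / k for m in keep]  # mirrors keep.to(x.dtype) / K
    # Assumed mixing rule (not in the diff): convex combination with weight lam.
    return [(1.0 - lam) * qi + lam * ui for qi, ui in zip(q, uniform)]


# A collapsed draft distribution: all mass on one of three kept tokens.
q = [1.0, 0.0, 0.0, 0.0]
keep = [True, True, True, False]

smoothed = smooth_over_kept(q, keep, lam=0.02)
print(smoothed)
```

In this sketch the top probability drops below 1.0, the total mass stays at 1, and the token outside the kept set stays at zero. One property of the assumed rule worth noting: if only a single token is kept, the uniform distribution is itself one-hot, so mixing leaves q at 1.0 regardless of `lam`.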
