**Issue**: The Eagle draft model inherited the target model's MxFP4 quantization, causing a "Mxfp4 linear layer is not implemented" error
**Root Cause**: After the upstream merge, stricter quantization validation exposed a bug in which the Eagle draft model incorrectly inherited the target model's quantization config instead of using its own
**Solution**: Create a separate quantization config for the draft model using a clone-and-override pattern:
- Use `copy.deepcopy(vllm_config)` to clone the target config
- Override `quant_config` with the draft model's quantization settings
- Add attribute checks with graceful fallbacks for a missing `quant_config`
- Handle cases where the draft model or `VllmConfig` lacks quantization attributes
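
The clone-and-override steps above can be sketched roughly as follows. This is a minimal illustration, not the actual change: `StubConfig`, `resolve_draft_quant`, and `make_draft_config` are hypothetical names standing in for `VllmConfig` and the real loader code, and the only attribute assumed is `quant_config`.

```python
import copy
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Optional


@dataclass
class StubConfig:
    """Hypothetical stand-in for VllmConfig; only `quant_config` matters here."""
    model_name: str
    quant_config: Optional[Any] = None


def resolve_draft_quant(draft_settings: Any) -> Optional[Any]:
    """Return the draft model's own quantization settings, or None if absent."""
    return getattr(draft_settings, "quant_config", None)


def make_draft_config(target_config: Any, draft_settings: Any) -> Any:
    """Clone the target config, then override quantization for the draft model."""
    # Deep copy so later changes never leak back into the target config.
    draft_config = copy.deepcopy(target_config)
    # Fallback: if the config has no `quant_config` attribute, leave the clone as-is.
    if hasattr(draft_config, "quant_config"):
        draft_config.quant_config = resolve_draft_quant(draft_settings)
    return draft_config


# Target is quantized; the draft model declares no quantization of its own.
target = StubConfig(model_name="target", quant_config={"method": "mxfp4"})
draft_settings = SimpleNamespace(model_name="eagle-draft")
draft = make_draft_config(target, draft_settings)

# A config object that lacks the attribute entirely is cloned unchanged.
bare = make_draft_config(SimpleNamespace(model_name="bare"), draft_settings)
```

In this sketch the draft config ends up unquantized while the target keeps its MxFP4 settings, which is the separation the fix is meant to enforce.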