vllm-project / llm-compressor Public

Fork 306
Star 2.3k

Pull requests: vllm-project/llm-compressor

45 Open 944 Closed

Pull requests list

[BugFix] Fix Llama4 Calibration

#2101 opened Dec 6, 2025 by dsikka • Draft

add kv quant example

#2100 opened Dec 5, 2025 by mengniwang95

[Utils] Delete deprecated utilities

#2098 opened Dec 5, 2025 by kylesayrs • Draft

[Utils] Deprecate unused utils

Labels: ready (when a PR is ready for review)

#2097 opened Dec 5, 2025 by kylesayrs

Remove replace_module_for_calibration

Labels: ready

#2095 opened Dec 4, 2025 by dsikka

Linearize gpt_oss model and add separate example to quantize it to w4a8

#2091 opened Dec 3, 2025 by isharif168

feat: add importance-aware mixed-precision quantization

#2083 opened Dec 2, 2025 by wangwenmingaa

[test] add e2e test for qwen3 moe w4a16

Labels: ready

#2071 opened Nov 25, 2025 by HDCharles • Draft

[AWQ] use match_modules_set and fix logic

Labels: awq (for any issue / PR related to AWQ support), ready

#2070 opened Nov 25, 2025 by HDCharles

[Performance] Batched calibration

Labels: ready

#2054 opened Nov 20, 2025 by kylesayrs

[Misc] Remove is_moe_model

Labels: ready

#2053 opened Nov 20, 2025 by kylesayrs

Testing Clean-up

#2045 opened Nov 18, 2025 by dsikka • Draft

Modernize transformers module with type hints and generic types

#2034 opened Nov 14, 2025 by sugatmahanti

Support wInt4aFp8 for moe

#2027 opened Nov 12, 2025 by Wangzheee

[Sequential Onloading] Support onloading and offloading frozen dataclasses

#2016 opened Nov 10, 2025 by kylesayrs

[TypeHint] Fix format_calibration_data type hint

#2012 opened Nov 10, 2025 by kylesayrs

Implement propagate_error argument

Labels: ready

#2008 opened Nov 10, 2025 by kylesayrs

Granite4 FP8 Block Quantization

#2001 opened Nov 6, 2025 by krishnateja95

[model_free_ptq] NVFP4A16

Labels: ready

#1988 opened Nov 3, 2025 by kylesayrs

[Kimi Linear] FP8 Example

#1986 opened Oct 31, 2025 by dsikka • Draft

[AWQ] Allow users to disable quantization during AWQ

#1973 opened Oct 28, 2025 by brian-dellabetta • Draft

[AWQ] Generalize AWQ quantization

#1961 opened Oct 22, 2025 by kylesayrs • Draft

2 of 3 tasks

[Oneshot] Add validation for empty dataset and enhance oneshot function parameters

#1957 opened Oct 21, 2025 by ArkaSanka

[Autowrapper] Trace vision tower for better offloading

#1948 opened Oct 18, 2025 by kylesayrs • Draft

[Observers] Change MSE global scale objective function

#1935 opened Oct 14, 2025 by kylesayrs • Draft
