
Composability with sparse and quantization compressors #948

Open

rahul-tuli wants to merge 7 commits into main from composability-v2
Conversation

rahul-tuli (Collaborator) commented Dec 2, 2024

This PR accomplishes the following (a usage sketch follows the list):

  • Increases the sparsity threshold to 50%
  • Allows sparse + quantized compression/decompression on the llm-compressor side
  • Adds a test for sparse + quantized compression/decompression
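
As a rough illustration of the composability path this enables, here is a minimal sketch of pruning to 2:4 sparsity, quantizing, and saving in compressed form. The modifier names and arguments are assumptions based on the llm-compressor `oneshot` API of this era, not code from this PR, and the model/dataset names are placeholders:

```python
from llmcompressor.modifiers.obcq import SparseGPTModifier
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.transformers import oneshot

recipe = [
    # prune Linear weights to a 2:4 sparsity pattern (50% sparsity)
    SparseGPTModifier(sparsity=0.5, mask_structure="2:4", targets=["Linear"]),
    # then quantize the now-sparse weights to 4 bits
    GPTQModifier(scheme="W4A16", targets="Linear", ignore=["lm_head"]),
]

oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # placeholder model
    dataset="open_platypus",
    recipe=recipe,
    output_dir="./sparse-quantized-model",
    max_seq_length=512,
    num_calibration_samples=64,
)
```

With both a sparsity and a quantization compressor recorded in the saved config, such a checkpoint exercises the compression-decompression path this PR adds a test for.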


github-actions bot commented Dec 2, 2024

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

@@ -37,7 +36,8 @@ def infer_quantization_format(
     if save_compressed:
         weight_args, input_args = _get_unique_quant_args(model)
         is_24_structure = (
-            sparsity_config and sparsity_config.sparsity_structure == "2:4"
+            SparsityStructure(sparsity_structure).value
+            == SparsityStructure.TWO_FOUR.value
Collaborator commented:

It seems like we've only enabled this to save using the marlin-24 compressor if the model follows 2:4 sparsity?
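
For context, a hypothetical sketch of how that gating might look inside infer_quantization_format; the format names are real compressed-tensors CompressionFormat members, but the branching here is illustrative, not the PR's exact logic:

```python
from compressed_tensors import CompressionFormat


def _choose_format_sketch(weight_args, is_24_structure: bool) -> CompressionFormat:
    # marlin-24 kernels require a 2:4 sparsity pattern plus low-bit weights
    if is_24_structure and weight_args.num_bits in (4, 8):
        return CompressionFormat.marlin_24
    # otherwise fall back to the generic packed-integer format
    return CompressionFormat.pack_quantized
```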

Collaborator commented:

nit: can we not compare the enums directly without .value?
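
For reference, a sketch of the suggested direct comparison; SparsityStructure is stubbed here as a plain string-valued Enum (the real one lives in compressed-tensors):

```python
from enum import Enum


class SparsityStructure(Enum):
    TWO_FOUR = "2:4"
    UNSTRUCTURED = "unstructured"


sparsity_structure = "2:4"

# PR as written: compare the unwrapped .value strings
is_24 = SparsityStructure(sparsity_structure).value == SparsityStructure.TWO_FOUR.value

# the nit: Enum members compare by identity, so == works on them directly
is_24 = SparsityStructure(sparsity_structure) == SparsityStructure.TWO_FOUR
```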

rahul-tuli (Author) replied:

@kylesayrs accepted

rahul-tuli (Author) replied:

@dsikka let's sync offline

rahul-tuli force-pushed the composability-v2 branch 2 times, most recently from 9249158 to afc0b5f on December 3, 2024 at 06:16
rahul-tuli changed the title from "[ DRAFT ] Composability with sparse and quantization compressors" to "Composability with sparse and quantization compressors" on Dec 3, 2024
horheynm previously approved these changes on Dec 3, 2024

horheynm (Collaborator) commented Dec 3, 2024

Verified that decompression works for a sparse and quantized model.
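
A sketch of what that verification could look like, assuming the SparseAutoModelForCausalLM entry point llm-compressor exposed at the time; the checkpoint path is a placeholder:

```python
from llmcompressor.transformers import SparseAutoModelForCausalLM

# from_pretrained runs the sparsity and quantization decompressors named in
# the checkpoint's config, yielding dense, dequantized weights in memory
model = SparseAutoModelForCausalLM.from_pretrained(
    "./sparse-quantized-model",  # placeholder path to a compressed checkpoint
    device_map="auto",
)
```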

rahul-tuli force-pushed the composability-v2 branch 2 times, most recently from e8373c8 to 8c3b515 on December 23, 2024 at 18:36
Increase Sparsity Threshold

Signed-off-by: Rahul Tuli <rahul@neuralmagic.com>