Composability with sparse and quantization compressors #948

rahul-tuli · 2024-12-02T22:16:44Z

This PR accomplishes the following:

Increases the sparsity threshold to 50%
Allows sparse + quantized compression and decompression on the llm-compressor side
Adds a test for sparse + quantized compression and decompression
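
For illustration, a minimal sketch of how a 50% sparsity threshold could gate the choice of a sparse compressor. The names `SPARSITY_THRESHOLD`, `global_sparsity`, and `should_use_sparse_compressor` are hypothetical and are not taken from llm-compressor. The assumption that the threshold is inclusive is also mine.

```python
from typing import Sequence

# Hypothetical constant: the PR description only says the threshold becomes 50%.
SPARSITY_THRESHOLD = 0.5


def global_sparsity(weights: Sequence[Sequence[float]]) -> float:
    """Return the fraction of exactly-zero entries in a 2-D weight matrix."""
    total = sum(len(row) for row in weights)
    if total == 0:
        return 0.0
    zeros = sum(1 for row in weights for w in row if w == 0)
    return zeros / total


def should_use_sparse_compressor(
    weights: Sequence[Sequence[float]],
    threshold: float = SPARSITY_THRESHOLD,
) -> bool:
    # Assumption: sparse compression applies when sparsity meets or exceeds
    # the threshold; the real comparison in the library may differ.
    return global_sparsity(weights) >= threshold


if __name__ == "__main__":
    half_sparse = [[0.0, 1.0, 0.0, 2.0], [0.0, 0.0, 3.0, 4.0]]
    mostly_dense = [[1.0, 2.0, 0.0, 3.0]]
    print(should_use_sparse_compressor(half_sparse))
    print(should_use_sparse_compressor(mostly_dense))
```

Under this sketch, a model whose weights are less than half zeros would be saved without a sparse compressor.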

github-actions · 2024-12-02T22:16:55Z

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

dsikka · 2024-12-02T22:43:23Z

src/llmcompressor/transformers/compression/quantization_format.py

@@ -37,7 +36,8 @@ def infer_quantization_format(
    if save_compressed:
        weight_args, input_args = _get_unique_quant_args(model)
        is_24_structure = (
-            sparsity_config and sparsity_config.sparsity_structure == "2:4"
+            SparsityStructure(sparsity_structure).value
+            == SparsityStructure.TWO_FOUR.value


It seems like we've only enabled this to save using the marlin-24 compressor if the model follows 2:4 sparsity?

kylesayrs · Dec 3, 2024

nit: can we not compare the enums directly without .value?

rahul-tuli · Dec 3, 2024

@kylesayrs accepted

rahul-tuli · Dec 3, 2024

@dsikka let's sync offline
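
The enum nit above can be sketched as follows. The `SparsityStructure` class here is a hypothetical stand-in defined only for this example. The real enum used by the PR may have different members or extra handling (for instance for `None`), so treat this as an assumption and not as the library's definition.

```python
from enum import Enum


class SparsityStructure(Enum):
    # Hypothetical stand-in for the enum referenced in the diff.
    TWO_FOUR = "2:4"
    UNSTRUCTURED = "unstructured"


def is_24_structure(sparsity_structure) -> bool:
    """Compare enum members directly instead of comparing their .value."""
    # Enum(value) looks a member up by value; passing an existing member
    # returns that same member, so both strings and members are accepted.
    return SparsityStructure(sparsity_structure) == SparsityStructure.TWO_FOUR


if __name__ == "__main__":
    print(is_24_structure("2:4"))
    print(is_24_structure(SparsityStructure.UNSTRUCTURED))
```

Enum members are singletons, so comparing members directly is equivalent to comparing their values and reads more cleanly. Note that this stand-in raises `ValueError` for an unknown value.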

horheynm · 2024-12-03T16:42:47Z

Verified that decompression works for a sparse and quantized model.
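
As a toy illustration of the kind of round-trip check described here, the sketch below quantizes a 2:4-sparse weight row, compresses it with a bitmask, then decompresses and dequantizes it. All function names are hypothetical, the ordering (quantize first, then sparse-compress) is an assumption, and none of this is the PR's actual test or the library's compressor code.

```python
from typing import List, Tuple


def quantize(values: List[float], scale: float) -> List[int]:
    """Symmetric int8 quantization: round to the nearest step, then clamp."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def dequantize(q: List[int], scale: float) -> List[float]:
    return [x * scale for x in q]


def bitmask_compress(q: List[int]) -> Tuple[List[bool], List[int]]:
    """Store one flag per position plus only the nonzero values."""
    mask = [x != 0 for x in q]
    nonzero = [x for x in q if x != 0]
    return mask, nonzero


def bitmask_decompress(mask: List[bool], nonzero: List[int]) -> List[int]:
    """Rebuild the dense row by filling zeros where the mask is False."""
    values = iter(nonzero)
    return [next(values) if keep else 0 for keep in mask]


if __name__ == "__main__":
    # Two nonzeros in every group of four, i.e. a 2:4 sparsity pattern.
    weights = [0.0, 0.5, 0.0, -1.0, 0.25, 0.0, 0.0, 0.75]
    scale = 0.25

    q = quantize(weights, scale)
    mask, nonzero = bitmask_compress(q)
    restored = bitmask_decompress(mask, nonzero)

    assert restored == q
    assert dequantize(restored, scale) == weights
    print("round trip ok")
```

The chosen weights are exact multiples of the scale, so the round trip is lossless in this example. With arbitrary weights, only the quantized values would be recovered exactly.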

dsikka reviewed Dec 2, 2024

rahul-tuli force-pushed the composability-v2 branch 2 times, most recently from 9249158 to afc0b5f, on December 3, 2024 06:16

rahul-tuli changed the title from "[ DRAFT ] Composability with sparse and quantization compressors" to "Composability with sparse and quantization compressors" on Dec 3, 2024

horheynm previously approved these changes Dec 3, 2024

rahul-tuli dismissed horheynm’s stale review via 480247c December 20, 2024 16:15

rahul-tuli force-pushed the composability-v2 branch 2 times, most recently from e8373c8 to 8c3b515, on December 23, 2024 18:36

rahul-tuli added 7 commits December 23, 2024 18:36

Enable Sparse24 quantization for Weight + Activation quantization

fb9c975

Increase Sparsity Threshold

Signed-off-by: Rahul Tuli <rahul@neuralmagic.com>

Add composability test

5b994c0

Signed-off-by: Rahul Tuli <rahul@neuralmagic.com>

Review comments from @kylesayrs compare enum directly

03488d4

Signed-off-by: Rahul Tuli <rahul@neuralmagic.com>

Bitmask test

5eef2d3

Signed-off-by: Rahul Tuli <rahul@neuralmagic.com>

Enable sparse24bytemask compressor

face5e2

Signed-off-by: Rahul Tuli <rahul@neuralmagic.com>

Update: SparseBitMaskCompressor

ccd6c39

Signed-off-by: Rahul Tuli <rahul@neuralmagic.com>

Tests

4043b65

Signed-off-by: Rahul Tuli <rahul@neuralmagic.com>

rahul-tuli force-pushed the composability-v2 branch from 8c3b515 to 4043b65 on December 23, 2024 18:37
