Disable Default Bitmask Compression #60

Merged
merged 2 commits into from
Aug 6, 2024
Conversation


@Satrat Satrat commented Aug 6, 2024

SUMMARY:
This issue came up a few days ago when a user tried to run a sparsified model in vLLM: #45. By default, we compress sparse models into the "sparse-bitmask" compression format on save; however, this format isn't yet supported in vLLM. I'm updating the save logic to disable automatic sparse compression for now; we can re-enable it once the format is supported in vLLM.
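To illustrate the intent of the change (this is a hypothetical sketch, not the actual library code — `resolve_sparsity_format` is an invented helper name): the save path no longer infers a sparse compression format automatically, and instead records "dense" unless the caller explicitly requests a format.

```python
# Hypothetical sketch of the new default behavior; the function name and
# signature are illustrative, not part of llm-compressor's API.
def resolve_sparsity_format(requested_format=None):
    """Return the sparsity format to record in the saved config.

    Previously, sparse models defaulted to "sparse-bitmask" on save.
    After this change, the default is "dense" unless a format is
    explicitly requested by the caller.
    """
    if requested_format is not None:
        return requested_format
    return "dense"
```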

TEST PLAN:
Manual test to confirm sparse models are saved as dense by default:

```python
import torch
from llmcompressor.modifiers.obcq import SparseGPTModifier
from llmcompressor.transformers import SparseAutoModelForCausalLM, oneshot

recipe = SparseGPTModifier(sparsity=0.5)

model_stub = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
model = SparseAutoModelForCausalLM.from_pretrained(
    model_stub, torch_dtype=torch.float16, device_map="auto"
)

dataset = "ultrachat-200k"
output_dir = "./test_output_sparse"

splits = {"calibration": "train_gen[:5%]"}
max_seq_length = 512
pad_to_max_length = False
num_calibration_samples = 32

oneshot(
    model=model,
    dataset=dataset,
    recipe=recipe,
    output_dir=output_dir,
    splits=splits,
    max_seq_length=max_seq_length,
    pad_to_max_length=pad_to_max_length,
    num_calibration_samples=num_calibration_samples,
)
```

Output config.json shows the expected dense format:

```json
  "compression_config": {
    "sparsity_config": {
      "format": "dense",
      "global_sparsity": 0.44059220580610386,
      "registry_requires_subclass": false,
      "sparsity_structure": "0:0"
    }
  },
```
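For completeness, the check above can also be done programmatically. This is a minimal sketch: it parses the config fragment shown above (inlined here so the snippet is self-contained; in practice you would read `config.json` from the `output_dir` used in the test script) and asserts the sparsity format is "dense".

```python
import json

# The config.json excerpt shown above, inlined for a self-contained check;
# in practice, load it from ./test_output_sparse/config.json instead.
config = json.loads("""
{
  "compression_config": {
    "sparsity_config": {
      "format": "dense",
      "global_sparsity": 0.44059220580610386,
      "registry_requires_subclass": false,
      "sparsity_structure": "0:0"
    }
  }
}
""")

fmt = config["compression_config"]["sparsity_config"]["format"]
assert fmt == "dense", f"unexpected sparsity format: {fmt}"
```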

@Satrat Satrat merged commit d10b79e into main Aug 6, 2024
8 of 12 checks passed
@Satrat Satrat deleted the sa/dense_sparsity branch August 6, 2024 16:37
markmc pushed a commit to markmc/llm-compressor that referenced this pull request Nov 13, 2024
* fix group size min max tracking by adding tensor ids

* propagate change to  in base

* bug

* lint

* add back reduce_dims

* fix

* fix

* comment

---------

Co-authored-by: George Ohashi <george@neuralmagic.com>