forked from huggingface/transformers
[Temporary] Add compressed-tensors HFQuantizer implementation #101
Open
bfineran wants to merge 936 commits into upstream-a564d10af from compressed-tensors-quantizer
Conversation
Satrat reviewed Jun 11, 2024
horheynm reviewed Jun 13, 2024
tests/quantization/compressed_tensor/test_compressed_tensors.py
* support-qwen2-vl * tidy * tidy * tidy * tidy * tidy * tidy * tidy * hyphen->underscore * make style * add-flash2-tipd * delete-tokenize=False * remove-image_processor-in-init-file * add-qwen2_vl-in-MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES * format-doct * support-Qwen2VLVisionConfig * remove-standardize_cache_format * fix-letter-varaibles * remove-torch-in-image-processor * remove-useless-docstring * fix-one-letter-varaible-name * change-block-name * default-quick-gelu-in-vision * remove-useless-doc * use-preimplemented-flash-forward * fix-doc * fix-image-processing-doc * fix-apply-rotary-embed * fix-flash-attn-sliding-window * refactor * remove-default_template * remove-reorder_cache * simple-get-rope_deltas * update-prepare_inputs_for_generation * update-attention-mask * update-rotary_seq_len * remove-state * kv_seq_length * remove-warning * _supports_static_cache * remove-legacy-cache * refactor * fix-replace * mrope-section-doc * code-quality * code-quality * polish-doc * fix-image-processing-test * update readme * Update qwen2_vl.md * fix-test * Update qwen2_vl.md * nit * processor-kwargs * hard-code-norm_layer * code-quality * discard-pixel-values-in-gen * fix-inconsistent-error-msg * unify-image-video * hidden_act * add-docstring * vision-encode-as-PreTrainedModel * pixel-to-target-dtype * update doc and low memoryvit * format * format * channel-foramt * fix vit_flashatt * format * inherit-Qwen2VLPreTrainedModel * simplify * format-test * remove-one-line-func-in-image-processing * avoid-one-line-reshape * simplify-rotary_seq_len * avoid-single-letter-variable * no-for-loop-sdpa * avoid-single-letter-variable * remove-one-line-reshape * remove-one-line-reshape * remove-no-rope-in-vit-logic * default-mrope * add-copied-from * more-docs-for-mrope * polish-doc * comment-and-link * polish-doc * single-letter-variables * simplify-image-processing * video->images * kv_seq_len-update * vision-rope-on-the-fly * vision-eager-attention * change-processor-order --------- 
Co-authored-by: baishuai <baishuai.bs@alibaba-inc.com> Co-authored-by: ShuaiBai623 <43326198+ShuaiBai623@users.noreply.github.com>
…gface#32404) * Add changes for uroman package to handle non-Roman characters * Update docs for uroman changes * Modifying error message to warning, for backward compatibility * Update instruction for user to install uroman * Update docs for uroman python version dependency and backward compatibility * Update warning message for python version compatibility with uroman * Refine docs
…atible with DeepSpeed (huggingface#33105) Fixed pydantic required version in dockerfiles.
* fix documentation * update config
…gingface#32850) * Fixed failing CodeGenTokenizationTest::test_truncation. * [run_slow] Codegen * [run_slow] codegen
…e#32079) * fix: multilingual midel convert to tflite get wrong token * fix: modify test_force_tokens_logits_processor the checking value as scores.dtype.min --------- Co-authored-by: kent.sc.hung <kent.sc.hung@benq.com> Co-authored-by: Aya <[kent831217@gmail.com]>
Disable scheduled daily CI temporarily Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
…sues due to large image size (huggingface#33123) * fix param not being passed in tested; add exceptions * better source of model name * Update utils/create_dummy_models.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
…ojects/hybrid_clip (huggingface#33137) Bump torch in /examples/research_projects/jax-projects/hybrid_clip Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md) - [Commits](pytorch/pytorch@v1.13.1...v2.2.0) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Log additional test metrics with the CometCallback. Also follow the same metric naming convention as other callbacks * Merge 2 subsequent if-statements * Trigger Build --------- Co-authored-by: Aliaksandr Kuzmik <alexander.kuzmik99@gmail.com>
* [docs] add quick usage snippet to Whisper. * Apply suggestions from review. * 💉 Fix the device for pipeline.
…#32115) * update ExportableState callbacks state before saving trainer_state on save_checkpoint * run make fixup and fix format * manage multiple stateful callbacks of same class
* fix Idefics2VisionConfig type annotation * Update modeling_idefics2.py * Update modeling_idefics2.py add ignore copy * Update modeling_idefics2.py * Update modeling_idefics2.py
* Add a fix for the case when tokenizers are passed as a string * Support image processors and feature extractors as well * Reverting load_feature_extractor and load_image_processor * Add test * Test is torch-only * Add tests for preprocessors and feature extractors and move test * Extremely experimental fix * Revert that change, wrong branch! * Typo! * Split tests
…33131) * fix redundant checkpointing in example scripts * Update examples/pytorch/image-classification/run_image_classification_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/translation/run_translation_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/token-classification/run_ner_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/text-classification/run_glue_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/summarization/run_summarization_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/semantic-segmentation/run_semantic_segmentation_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/language-modeling/run_mlm_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/language-modeling/run_fim_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/language-modeling/run_clm_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/image-pretraining/run_mim_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/multiple-choice/run_swag_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/question-answering/run_qa_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/object-detection/run_object_detection_no_trainer.py Co-authored-by: Marc Sun 
<57196510+SunMarc@users.noreply.github.com> * Update examples/pytorch/question-answering/run_qa_beam_search_no_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* docs: ko: conversations.md * feat: hand-crafted translate docs * fix: modify typo after Grammar Check * Update docs/source/ko/conversations.md 감사합니다 Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * fix: accept suggestions about anchor and spacing * Update docs/source/ko/conversations.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/conversations.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/conversations.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/conversations.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/conversations.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/conversations.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/conversations.md Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> * Update 
docs/source/ko/conversations.md Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> * fix: anchor 'what happened inside piepeline?' be removed question mark * fix: translate the comments in the code block --------- Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
A very small change to one of the parameters: the second argument of np.random.randint is an exclusive upper bound, so it is not included in the possible options. The upper range should therefore be 2, so that the classification labels include some 1s as well.
…er_only` (huggingface#33602) almost zero is not zero
Remove model tests
* Add sdpa for BioGpt * Updates * Add the docs * [run_slow] biogpt * Use the copy mechanism to ensure consistency * [run_slow] biogpt
fix missing tests Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
…ggingface#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring
…gface#33507) * fix: handle padding in contrastive search for decoder-only models * fix: handle padding in contrastive search for encoder-decoder models * tests: move padding contrastive test to test_util, add t5 test * fix: handle if model_kwargs["decoder_attention_mask"] is None * refactor: improve padding input contrastive search generation tests * chore: _ranking_fast to use LongTensor for cosine_matrix_mask
* fix * fix * fix * fix * skip * skip more --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* update * re-enable daily CI --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* fix qwen2vl float16 inference bug * [run-slow] qwen2_vl
Co-authored-by: litianjian <litianjian@bytedance.com>
* enable low-precision pipeline * fix parameter for ASR * reformat * fix asr bug * fix bug for zero-shot * add dtype check * rm useless comments * add np.float16 check * Update src/transformers/pipelines/image_classification.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/pipelines/token_classification.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * fix comments * fix asr check * make fixup * No more need for is_torch_available() --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Matt <rocketknight1@gmail.com>
* first commit * drop tokenizer * drop tokenizer * drop tokenizer * drop convert * granite * drop tokenization test * mup * fix * reformat * reformat * reformat * fix docs * stop checking for checkpoint * update support * attention multiplier * update model * tiny drop * saibo drop * skip test * fix test * fix test * drop * drop useless imports * update docs * drop flash function * copied from * drop pretraining tp * drop pretraining tp * drop pretraining tp * drop unused import * drop code path * change name * softmax scale * head dim * drop legacy cache * rename params * cleanup * fix copies * comments * add back legacy cache * multipliers * multipliers * multipliers * text fix * fix copies * merge * multipliers * attention multiplier * drop unused imports * add granitemoe * add decoration * remove moe from sequenceclassification * fix test * fix * fix * fix * move rope? * merge * drop bias * drop bias * Update src/transformers/models/granite/configuration_granite.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * Update src/transformers/models/granite/modeling_granite.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * fix * fix * fix * drop * drop * fix * fix * cleanup * cleanup * fix * fix granite tests * fp32 test * fix * drop jitter * fix * rename * rename * fix config * add gen test --------- Co-authored-by: Yikang Shen <yikang.shn@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update pixtral example checkpoint * Fix typo
* add sdpa to dinov2 * fixup * add dinov2 to sdpa doc * update doc order * [run-slow] dinov2 * common to eager * [run-slow] dinov2 * update attn implementation in common * update test_modeling_dinov2 to have mask_ration, num_masks and mask_length similar to vit * [run-slow] dinov2 --------- Co-authored-by: Avishai Elmakies <avishai.elma@cs.huji.ac.il>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
clean up Unpack imports
* fallback to eager if output attentions. * fix copies
* handle dependency errors in check_imports * change log level to warning
huggingface#33550) * add back self.max_position_embeddings = config.max_position_embeddings * fix-copies
…huggingface#33613) fix llavaqwen2 model conversion
* Add optional kwargs and uniformize udop * cleanup Unpack * nit Udop
* enable cpu bnb path * fix style * fix code style * fix 4 bit path * Update src/transformers/utils/import_utils.py Co-authored-by: Aarni Koskela <akx@iki.fi> * add multi backend refactor tests * fix style * tweak 4bit quantizer + fix corresponding tests * tweak 8bit quantizer + *try* fixing corresponding tests * fix dequant bnb 8bit * account for Intel CPU in variability of expected outputs * enable cpu and xpu device map * further tweaks to account for Intel CPU * fix autocast to work with both cpu + cuda * fix comments * fix comments * switch to testing_utils.torch_device * allow for xpu in multi-gpu tests * fix tests 4bit for CPU NF4 * fix bug with is_torch_xpu_available needing to be called as func * avoid issue where test reports attr err due to other failure * fix formatting * fix typo from resolving of merge conflict * polish based on last PR review Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * fix CI * Update src/transformers/integrations/integration_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/integrations/integration_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix error log * fix error msg * add \n in error log * make quality * rm bnb cuda restriction in doc * cpu model don't need dispatch * fix doc * fix style * check cuda avaliable in testing * fix tests * Update docs/source/en/model_doc/chameleon.md Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: Aarni Koskela <akx@iki.fi> * Update tests/quantization/bnb/test_4bit.py Co-authored-by: Aarni Koskela <akx@iki.fi> * Update tests/quantization/bnb/test_4bit.py Co-authored-by: Aarni Koskela <akx@iki.fi> * fix doc * fix check multibackends * fix import sort * remove check torch in bnb * docs: update bitsandbytes references with multi-backend info * docs: fix small mistakes in bnb paragraph * run 
formatting * reveret bnb check * move bnb multi-backend check to import_utils * Update src/transformers/utils/import_utils.py Co-authored-by: Aarni Koskela <akx@iki.fi> * fix bnb check * minor fix for bnb * check lib first * fix code style * Revert "run formatting" This reverts commit ac108c6. * fix format * give warning when bnb version is low and no cuda found] * fix device assignment check to be multi-device capable * address akx feedback on get_avlbl_dev fn * revert partially, as we don't want the function that public, as docs would be too much (enforced) --------- Co-authored-by: Aarni Koskela <akx@iki.fi> Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
…e#33652) * Fix error string after refactoring into get_chat_template * Take suggestion from CR Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* uniformize git processor * update doctring
For internal review.
Use on the compressed-tensors branch: neuralmagic/compressed-tensors#79
For the tests to pass, the quantized model (the base model with the quantization config applied) needs to carry scale and zero-point (zp) values.
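As a rough illustration of what "scale and zp" means here — this is not the compressed-tensors API, just a minimal sketch of symmetric per-tensor int8 quantization, where these two values are exactly what a quantized checkpoint must store alongside the integer weights:

```python
import numpy as np

def quantize_symmetric_int8(weight: np.ndarray):
    """Toy symmetric per-tensor int8 quantization (illustrative only).

    Returns the quantized weights plus the scale and zero-point ("zp")
    values a quantized model is expected to carry for each weight tensor.
    """
    qmax = 127
    scale = np.abs(weight).max() / qmax   # one float scale per tensor
    zp = np.int8(0)                       # symmetric scheme -> zero-point is 0
    q = np.clip(np.round(weight / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale, zp

def dequantize(q: np.ndarray, scale: float, zp: np.int8) -> np.ndarray:
    # Reconstruct a float approximation of the original weights.
    return (q.astype(np.float32) - np.float32(zp)) * scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((4, 4)).astype(np.float32)
    q, scale, zp = quantize_symmetric_int8(w)
    w_hat = dequantize(q, scale, zp)
    # Round-trip error is bounded by half a quantization step.
    assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```

Without the scale (and zero-point for asymmetric schemes), the integer weights cannot be dequantized, which is why the tests require both to be present on the quantized model.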