
Integrate quanto commands to optimum-cli #244

Merged (4 commits) Jul 17, 2024
Conversation

dacorvo
Collaborator

@dacorvo commented Jul 17, 2024

What does this PR do?

This adds a subpackage module that registers quanto commands with the optimum CLI.

For now, the subpackage contains only a simple quantize command.

$ optimum-cli -h
usage: optimum-cli

positional arguments:
  {export,env,quanto}
    export             Export PyTorch and TensorFlow models to several formats.
    env                Get information about the environment used.
    quanto             Hugging Face models quantization tools

optional arguments:
  -h, --help           show this help message and exit
$ optimum-cli quanto -h
usage: optimum-cli quanto [-h] {quantize} ...

positional arguments:
  {quantize}
    quantize  Quantize Hugging Face models.

optional arguments:
  -h, --help  show this help message and exit
$ optimum-cli quanto quantize -h
usage: optimum-cli quanto quantize [-h] -m MODEL [--weights {int2,int4,int8,float8}] [--revision REVISION] [--trust_remote_code] [--library {transformers}] [--task TASK]
                                   [--torch_dtype {auto,fp16,bf16}] [--device DEVICE]
                                   output

optional arguments:
  -h, --help            show this help message and exit

Required arguments:
  output                The path to save the quantized model.
  -m MODEL, --model MODEL
                        Hugging Face Hub model id or path to a local model.
  --weights {int2,int4,int8,float8}
                        The data type to quantize the model weights to.

Optional arguments:
  --revision REVISION   The Hugging Face model revision.
  --trust_remote_code   Trust remote code when loading the model.
  --library {transformers}
                        The Hugging Face library to use to load the model.
  --task TASK           The model task (useful for models supporting multiple tasks).
  --torch_dtype {auto,fp16,bf16}
                        The torch dtype to use when loading the model weights.
  --device DEVICE       The device to use when loading the model.
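To make the command tree above concrete, here is a minimal, self-contained argparse sketch that mirrors the `optimum-cli quanto quantize` interface shown in the help output. This is purely illustrative: the actual PR registers the command through optimum's subpackage mechanism rather than building a standalone parser, and defaults shown here (e.g. `--device cpu`) are assumptions, not taken from the PR.

```python
# Illustrative sketch only: a standalone argparse tree mirroring the
# `optimum-cli quanto quantize` interface from the help output above.
# The real command is registered via optimum's subpackage mechanism.
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="optimum-cli")
    commands = parser.add_subparsers(dest="command")

    # `quanto` group, as registered by the subpackage
    quanto = commands.add_parser(
        "quanto", help="Hugging Face models quantization tools"
    )
    quanto_commands = quanto.add_subparsers(dest="subcommand")

    # `quantize` subcommand with the arguments listed in the help output
    quantize = quanto_commands.add_parser(
        "quantize", help="Quantize Hugging Face models."
    )
    quantize.add_argument("output", help="The path to save the quantized model.")
    quantize.add_argument(
        "-m", "--model", required=True,
        help="Hugging Face Hub model id or path to a local model.",
    )
    quantize.add_argument(
        "--weights", choices=["int2", "int4", "int8", "float8"],
        help="The data type to quantize the model weights to.",
    )
    quantize.add_argument("--revision", help="The Hugging Face model revision.")
    quantize.add_argument(
        "--trust_remote_code", action="store_true",
        help="Trust remote code when loading the model.",
    )
    quantize.add_argument(
        "--library", choices=["transformers"],
        help="The Hugging Face library to use to load the model.",
    )
    quantize.add_argument(
        "--task", help="The model task (useful for models supporting multiple tasks)."
    )
    quantize.add_argument(
        "--torch_dtype", choices=["auto", "fp16", "bf16"], default="auto",
        help="The torch dtype to use when loading the model weights.",
    )
    quantize.add_argument(
        "--device", default="cpu",  # default is an assumption for this sketch
        help="The device to use when loading the model.",
    )
    return parser


if __name__ == "__main__":
    # Parse a sample command line (the model id is just an example)
    args = build_parser().parse_args(
        ["quanto", "quantize", "-m", "facebook/opt-125m", "--weights", "int8", "./quantized"]
    )
    print(args.command, args.subcommand, args.model, args.weights, args.output)
```

From the shell, a quantization run using the new command might then look like `optimum-cli quanto quantize -m facebook/opt-125m --weights int8 ./quantized` (the model id is only an example).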

@dacorvo force-pushed the optimum-subpackage branch 2 times, most recently from 225bc1e to 8fabcc2 on July 17, 2024 14:23
@dacorvo merged commit 8064266 into main on Jul 17, 2024
12 checks passed
@dacorvo deleted the optimum-subpackage branch on July 17, 2024 14:50