Support Flux quantization #850
base: main
Conversation
@@ -0,0 +1,3 @@
diffusers
Don’t we need to add these requirements to our repo’s requirements file? If not, we should provide an `auto-round[diffusion]` installation extra instead, right? Please check with suyue/xuhao.
If those deps are not strictly required by auto_round, we can raise an error when the user hits that code path and ask them to install the packages.
Please remind users that this is an experimental feature and has only been validated on a limited set of models, e.g. xxx
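The lazy-check pattern suggested above can be sketched as follows. This is a hypothetical helper, not auto-round's actual code; the `require_optional` name, the `diffusion` extra, and the wording of the message are assumptions based on this thread.

```python
import importlib.util


def require_optional(pkg: str, extra: str) -> None:
    """Raise a helpful error if an optional dependency is missing.

    Hypothetical sketch of the pattern discussed in this thread: instead of
    hard-pinning optional deps in requirements, check for them lazily and
    point the user at the matching pip extra.
    """
    if importlib.util.find_spec(pkg) is None:
        raise ImportError(
            f"This feature requires the optional `{pkg}` package. "
            f"Install it with `pip install auto-round[{extra}]`. "
            "Note: diffusion-model quantization is experimental and has "
            "only been validated on a limited set of models."
        )
```

A caller would invoke `require_optional("diffusers", "diffusion")` at the top of the diffusion code path, so users who never touch it pay no dependency cost.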
--prompt_file "captions_source.tsv" @n1ck-guo @yiliu30 Since there are so many args in main.py, we need a better way to present them in `auto-round -h`.
--prompt_file and --metrics are for evaluation. It is hard to wrap prompt_file because there is no standard dataset on Hugging Face, and other repos usually ask users to prepare the dataset themselves: https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/examples/diffusers/README.md#data-format
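Since users prepare the prompt TSV themselves, loading it is a few lines. A minimal sketch, assuming the file has a header row and a `caption` column (the column name is an assumption; the actual layout of `captions_source.tsv` is not specified in this thread):

```python
import csv


def load_prompts(tsv_path: str, column: str = "caption") -> list[str]:
    """Read evaluation prompts from a tab-separated file.

    Assumes a header row naming the columns; the default column name
    `caption` is a guess for illustration, not a documented format.
    """
    with open(tsv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t")
        # Skip rows where the prompt column is empty or absent.
        return [row[column] for row in reader if row.get(column)]
```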
Signed-off-by: Mengni Wang <mengni.wang@intel.com>
for more information, see https://pre-commit.ci
5f6621e to a146183 (compare)
test/test_cuda/test_diffusion.py
Outdated
continue
match = re.search(r"blocks\.(\d+)", n)
if match and int(match.group(1)) > 0:
    layer_config[n] = {"bits": 16, "act_bits": 16, "data_type": "float", "act_data_type": "float"}
@WeiweiZhang1 I forgot the reason why we need to set the data_type. Could we fix this issue? I think bits=16 is enough for fp layers.
Updated; both bits and act_bits are needed.
@WeiweiZhang1 Could you help fix the act_bits issue later? It shouldn’t need to be set manually.
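The configuration discussed above can be sketched as a small helper. This mirrors the test snippet's regex; per the thread, only `bits` and `act_bits` should be needed once the data_type handling is fixed (an assumption based on this discussion, and the `build_layer_config` name is hypothetical):

```python
import re


def build_layer_config(layer_names: list[str]) -> dict[str, dict[str, int]]:
    """Keep every transformer block after the first in 16-bit precision.

    Sketch of the layer_config construction from the test snippet above:
    layers whose name contains `blocks.<idx>` with idx > 0 are marked to
    stay in 16-bit, so only block 0 is quantized.
    """
    layer_config = {}
    for n in layer_names:
        match = re.search(r"blocks\.(\d+)", n)
        if match and int(match.group(1)) > 0:
            layer_config[n] = {"bits": 16, "act_bits": 16}
    return layer_config
```

The resulting dict would be passed to AutoRound as the per-layer override, leaving unmatched layers to the global quantization settings.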
@wenhuach21 CI has passed, can we merge this PR?