Support Flux quantization #850
base: main
Conversation
@@ -0,0 +1,3 @@
diffusers
Don’t we need to add these requirements to our repo’s requirements file? If not, we should provide an `auto-round[diffusion]` installation extra instead, right? Please check with suyue/xuhao.
If those deps are not strictly required by auto_round, we can raise an error when the user hits that code path and ask them to install the packages.
Please remind users that this is an experimental feature and has only been validated on a limited set of models, e.g. xxx
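The lazy-check pattern suggested above can be sketched as follows. This is a hypothetical helper, not auto-round's actual code; the `require_optional` name, the `diffusion` extra, and the wording of the message are assumptions based on this thread.

```python
import importlib.util


def require_optional(pkg: str, extra: str) -> None:
    """Raise a helpful error if an optional dependency is missing.

    Hypothetical sketch of the pattern discussed in this thread: instead of
    hard-pinning optional deps in requirements, check for them lazily and
    point the user at the matching pip extra.
    """
    if importlib.util.find_spec(pkg) is None:
        raise ImportError(
            f"This feature requires the optional `{pkg}` package. "
            f"Install it with `pip install auto-round[{extra}]`. "
            "Note: diffusion-model quantization is experimental and has "
            "only been validated on a limited set of models."
        )
```

A caller would invoke `require_optional("diffusers", "diffusion")` at the top of the diffusion code path, so users who never touch it pay no dependency cost.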
--prompt_file "captions_source.tsv" @n1ck-guo @yiliu30 Since there are so many args in main.py, we need a better way to present them in `auto-round -h`.
--prompt_file and --metrics are for evaluation. It is hard to wrap prompt_file because there is no standard dataset on Hugging Face, and other repos usually ask users to prepare the dataset themselves: https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/examples/diffusers/README.md#data-format
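Since users prepare the prompt TSV themselves, loading it is a few lines. A minimal sketch, assuming the file has a header row and a `caption` column (the column name is an assumption; the actual layout of `captions_source.tsv` is not specified in this thread):

```python
import csv


def load_prompts(tsv_path: str, column: str = "caption") -> list[str]:
    """Read evaluation prompts from a tab-separated file.

    Assumes a header row naming the columns; the default column name
    `caption` is a guess for illustration, not a documented format.
    """
    with open(tsv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t")
        # Skip rows where the prompt column is empty or absent.
        return [row[column] for row in reader if row.get(column)]
```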
Signed-off-by: Mengni Wang <mengni.wang@intel.com>
for more information, see https://pre-commit.ci
5f6621e to a146183 (compare)
test/test_cuda/test_diffusion.py
Outdated
continue
match = re.search(r"blocks\.(\d+)", n)
if match and int(match.group(1)) > 0:
    layer_config[n] = {"bits": 16, "act_bits": 16, "data_type": "float", "act_data_type": "float"}
@WeiweiZhang1 I forgot the reason why we need to set the data_type. Could we fix this issue? I think bits=16 is enough for fp layers.
Updated; both bits and act_bits are needed.
@WeiweiZhang1 Could you help fix the act_bits issue later? It shouldn’t need to be set manually.
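The configuration discussed above can be sketched as a small helper. This mirrors the test snippet's regex; per the thread, only `bits` and `act_bits` should be needed once the data_type handling is fixed (an assumption based on this discussion, and the `build_layer_config` name is hypothetical):

```python
import re


def build_layer_config(layer_names: list[str]) -> dict[str, dict[str, int]]:
    """Keep every transformer block after the first in 16-bit precision.

    Sketch of the layer_config construction from the test snippet above:
    layers whose name contains `blocks.<idx>` with idx > 0 are marked to
    stay in 16-bit, so only block 0 is quantized.
    """
    layer_config = {}
    for n in layer_names:
        match = re.search(r"blocks\.(\d+)", n)
        if match and int(match.group(1)) > 0:
            layer_config[n] = {"bits": 16, "act_bits": 16}
    return layer_config
```

The resulting dict would be passed to AutoRound as the per-layer override, leaving unmatched layers to the global quantization settings.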
@wenhuach21 CI has passed, can we merge this PR?