
Refactor the quantization config for weights #399

Closed
thincal opened this issue Apr 9, 2024 · 0 comments · Fixed by #400
thincal (Contributor) commented Apr 9, 2024

Feature request

Currently, the quantization info is hacked into the weights in several different ways, which makes it hard to add more quantization features:

  1. A special function, `_set/get_gptq_params`, is used, but it is also applied to awq and eetq, which makes the code very confusing to maintain.
  2. The version info needed to support multiple awq quantization versions is missing from the current implementation.
  3. When loading quantized weights for inference (e.g. with the eetq method), there is no elegant way to know whether the weights are quantized or not.

Instead, we could simply bind the config info to the weights, so that the desired info can be retrieved directly from the weights object (see the sketch below).
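
As a minimal sketch of what binding the config to the weights could look like (all names here are hypothetical illustrations, not the actual server API):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuantizationConfig:
    # Hypothetical container for the info currently spread across
    # _set/get_gptq_params-style helpers.
    quantize: Optional[str] = None   # e.g. "gptq", "awq", "eetq", or None
    bits: Optional[int] = None
    groupsize: Optional[int] = None
    version: Optional[str] = None    # e.g. the awq version info from point 2


class Weights:
    def __init__(self, filenames, device, dtype,
                 quantization_config: Optional[QuantizationConfig] = None):
        self.filenames = filenames
        self.device = device
        self.dtype = dtype
        # The config is bound once at construction instead of being
        # re-derived inside each weight-loading helper.
        self.quantization_config = quantization_config or QuantizationConfig()

    @property
    def is_quantized(self) -> bool:
        # Point 3 above: a direct way to ask whether weights are quantized.
        return self.quantization_config.quantize is not None
```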

Motivation

A refactor to support more quantization features.

Your contribution

Yes, I have prepared a PR.
