
Refactor the quantization config for weights #399

Closed
thincal opened this issue Apr 9, 2024 · 0 comments · Fixed by #400
thincal (Contributor) commented Apr 9, 2024

Feature request

Currently, the quantization info is hacked into the weights in several different ways, which makes it hard to add more quantization features:

  1. A special function, `_set/get_gptq_params`, is used, but it is also applied to awq and eetq, which makes the code very confusing to maintain.
  2. The version info needed to support multiple awq quantization versions is missing from the current implementation.
  3. When loading quantized weights for inference (e.g. with the eetq method), there is no elegant way to know whether the weights are quantized or not.

Instead, we could simply bind the config info to the weights, so that the desired info can be retrieved directly from the weights object (see the sketch below).
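
As a minimal sketch of what binding the config to the weights could look like (all names here are hypothetical illustrations, not the actual server API):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuantizationConfig:
    # Hypothetical container for the info currently spread across
    # _set/get_gptq_params-style helpers.
    quantize: Optional[str] = None   # e.g. "gptq", "awq", "eetq", or None
    bits: Optional[int] = None
    groupsize: Optional[int] = None
    version: Optional[str] = None    # e.g. the awq version info from point 2


class Weights:
    def __init__(self, filenames, device, dtype,
                 quantization_config: Optional[QuantizationConfig] = None):
        self.filenames = filenames
        self.device = device
        self.dtype = dtype
        # The config is bound once at construction instead of being
        # re-derived inside each weight-loading helper.
        self.quantization_config = quantization_config or QuantizationConfig()

    @property
    def is_quantized(self) -> bool:
        # Point 3 above: a direct way to ask whether weights are quantized.
        return self.quantization_config.quantize is not None
```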

Motivation

A refactor to support more quantization features.

Your contribution

Yes, I have prepared a PR.
