More fixes needed when QuantizationConfig is None #184
Conversation
@@ -104,6 +104,8 @@ def from_pretrained(
         """
         config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
         compression_config = getattr(config, COMPRESSION_CONFIG_NAME, None)
+        if compression_config is None:
+            compression_config = getattr(config, QUANTIZATION_CONFIG_NAME, None)
Not required. See pathway used by HFQuantizer:
https://github.com/huggingface/transformers/blob/55be7c4c483a01a7e03e55a8756fc4385ec08ffc/src/transformers/quantizers/quantizer_compressed_tensors.py#L40
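For context on what the hunk above does, a self-contained sketch of the fallback lookup; the wrapper name, the constant values, and the `SimpleNamespace` stand-in for the HF `AutoConfig` object are all assumptions here, not the real compressed-tensors code:

```python
from types import SimpleNamespace

# Assumed values; in compressed-tensors these are module-level constants.
COMPRESSION_CONFIG_NAME = "compression_config"
QUANTIZATION_CONFIG_NAME = "quantization_config"

def resolve_compression_config(config):
    """Prefer the compression config; fall back to the quantization config."""
    compression_config = getattr(config, COMPRESSION_CONFIG_NAME, None)
    if compression_config is None:
        # Checkpoints may store the settings under the quantization key instead.
        compression_config = getattr(config, QUANTIZATION_CONFIG_NAME, None)
    return compression_config

# A config object that only carries `quantization_config` still resolves.
config = SimpleNamespace(quantization_config={"format": "dense"})
assert resolve_compression_config(config) == {"format": "dense"}
```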
Force-pushed from a6dfdcc to ecc49fc
""" | ||
if config.kv_cache_scheme is not None: | ||
if config is not None and config.kv_cache_scheme is not None: |
Do we use this function outside of `apply_quantization_config`? If not, we would never hit the `None` case?
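Regardless of where the call originates, the guard in the hunk above is a standard short-circuit `None` check; a hypothetical minimal reproduction, where `QuantizationConfig` is a stripped-down stand-in and `has_kv_cache_scheme` is an illustrative helper rather than the real function:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class QuantizationConfig:
    """Stand-in: the real class carries many more fields."""
    kv_cache_scheme: Optional[dict] = None

def has_kv_cache_scheme(config: Optional[QuantizationConfig]) -> bool:
    # The `config is not None` check must come first so a sparsity-only
    # model (which has no quantization config at all) short-circuits
    # instead of raising AttributeError on `config.kv_cache_scheme`.
    return config is not None and config.kv_cache_scheme is not None

assert has_kv_cache_scheme(None) is False
assert has_kv_cache_scheme(QuantizationConfig()) is False
assert has_kv_cache_scheme(QuantizationConfig(kv_cache_scheme={"num_bits": 8}))
```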
@@ -189,14 +190,14 @@ def apply_quantization_config(
     return names_to_scheme


-def process_quantization_config(config: QuantizationConfig) -> QuantizationConfig:
+def process_quantization_config(config: Optional[QuantizationConfig]) -> Optional[QuantizationConfig]:
Seems like the only use of `process_quantization_config` is by `apply_quantization_config`. What is the purpose of this change?

This was relevant to the case of reloading a compressed model; per offline discussion, this is not needed.
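For reference, a sketch of what the widened `Optional` signature would permit; the early-return body is an assumption rather than the actual implementation, and per the thread above the change was ultimately dropped:

```python
from typing import Optional

def process_quantization_config(
    config: Optional["QuantizationConfig"],  # forward ref; real class lives in compressed-tensors
) -> Optional["QuantizationConfig"]:
    # Assumed handling: propagate None untouched so the sole caller,
    # apply_quantization_config, decides how to treat a missing config.
    if config is None:
        return None
    # ...existing processing of a non-None config would go here...
    return config
```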
Description taken from #180:

`CompressedTensorsHfQuantizer` attempts to use `apply_quantization_config` to apply the quantization config to the model. However, `self.compressor.quantization_config` can be `None` in the case that only sparsity is present, which does not align with the function contract of `apply_quantization_config`; in such a case, an error is thrown.

This PR builds on top of #180 and adds more bugfixes needed for the case when a `None` quantization config is passed in by the `HfQuantizer`.
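To make the failure mode concrete, a hypothetical minimal reproduction; both classes and the function body below are stand-ins rather than the real compressed-tensors code:

```python
from typing import Optional

class Compressor:
    """Stand-in: a compressor loaded from a sparsity-only checkpoint."""
    quantization_config: Optional[dict] = None  # only sparsity present

def apply_quantization_config(model: object, config: dict) -> dict:
    # Pre-fix contract: `config` is assumed to be non-None, so any
    # attribute or key access raises when a sparsity-only checkpoint
    # supplies None.
    return {name: scheme for name, scheme in config.items()}

compressor = Compressor()
try:
    # Mirrors the HfQuantizer call path described above.
    apply_quantization_config(model=object(), config=compressor.quantization_config)
except AttributeError as err:
    print(f"error thrown for sparsity-only models: {err}")
```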