@rahul-tuli
Collaborator

Description taken from #180

CompressedTensorsHfQuantizer attempts to use apply_quantization_config to apply the quantization config to the model:

def _process_model_before_weight_loading(self, model, **kwargs):
    from compressed_tensors.quantization import apply_quantization_config

    ct_quantization_config = self.compressor.quantization_config
    apply_quantization_config(model, ct_quantization_config, run_compressed=True)

However, self.compressor.quantization_config can be None when only sparsity is present. This violates the function contract of apply_quantization_config, which expects a QuantizationConfig, so an error is thrown.

def apply_quantization_config(
    model: Module, config: QuantizationConfig, run_compressed: bool = False
) -> Dict:
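
The failure mode described above can be sketched with stand-in classes instead of the real compressed_tensors types. The names `FakeQuantizationConfig`, `apply_config`, and `apply_config_guarded` are hypothetical, and the caller-side guard is only one possible fix, not necessarily the one taken in #180.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FakeQuantizationConfig:
    """Hypothetical stand-in for QuantizationConfig."""
    kv_cache_scheme: Optional[str] = None


def apply_config(model: dict, config: FakeQuantizationConfig) -> Dict:
    # Mirrors the contract above: config is assumed to be a real object.
    # Passing None here raises AttributeError on the attribute access.
    model["kv_cache_scheme"] = config.kv_cache_scheme
    return model


def apply_config_guarded(model: dict, config: Optional[FakeQuantizationConfig]) -> Dict:
    # Caller-side guard: skip quantization entirely when no config exists,
    # e.g. when the checkpoint carries only a sparsity config.
    if config is None:
        return model
    return apply_config(model, config)
```

With this guard in the caller, the callee never receives None, so its signature can keep the non-optional type.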

This PR builds on top of #180 and adds further bugfixes needed for the case where the HfQuantizer passes in a None quantization config.

config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
compression_config = getattr(config, COMPRESSION_CONFIG_NAME, None)
if compression_config is None:
    compression_config = getattr(config, QUANTIZATION_CONFIG_NAME, None)
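
The fallback lookup in the snippet above can be tried in isolation. This is a sketch only: the two constant values and the helper name `find_compression_config` are assumptions made for illustration, and a `SimpleNamespace` stands in for the object returned by AutoConfig.from_pretrained.

```python
from types import SimpleNamespace
from typing import Any, Optional

# Assumed attribute names, for illustration only.
COMPRESSION_CONFIG_NAME = "compression_config"
QUANTIZATION_CONFIG_NAME = "quantization_config"


def find_compression_config(config: Any) -> Optional[Any]:
    # Prefer the compression config attribute when it is set.
    compression_config = getattr(config, COMPRESSION_CONFIG_NAME, None)
    if compression_config is None:
        # Fall back to the quantization config attribute.
        compression_config = getattr(config, QUANTIZATION_CONFIG_NAME, None)
    return compression_config


# Usage: a config that only carries the fallback attribute.
hf_config = SimpleNamespace(quantization_config={"format": "dense"})
found = find_compression_config(hf_config)
```

If neither attribute is present the helper returns None, which is the value the rest of the loading path then has to tolerate.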
Collaborator

@dsikka dsikka Oct 7, 2024

:return: the processed QuantizationConfig, if the raw config is not None
"""
-    if config.kv_cache_scheme is not None:
+    if config is not None and config.kv_cache_scheme is not None:

Do we use this function outside of apply_quantization_config? If not, we would never hit the None case?



- def process_quantization_config(config: QuantizationConfig) -> QuantizationConfig:
+ def process_quantization_config(config: Optional[QuantizationConfig]) -> Optional[QuantizationConfig]:

Seems like the only use of process_quantization_config is by apply_quantization_config.
What is the purpose of this change?
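
For reference, the callee-side alternative that the review comments question might look like the sketch below. `StubConfig` and `process_config` are hypothetical stand-ins, and the `processed` flag is a placeholder for whatever the real function does with a kv cache scheme.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubConfig:
    """Hypothetical stand-in for QuantizationConfig."""
    kv_cache_scheme: Optional[str] = None
    processed: bool = False


def process_config(config: Optional[StubConfig]) -> Optional[StubConfig]:
    # The `is not None` check short-circuits, so no attribute is read on None.
    if config is not None and config.kv_cache_scheme is not None:
        config.processed = True  # placeholder for the real processing step
    return config
```

This version returns None unchanged. If the only caller already rejects or skips a None config, the None branch is never reached, which is the point the reviewers raise.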

Base automatically changed from kylesayrs/bugfix-support-apply-none to main October 7, 2024 16:55
@rahul-tuli
Collaborator Author

This was relevant to the case of reloading a compressed model. As per offline discussion, it is not needed.

@rahul-tuli rahul-tuli closed this Oct 8, 2024
@rahul-tuli rahul-tuli deleted the more-bugfixes branch January 23, 2025 14:51
4 participants