
Feat (mx): gptq compatibility and quant tests #1013

Merged
13 commits merged into Xilinx:dev on Sep 5, 2024

Conversation

Giuseppe5 (Collaborator) commented Aug 28, 2024

Depends on #1011 and #1012

Added tests for MX and FloatQuant, with some restrictions:

  • Neither supports external bias quantization at the moment
  • Compatibility with MHA is limited due to how the various quant options interact with each other
  • MX quantizers need to be paired to account for the possibility of padding (to be revisited post-release); see the sketch after this list
  • MX quantizers are not JIT compatible
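
As a concrete illustration of the padding point above, here is a minimal sketch in plain PyTorch (not Brevitas code): when the grouped dimension is not a multiple of the group size, the tensor must be zero-padded before it can be split into groups, which is why the paired MX quantizers have to agree on the padded shape. The helper name pad_to_group_size and the group size of 32 are assumptions made for the example.

    import torch
    import torch.nn.functional as F

    def pad_to_group_size(x: torch.Tensor, group_size: int = 32) -> torch.Tensor:
        # Zero-pad the last dimension up to the next multiple of group_size.
        remainder = x.shape[-1] % group_size
        if remainder != 0:
            x = F.pad(x, (0, group_size - remainder))
        return x

    w = torch.randn(8, 50)              # 50 is not a multiple of 32
    w_padded = pad_to_group_size(w)     # padded to shape (8, 64)
    groups = w_padded.view(8, -1, 32)   # one shared scale per group of 32 values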

@Giuseppe5 changed the title from "Feat (mx): adding gptq and quant tests" to "Feat (mx): gptq compatibility and quant tests" on Aug 31, 2024
@Giuseppe5 Giuseppe5 requested review from nickfraser and removed request for nickfraser August 31, 2024 11:46
nickfraser (Collaborator) left a comment


Two small comments, otherwise LGTM!

@@ -14,6 +14,11 @@ def group_dim(self):
    def group_size(self):
        return self.quant_injector.group_size

    def apply_input_view(self, x):

Worth making a GroupwiseQuantProxyMixin which provides these functions? Looks like a lot is shared between GroupwiseWeightQuantProxyFromInjector & GroupwiseActQuantProxyFromInjector...
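
For reference, a rough sketch of what such a mixin could look like (hypothetical, not the Brevitas implementation; the group_dim/group_size accessors mirror the diff above, and the apply_input_view body is an assumed placeholder):

    class GroupwiseQuantProxyMixin:
        # Hypothetical shared mixin; assumes the host proxy exposes quant_injector.

        @property
        def group_dim(self):
            return self.quant_injector.group_dim

        @property
        def group_size(self):
            return self.quant_injector.group_size

        def apply_input_view(self, x):
            # Placeholder: split the grouped dimension into (num_groups, group_size).
            return x.unflatten(self.group_dim, (-1, self.group_size))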

After further review, I seem to recall there was some issue with Proxy Mixins (perhaps with dependency injection) that resulted in the ExportMixin needing to be in a certain location. I'll leave this for now - we can revisit at another time.

@@ -232,7 +238,8 @@ def process_input(self, inp):
        if isinstance(inp, IntQuantTensor):
            if is_quant_enabled and self.quant_metadata is None:
                self.quant_metadata = _CachedIO(inp, metadata_only=True)
            inp = inp.value
        if isinstance(inp, QuantTensor):

When could this not be True? I see you've added this check, but it's not obvious to me. Can inp get modified in-place by _CachedIO?

I think the answer to my question is "no", so I've removed that check in 1f0933f.

nickfraser (Collaborator) commented Sep 4, 2024

Rebased after #1012 was merged. Note: I had to resolve some merge conflicts, but I think I made all the right choices about which bits to keep or not...

@nickfraser nickfraser self-requested a review September 4, 2024 14:29
nickfraser (Collaborator) commented
The failing tests seem to be a GitHub Actions issue rather than a real issue. I can run the failing test suite (nox -s "tests_brevitas_cpu-3.9(jit_disabled, pytorch_2.1.0)") locally on a Windows machine.

Merging, and will open another issue if this becomes a regular thing.

@nickfraser nickfraser merged commit d4834bd into Xilinx:dev Sep 5, 2024
336 of 337 checks passed
2 participants