
[BUG] Bitblas Kernel not compatible with bitblas > 0.0.1-dev13 #1192

Open
Qubitium opened this issue Jan 31, 2025 · 1 comment

@Qubitium (Collaborator) commented Jan 31, 2025

@LeiWang1999 Happy Chinese New Year! When you have time, could you point us to, or help us find, the cause of the bitblas v0.1.0 issue? Thanks!

For the GPTQModel v1.8.0-dev release I am trying to migrate gptqmodel's version-locked bitblas support from the working 0.0.1-dev13 to 0.1.0, but I am running into multiple issues.

It appears that since bitblas >= 0.0.1-dev14 the bitblas kernel is no longer compatible and throws shape mismatch errors.

I tried to migrate the bitblas kernel and pack code to bitblas 0.1.0 in PR #1184 (now partially reverted to the 0.0.1-dev13 state), but it failed with a segmentation fault (core dumped) at self.bitblas_matmul.lib.call(). 0.1.0 no longer has call_lib; we see there is a lib.call, but trying to use it just segfaults.
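
For reference, the bitblas README documents a high-level operator path (bitblas.MatmulConfig + bitblas.Matmul) rather than the raw lib.call entry point; a minimal sketch of that path follows, in case the crash is specific to the low-level library call. The shapes and dtypes below are illustrative assumptions, not GPTQModel's actual layer config:

```python
# Minimal sketch of the high-level bitblas operator API (per the bitblas
# README); shapes/dtypes are illustrative assumptions, not GPTQModel's config.
import torch
import bitblas

matmul_config = bitblas.MatmulConfig(
    M=1,                    # assumed batch dim
    N=1024,                 # assumed output features
    K=1024,                 # assumed input features
    A_dtype="float16",
    W_dtype="int4",         # 4-bit weights, as in GPTQ
    accum_dtype="float16",
    out_dtype="float16",
    layout="nt",
    with_bias=False,
)
matmul = bitblas.Matmul(config=matmul_config)

input_tensor = torch.rand((1, 1024), dtype=torch.float16).cuda()
weight_tensor = torch.randint(0, 7, (1024, 1024), dtype=torch.int8).cuda()

# Repack the quantized weight into the kernel's expected layout.
weight_int4 = matmul.transform_weight(weight_tensor)

# Calling the operator dispatches the compiled kernel (no lib.call needed).
output = matmul(input_tensor, weight_int4)
```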

Please use the following code to replicate, under each of these bitblas versions:

  1. bitblas 0.0.1-dev13 <-- working
  2. bitblas 0.0.1-dev14 <-- shape exceptions
  3. bitblas 0.1.0 (requires a checkout of PR Update/Refractor Bitblas/Marlin/Cuda #1184) <-- segfaults: https://github.com/ModelCloud/GPTQModel/pull/1184/files#diff-47042b63158eec98a67b6a85c5a28e5c0623b67fce9823ab425156f4d71e738a
```python
from gptqmodel import GPTQModel, BACKEND

model = GPTQModel.load(
    "ModelCloud/Qwen2.5-0.5B-Instruct-ci-test-bitblas",
    backend=BACKEND.BITBLAS,  # change to BACKEND.TORCH or BACKEND.MARLIN for normal generation
)

print(model.tokenizer.decode(model.generate("What is the capital of the United States?")[0]))
```
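
A tiny version check (a hypothetical helper, not part of GPTQModel or the repro) can be run first so each result is matched to the right case above; note that pip normalizes 0.0.1-dev13 to 0.0.1.dev13:

```python
# Hypothetical helper, not part of GPTQModel: print the installed bitblas
# version so the outcome can be matched to the three cases above.
from importlib.metadata import version

print("bitblas", version("bitblas"))  # e.g. 0.0.1.dev13 (works), 0.0.1.dev14 (shape errors), 0.1.0 (segfault)
```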

p.s. GPTQModel was recently merged into HF Transformers main, so the next non-patch Transformers release should have wide support for optional bitblas acceleration for all GPTQ models via a backend toggle!

@Qubitium added the bug label on Jan 31, 2025
@LeiWang1999 (Contributor) commented
sure, I'll take a look
