
GPTQ - Move quantized_model to CUDA device #1535

Closed
samuel100 wants to merge 3 commits

Conversation

samuel100
Contributor

Describe your changes

When using GPTQ, the quantized_model must be moved to the CUDA device to avoid the "Expected all tensors to be on the same device" error in auto-gptq. See AutoGPTQ/AutoGPTQ#729.
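
For context, a minimal sketch of the kind of change this PR proposed, assuming `quantized_model` is the model object returned by auto-gptq's quantization (the `move_to_cuda` helper and surrounding names are illustrative, not Olive's actual pass code):

```python
import torch

def move_to_cuda(quantized_model):
    """Move the quantized model to CUDA so its weights and the
    inference inputs end up on the same device."""
    if torch.cuda.is_available():
        # Without this, auto-gptq can raise "Expected all tensors to be
        # on the same device" when the model stays on CPU while inputs
        # are placed on the GPU.
        quantized_model = quantized_model.to("cuda")
    return quantized_model
```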

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
  • Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.

@xiaoyu-work
Contributor

According to the discussion thread, it seems this was already fixed by AutoGPTQ/AutoGPTQ#607?

@jambayk
Contributor

jambayk commented Jan 6, 2025

As @xiaoyu-work mentioned, this issue should be fixed if you install autogptq from source: https://github.com/AutoGPTQ/AutoGPTQ?tab=readme-ov-file#install-from-source. Could you try it to see if it works?
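
For anyone landing here, the generic install-from-source steps look like this (a sketch only; the linked README is authoritative and may recommend additional pip flags):

```sh
git clone https://github.com/AutoGPTQ/AutoGPTQ.git
cd AutoGPTQ
pip install -e .  # editable install from source; see the README for extra flags
```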

@samuel100
Contributor Author

samuel100 commented Jan 7, 2025

@xiaoyu-work @jambayk -- I tested building from source and can confirm that it fixed the issue. I'll close this PR and create a new one that updates the documentation with instructions.

samuel100 closed this Jan 7, 2025