
Add exllamav2 better #27111

Merged · 38 commits · Nov 1, 2023
Conversation

@SunMarc (Member) commented on Oct 27, 2023

What does this PR do?

This PR is a modified version of an earlier PR that makes disable_exllama go through a deprecation cycle.

I also fixed the test test_device_and_dtype_assignment, introduced by that PR, which broke other tests in the CI.

I confirm that all the tests are green.
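
For reference, the user-facing API after this PR looks roughly like the sketch below (a minimal sketch based on the merged state; the checkpoint name is just an example):

```python
from transformers import AutoModelForCausalLM, GPTQConfig

# Default: the exllama (v1) kernels are used for 4-bit GPTQ weights.
gptq_config = GPTQConfig(bits=4)

# Opt in to the exllamav2 kernels via the exllama_config dict.
gptq_config = GPTQConfig(bits=4, exllama_config={"version": 2})

# Turn the kernels off entirely; replaces the deprecated disable_exllama=True.
gptq_config = GPTQConfig(bits=4, use_exllama=False)

model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-GPTQ",  # example checkpoint, any GPTQ model works
    quantization_config=gptq_config,
    device_map="auto",
)
```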

@HuggingFaceDocBuilderDev commented on Oct 27, 2023

The documentation is not available anymore as the PR was closed or merged.

@SunMarc (Member, Author) commented on Oct 27, 2023

Since @ArthurZucker is out next week, it would be great if you could review this PR, @amyeroberts. I'm trying to get this PR into the next release. In this modified version, I make sure to deprecate the disable_exllama arg in favor of use_exllama.

@amyeroberts (Collaborator) left a comment

Thanks for adding this!

Two general comments:

  • Having two additional config arguments isn't ideal. Ideally there should be a single parameter which configures this behaviour and is "off" when not set.
  • There needs to be argument verification for safe deprecation of the old flag (see the sketch after this list).
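
A minimal sketch of that verification, assuming a standalone helper (hypothetical; the merged PR implements the equivalent checks inside the config class itself):

```python
import warnings


def resolve_use_exllama(disable_exllama=None, use_exllama=None):
    # Hypothetical helper: reconcile the deprecated flag with the new one.
    if disable_exllama is not None and use_exllama is not None:
        raise ValueError(
            "Cannot specify both `disable_exllama` and `use_exllama`; "
            "please use only `use_exllama`."
        )
    if disable_exllama is not None:
        warnings.warn(
            "`disable_exllama` is deprecated and will be removed in a future "
            "version; use `use_exllama` instead.",
            FutureWarning,
        )
        use_exllama = not disable_exllama
    # Fall back to the library default when neither flag was set.
    return True if use_exllama is None else use_exllama
```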

Review threads:
  • docs/source/en/main_classes/quantization.md (resolved)
  • docs/source/en/main_classes/quantization.md (outdated, resolved)
  • src/transformers/utils/quantization_config.py (outdated, resolved) ×4
  • src/transformers/utils/quantization_config.py (resolved) ×2
SunMarc and others added 3 commits October 30, 2023 14:43
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
@SunMarc (Member, Author) commented on Oct 31, 2023

Thanks for the review @amyeroberts. I've addressed all the points. LMK if something is missing!

@amyeroberts (Collaborator) left a comment

Thanks for iterating on this!

There's at least one more iteration needed on the input argument logic, but we're close! Otherwise the code and PR look very good.

Review threads:
  • src/transformers/utils/quantization_config.py (resolved) ×2
  • src/transformers/utils/quantization_config.py (outdated, resolved) ×4
  • src/transformers/modeling_utils.py (outdated, resolved)
SunMarc and others added 2 commits October 31, 2023 15:48
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
@SunMarc (Member, Author) commented on Oct 31, 2023

Thanks for the deep review @amyeroberts! I've added the input logic and simplified the link with the optimum config.
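
To illustrate the "link with the optimum config": the transformers-side flags get translated back into the shape optimum's GPTQ quantizer understands. A rough sketch, where the class, method name, and optimum-side field names are assumptions for illustration rather than the merged code:

```python
class GPTQConfigSketch:
    """Illustrative subset of the quantization config after this PR."""

    def __init__(self, bits, use_exllama=None, exllama_config=None):
        self.bits = bits
        # Exllama kernels stay on by default when the flag is left unset.
        self.use_exllama = True if use_exllama is None else use_exllama
        self.exllama_config = exllama_config or {"version": 1}

    def to_optimum_dict(self):
        # optimum's quantizer still speaks `disable_exllama`, so the new
        # user-facing flag is mapped back onto it for compatibility.
        return {
            "bits": self.bits,
            "disable_exllama": not self.use_exllama,
            "exllama_config": self.exllama_config,
        }
```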

@amyeroberts (Collaborator) left a comment

Great - thanks for the work iterating on this!

Just a few nits where the docs need to be updated. Otherwise LGTM!

Review threads:
  • src/transformers/utils/quantization_config.py (outdated, resolved) ×2
  • docs/source/en/main_classes/quantization.md (outdated, resolved) ×2
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
@SunMarc (Member, Author) commented on Nov 1, 2023

Thanks again @amyeroberts for iterating on this PR in such a short time!

@SunMarc merged commit c9e72f5 into huggingface:main on Nov 1, 2023
21 checks passed
EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 19, 2023
* add_ xllamav2 arg

* add test

* style

* add check

* add doc

* replace by use_exllama_v2

* fix tests

* fix doc

* style

* better condition

* fix logic

* add deprecate msg

* deprecate exllama

* remove disable_exllama from the linter

* remove

* fix warning

* Revert the commits deprecating exllama

* deprecate disable_exllama for use_exllama

* fix

* fix loading attribute

* better handling of args

* remove disable_exllama from init and linter

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* better arg

* fix warning

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* switch to dict

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* style

* nits

* style

* better tests

* style

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>