-
Notifications
You must be signed in to change notification settings - Fork 27.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix quantization tests #29914
Fix quantization tests #29914
Conversation
custom_mapping_model_id = "TheBloke/Mistral-7B-v0.1-AWQ" | ||
custom_model_revision = "f186bcfa9edbe2a4334262ec1e67f23e53ed1ae7" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed to a mistral model since the yi model is based on remote code and we had breaking changes ... (Dynamic Cache). This should be fine since we want to test if we can pass custom fuse mapping.
@@ -43,7 +46,8 @@ RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/opt | |||
RUN python3 -m pip install --no-cache-dir aqlm[gpu]==1.0.2 | |||
|
|||
# Add autoawq for quantization testing | |||
RUN python3 -m pip install --no-cache-dir https://github.com/casper-hansen/AutoAWQ/releases/download/v0.2.0/autoawq-0.2.0+cu118-cp38-cp38-linux_x86_64.whl | |||
# >=v0.2.3 needed for compatibility with torch 2.2.1 | |||
RUN python3 -m pip install --no-cache-dir https://github.com/casper-hansen/AutoAWQ/releases/download/v0.2.3/autoawq-0.2.3+cu118-cp38-cp38-linux_x86_64.whl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With torch2.2.1, we need a newer version of autoawq. Otherwise, we have issues importing the kernels =(
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
This reverts commit 399a5f9.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally would probably try not to have new notification files, but I understand it's not easy to make the necessary changes in the currently existing notification_service.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, at the end of the day, I think it is easier to recreate a notification file since this is a different CI and I wanted to have a detailed overview of the results just like what you did for models. Putting everything in notification_service.py
can be done and, it will be a little bit complex.
Thank you @SunMarc on this work, super appreciated! From the run page: https://github.com/huggingface/transformers/actions/runs/8483098195/job/23243697001 I don't have specific question but just a slight question on the notification stuff. |
text = f"{self.n_failures} failures out of {self.n_tests} tests," if self.n_failures else "All tests passed." | ||
|
||
self.thread_ts = client.chat_postMessage( | ||
channel="#transformers-ci-daily-quantization", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, I didn't know this works. Nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I learned this from the peft/accelerate CI log report here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good on my side. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much @SunMarc for fixing the quantization tests and refactoring the workflow!
Feel free to merge your PR @ydshieh ! Yes, that would be great if you could resolve the conflits if it is not too complicated for you ! |
Happy to help - if it is difficult for me , I guess it would be even difficult to you 😆 (to understand what I changed in my PR) |
@ArthurZucker No need to review until I request a reivew (after rebasing this PR on #30012) |
Back to this next week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good thank you 🙂
run: | | ||
echo "quantization_matrix=$(python3 -c 'import os; tests = os.getcwd(); quantization_tests = os.listdir(os.path.join(tests, "quantization")); d = sorted(list(filter(os.path.isdir, [f"quantization/{x}" for x in quantization_tests]))) ; print(d)')" >> $GITHUB_OUTPUT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(py39) arthur@hf-dgx-01:~/transformers/tests$ find "$(pwd)/quantization" -maxdepth 1 -type d | sort
/home/arthur/transformers/tests/quantization
/home/arthur/transformers/tests/quantization/aqlm_integration
/home/arthur/transformers/tests/quantization/autoawq
/home/arthur/transformers/tests/quantization/bnb
/home/arthur/transformers/tests/quantization/gptq
/home/arthur/transformers/tests/quantization/quanto_integration
seems a lot simpler no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$(find "$(pwd)/quantization" -mindepth 1 -maxdepth 1 -type d | sort)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes indeed ! I'll switch to your solution =)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure we don't have the prefix (here /home/arthur/transformers/
but whatever it is in a system) in the outputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again @SunMarc !
I will keep the new notification file at this moment as you created. Integrate it to the existing one is good but not urgent (and it takes some time 😆 )
* revert back to torch 2.1.1 * run test * switch to torch 2.2.1 * udapte dockerfile * fix awq tests * fix test * run quanto tests * update tests * split quantization tests * fix * fix again * final fix * fix report artifact * build docker again * Revert "build docker again" This reverts commit 399a5f9. * debug * revert * style * new notification system * testing notfication * rebuild docker * fix_prev_ci_results * typo * remove warning * fix typo * fix artifact name * debug * issue fixed * debug again * fix * fix time * test notif with faling test * typo * issues again * final fix ? * run all quantization tests again * remove name to clear space * revert modfiication done on workflow * fix * build docker * build only quant docker * fix quantization ci * fix * fix report * better quantization_matrix * add print * revert to the basic one
* revert back to torch 2.1.1 * run test * switch to torch 2.2.1 * udapte dockerfile * fix awq tests * fix test * run quanto tests * update tests * split quantization tests * fix * fix again * final fix * fix report artifact * build docker again * Revert "build docker again" This reverts commit 399a5f9. * debug * revert * style * new notification system * testing notfication * rebuild docker * fix_prev_ci_results * typo * remove warning * fix typo * fix artifact name * debug * issue fixed * debug again * fix * fix time * test notif with faling test * typo * issues again * final fix ? * run all quantization tests again * remove name to clear space * revert modfiication done on workflow * fix * build docker * build only quant docker * fix quantization ci * fix * fix report * better quantization_matrix * add print * revert to the basic one
What does this PR do ?
This PR fixes all failing tests in the quantization CI. We also split the workflow so that we have one job per quantization method and device.
You can see that everything works fine from
cc @ydshieh @younesbelkada