
Improved quantize script #222

Merged 9 commits on Mar 19, 2023
Conversation

SuajCarrot
Contributor
I improved the quantize script by adding error handling and allowing multiple models to be selected for quantization at once on the command line. I also converted it to Python for generality and extensibility.
@gjmulder gjmulder added the enhancement New feature or request label Mar 17, 2023
quantize.py Outdated

model_path = os.path.join("models", model, "ggml-model-f16.bin")

for i in os.listdir(model_path):

there's another PR about parallelizing the quantizations. here it would be easy just to wrap this in a multiprocessing.Pool() (with the subprocess calls extracted out to a top-level function): https://docs.python.org/3/library/multiprocessing.html
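The wrapping suggested above could be sketched as follows; the binary name, the `"2"` type argument, and the helper names are illustrative stand-ins for the script's actual invocation, not the PR's code:

```python
import subprocess
from multiprocessing import Pool

def q4_path(f16_path: str) -> str:
    # Derive the quantized output filename from the f16 input (illustrative naming).
    return f16_path.replace("f16", "q4_0")

def quantize_one(f16_path: str) -> int:
    # Top-level function so multiprocessing can pickle it for worker processes.
    cmd = ["./quantize", f16_path, q4_path(f16_path), "2"]
    return subprocess.run(cmd).returncode

def quantize_all(f16_paths, workers=None):
    # One quantization per worker; Pool defaults to os.cpu_count() processes.
    with Pool(processes=workers) as pool:
        return pool.map(quantize_one, f16_paths)
```

Calling `quantize_all(["models/7B/ggml-model-f16.bin", "models/13B/ggml-model-f16.bin"])` would then run both quantizations concurrently.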

Contributor Author

I don't understand exactly what can be parallelized here; is it quantizing many models at the same time? By the way, the latest commit removed the loop that acted on the incorrect listing of the file (for i in os.listdir(model_path)).

Fixed and improved many things in the script based on the reviews made by @mattsta. The parallelization suggestion is still to be revised, but commented-out code for it has been added.
The original Bash script uses a glob pattern to match files with endings such as ...bin.0, ...bin.1, etc.; this has now been translated correctly to Python.
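That kind of multi-part matching can be sketched with the standard glob module; the helper name and directory layout here are illustrative:

```python
import glob
import os

def find_model_parts(model_dir: str) -> list:
    # Matches ggml-model-f16.bin as well as split parts such as
    # ggml-model-f16.bin.0, ggml-model-f16.bin.1, like the shell glob did.
    pattern = os.path.join(model_dir, "ggml-model-f16.bin*")
    return sorted(glob.glob(pattern))
```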
@sw
Contributor

sw commented Mar 19, 2023

I think it's a good idea to remove the requirement for a Unix shell.
See also #285, which this would make obsolete, assuming the Python script works on Windows.

I suggest removing quantize.sh and updating the readme to use the Python script.

New code has been added to set the name of the quantize binary depending on the platform (quantize.exe when on Windows), and the README.md file has been updated to use this script instead of the Bash one.
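A minimal sketch of such a platform check (the function name is illustrative; the script's actual variable names may differ):

```python
import sys

def quantize_binary_name() -> str:
    # Windows builds produce quantize.exe; elsewhere the binary is ./quantize.
    return "quantize.exe" if sys.platform == "win32" else "./quantize"
```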
@SuajCarrot
Contributor Author

Thank you for your comment, @sw; I just pushed a commit that applies the changes you suggested. Let me know if there's anything else that should be done to achieve full compatibility with Windows.

Fixed a typo in the new filenames of the quantized models and removed the shell=True parameter from the subprocess.run call, as it was conflicting with the list of arguments.
This was causing the automatic help message to present the program's usage literally as "$ Quantization Script [arguments]"; it should now read something like "$ python3 quantize.py [arguments]".
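The conflict comes from how subprocess.run treats a sequence when shell=True: on POSIX, only the first list element is used as the command string and the remaining elements become arguments to the shell itself, so they are silently dropped. With the default shell=False, each list element is passed as one argv entry. A minimal illustration, with echo standing in for the quantize binary:

```python
import subprocess

# With the default shell=False, every list element is one argv entry.
result = subprocess.run(["echo", "hello", "world"],
                        capture_output=True, text=True)
print(result.stdout.strip())  # hello world
```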
@ggerganov
Member

Should we merge now or wait for someone to test on Windows?

@SuajCarrot maybe keep the .sh for now and add a comment that it is deprecated. We will remove it later.

@ggerganov ggerganov merged commit 7392f1c into ggml-org:master Mar 19, 2023
@SuajCarrot
Contributor Author

Thank you for merging! Should I create another pull request with the Bash script added back as well as its deprecation notice in the README?

@ggerganov
Member

It was confirmed in #285 that it works on Windows, so there's no need to.

@SuajCarrot
Contributor Author

That's great, thank you.

Labels: enhancement (New feature or request)
5 participants