Post Quantization for nllb-models #19

Open
Arnab1181412 opened this issue Jun 23, 2023 · 1 comment

Comments

@Arnab1181412

Hi @Vahe1994,

I have fine-tuned Facebook's NLLB model on my custom dataset for language translation. Could you provide a guideline on how to perform SpQR quantization of this fine-tuned model? Specifically, I am interested in post-training quantization methodologies.

Thanks in advance, and great work on implementing SpQR!

@Vahe1994
Owner

Vahe1994 commented Jul 5, 2023

Hello!
Sorry for the late answer. Unfortunately, we did not try the SpQR technique on encoder-decoder models. While this is speculative on my part, I believe that since SpQR (like GPTQ) performs quantization per layer, the encoder component of an encoder-decoder model would require minimal changes to be compatible with SpQR (such as adjusting namings and potentially caching, as seen in this code snippet: https://github.com/Vahe1994/SpQR/blob/1c27ed6294d31f8f508ef02f95fb2bac0337d0a6/main.py#L114C46-L114C47). However, the decoder component would need to store the last activation from the encoder in order to calculate the inputs and outputs of the linear layers in the decoder blocks. If you have the input, output, and weights, you can run the SpQR engine on the layer. Therefore, the main part that requires modification is main.py, where you need to retrieve the input and output for each linear layer that you want to quantize.
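For illustration only, here is a minimal sketch of how you might capture the per-layer inputs of a decoder block using PyTorch forward hooks, which is the kind of data a per-layer quantizer needs. This is not code from this repo; the checkpoint name (facebook/nllb-200-distilled-600M), the choice of the first decoder block, and the hook wiring are all assumptions for the sketch:

```python
# Hypothetical sketch: capture the inputs to each nn.Linear in one NLLB
# decoder block, so they could later be fed to a per-layer quantizer
# (e.g. the SpQR engine). Assumes the Hugging Face `transformers` library;
# the checkpoint name and hook wiring are illustrative, not from SpQR.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "facebook/nllb-200-distilled-600M"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
model.eval()

captured = {}  # maps layer name -> list of input activations

def make_hook(name):
    def hook(module, inputs, output):
        # inputs[0] is the activation entering this nn.Linear; together
        # with the layer's weight, this is what per-layer quantization needs.
        captured.setdefault(name, []).append(inputs[0].detach())
    return hook

# Register hooks on every linear layer of the first decoder block.
decoder_block = model.model.decoder.layers[0]
handles = [
    module.register_forward_hook(make_hook(name))
    for name, module in decoder_block.named_modules()
    if isinstance(module, torch.nn.Linear)
]

# One forward pass over a calibration sentence. The decoder is given the
# cached encoder activations (encoder_hidden_states), so the inputs to the
# cross-attention linear layers are captured as well.
batch = tokenizer("a small calibration sentence", return_tensors="pt")
with torch.no_grad():
    enc_out = model.model.encoder(**batch)
    model.model.decoder(
        input_ids=batch["input_ids"],
        encoder_hidden_states=enc_out.last_hidden_state,
    )

for h in handles:
    h.remove()
print({name: acts[0].shape for name, acts in captured.items()})
```

In a real calibration loop you would repeat this over a calibration set and hand each layer's captured inputs, together with its weight, to the quantization engine, block by block.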

You can take a look at the T5 example for GPTQ for reference: https://github.com/qwopqwop200/GPTQ-for-LLaMa/blob/t5/t5.py . If you encounter any problems, please let us know and we will try to help you.
