Sparsegpt #374

Motsepe-Jr · 2023-06-08T19:37:10Z

This is the SparseGPT code, based on the IST-DASLab project.

I followed the same coding principles as used in the lit-llama gptq code.

I created a file called sparsification, which contains the SparseGPT algorithm, and a script, sparsify/sparsegpt.py, which runs the algorithm on the model at checkpoint_path.

This is my first contribution to the project. If I missed any housekeeping steps, I apologize in advance.

Assuming you have a model under checkpoints/open-llama/7B, you can run this command:
python sparsify/sparsegpt.py --checkpoint_path checkpoints/lit-llama/7B/lit-llama.pth

Key Notes:

I halved n_samples (128 --> 64) because of memory requirements.
The SparseGPT paper evaluated models that were not trained according to the Chinchilla scaling law. My hypothesis is that some of the weights in those models were not useful, which is why the authors were able to prune 50% of them. With Llama I used a target sparsity of only 0.1.
The SparseGPT source code includes a quantization algorithm similar to GPTQ. I removed that code because GPTQ already exists in the lit-llama source code. If you would like it included, I am happy to add GPTQ under the SparseGPT code.
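To make the target-sparsity figures in the notes above concrete, here is a minimal sketch of what a sparsity of 0.1 versus 0.5 means for a weight row. It uses plain magnitude pruning as a stand-in, which is not the SparseGPT algorithm (SparseGPT selects weights using second-order information from calibration samples and updates the remaining weights). The function names and the toy weights are hypothetical and are not taken from this PR.

```python
def prune_row_by_magnitude(row, sparsity):
    """Return a copy of `row` with the smallest-magnitude entries set to zero.

    `sparsity` is the fraction of entries to zero out (0.1 means 10%).
    This is simple magnitude pruning, used here only for illustration.
    """
    n_prune = int(len(row) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(row)), key=lambda i: abs(row[i]))
    pruned = list(row)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def measured_sparsity(rows):
    """Fraction of entries across all rows that are exactly zero."""
    total = sum(len(r) for r in rows)
    zeros = sum(1 for r in rows for v in r if v == 0.0)
    return zeros / total


if __name__ == "__main__":
    # Toy weight matrix with a single row of ten nonzero values (assumed data).
    weights = [[0.5, -0.1, 0.9, 0.05, -0.7, 0.3, -0.2, 0.8, 0.6, -0.4]]
    for target in (0.1, 0.5):
        pruned = [prune_row_by_magnitude(r, target) for r in weights]
        print(target, measured_sparsity(pruned), pruned[0])
```

At a target of 0.1 only the single smallest weight in the toy row is removed, while at 0.5 half of the row is zeroed, which illustrates why the lower target is the more conservative choice.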

Before you commit, please also test on your side, and let me know if you want me to fix any bugs or integrate a specific feature.

Thanks

Motsepe-Jr added 2 commits June 3, 2023 16:03

Motsepe-Jr requested review from awaelchli, carmocca and lantiga as code owners June 8, 2023 19:37
