
Sparsegpt #374

Open

wants to merge 2 commits into main

Commits on Jun 3, 2023

  1. Sparse GPT Algorithm

    This is the SparseGPT code, based on the IST-DASLab project.

    I followed the same coding principles used in the lit-llama GPTQ code.

    I created a file called sparsification, which implements the SparseGPT algorithm, and a script sparsify/sparsegpt.py (in a new sparsify folder) to run it on the model at checkpoint_path (a conceptual sketch of the pruning step follows this commit's details).

    This is my first contribution to the project; if I missed any housekeeping or admin steps, I apologise in advance.
    
    Key Notes:

    1. The SparseGPT source code includes a quantization algorithm similar to GPTQ; however, I removed that code because we already have GPTQ in the lit-llama source code.

    2. I'm still on the waiting list for the 7B Llama weights.
    Motsepe-Jr committed Jun 3, 2023
    Commit c2dff33
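
    For reviewers unfamiliar with the method, below is a minimal, self-contained sketch of the layer-wise pruning step that SparseGPT performs (unstructured sparsity, with all columns handled as a single block). It is illustrative only and not the code in this commit: the function and argument names are made up, and the real implementation processes columns in blocks and shares its Hessian machinery with GPTQ. The key idea is that each weight's pruning cost is weighted by the inverse Hessian of the layer's inputs, and the columns that remain are updated to compensate for every zeroed weight.

    import torch

    def prune_layer_sparsegpt(weight: torch.Tensor,
                              calib_inputs: torch.Tensor,
                              sparsity: float = 0.1,
                              percdamp: float = 0.01) -> torch.Tensor:
        """Conceptual sketch of SparseGPT pruning for one linear layer.

        weight:       (out_features, in_features)
        calib_inputs: (n_tokens, in_features) calibration activations seen by this layer
        sparsity:     fraction of weights to zero out
        """
        W = weight.clone().float()
        X = calib_inputs.float()

        # Hessian proxy H = X^T X over the calibration set, dampened for stability.
        H = X.t() @ X
        damp = percdamp * torch.mean(torch.diag(H))
        H += damp * torch.eye(H.shape[0], device=H.device)

        # Upper Cholesky factor of H^{-1}, as in the reference implementation.
        Hinv = torch.linalg.cholesky(
            torch.cholesky_inverse(torch.linalg.cholesky(H)), upper=True
        )

        # Saliency of zeroing each weight: w^2 / [Hinv]_jj^2; prune the lowest fraction.
        diag = torch.diag(Hinv).reshape(1, -1)
        saliency = W ** 2 / diag ** 2
        thresh = torch.sort(saliency.flatten())[0][int(saliency.numel() * sparsity)]
        mask = saliency <= thresh  # True = prune this weight

        # Column by column: zero the masked weights and spread the resulting error
        # over the not-yet-processed columns so the layer output changes as little as possible.
        for j in range(W.shape[1]):
            w = W[:, j].clone()
            q = w.clone()
            q[mask[:, j]] = 0.0
            err = (w - q) / Hinv[j, j]
            W[:, j:] -= err.unsqueeze(1) @ Hinv[j, j:].unsqueeze(0)
            W[:, j] = q
        return W.to(weight.dtype)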

Commits on Jun 8, 2023

  1. sparseGPT

    This is the SparseGPT code, based on the IST-DASLab project.

    I followed the same coding principles used in the lit-llama GPTQ code.

    I created a file called sparsification, which implements the SparseGPT algorithm, and a script sparsify/sparsegpt.py (in a new sparsify folder) to run it on the model at checkpoint_path.

    This is my first contribution to the project; if I missed any housekeeping or admin steps, I apologize in advance.
    
    Assuming you have a model under checkpoints/open-llama/7B,
    
    you can run this command:
    python sparsify/sparsegpt.py --checkpoint_path checkpoints/lit-llama/7B/lit-llama.pth
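
    As a quick sanity check after the script finishes, you can load the resulting checkpoint and report how many weights were actually zeroed. This snippet is illustrative and not part of the PR; adjust the path to wherever the pruned weights are written.

    import torch

    # Report the fraction of exactly-zero entries in each 2-D weight tensor.
    state_dict = torch.load("checkpoints/lit-llama/7B/lit-llama.pth", map_location="cpu")
    for name, tensor in state_dict.items():
        if torch.is_tensor(tensor) and tensor.ndim == 2:
            frac_zero = (tensor == 0).float().mean().item()
            print(f"{name}: {frac_zero:.1%} zeros")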
    
    Key Notes:

    0. I used half the n_samples (128 → 64) due to memory requirements (see the calibration sketch after this list).

    1. The SparseGPT paper evaluated models that were not trained according to the Chinchilla scaling law (my hypothesis is that some of those models' weights were not useful, which is why the authors were able to prune 50%). With Llama I only used a 0.1 target sparsity.

    2. The SparseGPT source code includes a quantization algorithm similar to GPTQ; however, I removed that code because we already have GPTQ in the lit-llama source code. If you would like me to include it, that is also fine: I can add GPTQ under the SparseGPT code.
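
    To make the n_samples note above concrete, here is a generic sketch of how calibration inputs for a single linear layer are typically captured with a forward hook before pruning. The names are illustrative and the PR's script may gather them differently.

    import torch
    import torch.nn as nn

    def collect_layer_inputs(model: nn.Module, layer: nn.Linear,
                             calib_batches, n_samples: int = 64):
        """Run up to n_samples calibration batches through the model and record
        the inputs that reach `layer`, flattened to (tokens, in_features).
        These rows are what the Hessian in the pruning step is built from."""
        captured = []

        def hook(_module, inputs, _output):
            x = inputs[0].detach()
            captured.append(x.reshape(-1, x.shape[-1]))

        handle = layer.register_forward_hook(hook)
        with torch.no_grad():
            for i, batch in enumerate(calib_batches):
                if i >= n_samples:
                    break
                model(batch)
        handle.remove()
        return torch.cat(captured, dim=0)

    Halving n_samples from 128 to 64 halves both the calibration forward passes and the activations held in memory, at the cost of a noisier Hessian estimate.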
    
    Before merging, please also test on your side, and let me know if you want me to fix any bugs or integrate a specific feature.
    
    Thanks
    Motsepe-Jr committed Jun 8, 2023
    Commit a492ad0