SW in LLaMA2-7B #2

Open · kiucho opened this issue Nov 19, 2024 · 1 comment
kiucho commented Nov 19, 2024

Thank you for sharing your excellent research.

I attempted to implement the identification of super weights (SW) in LLaMA2-7B and have a question. When analyzing the input and output of mlp.down_proj, I observed the following graph, which suggests the presence of an SW in layer 1: the input value was approximately 1400, but the output exceeded 2000.
[figure: per-layer maximum input/output activation magnitudes of mlp.down_proj]

Upon investigation, I identified the indices of the SW as [2533, 7890], which aligns with the findings reported in your paper.

Next, I generated another graph after removing the SW at [2533, 7890] from the mlp.down_proj of layer 1. This resulted in the following graph:
[figure: the same plot after removing the SW at layer 1]

This led me to wonder whether there might be additional SW. For example, in layer 30, the input value was around 10,000, while the output dropped to about -17,500. However, according to your paper, the only identified SW is at [2533, 7890] in the mlp.down_proj of layer 1.
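For reference, here is roughly what my detection and removal code looks like. This is a minimal sketch rather than my full script: the checkpoint name, the calibration prompt, and the use of forward hooks on mlp.down_proj are my own choices, and I simply take the per-layer argmax of the input/output magnitudes as the candidate SW coordinates.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumption: the HF checkpoint I used
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=dtype).to(device)
model.eval()

stats = {}  # layer index -> (max |input|, max |output|, row, col)

def make_hook(layer_idx):
    def hook(module, inputs, output):
        x = inputs[0].detach()   # down_proj input:  [batch, seq, intermediate=11008]
        y = output.detach()      # down_proj output: [batch, seq, hidden=4096]
        # col = intermediate channel carrying the spiking input,
        # row = hidden channel carrying the spiking output;
        # the candidate SW is down_proj.weight[row, col].
        col = x.abs().amax(dim=(0, 1)).argmax().item()
        row = y.abs().amax(dim=(0, 1)).argmax().item()
        stats[layer_idx] = (x.abs().max().item(), y.abs().max().item(), row, col)
    return hook

handles = [
    layer.mlp.down_proj.register_forward_hook(make_hook(i))
    for i, layer in enumerate(model.model.layers)
]

prompt = "The quick brown fox jumps over the lazy dog."  # assumption: any short prompt
with torch.no_grad():
    model(**tok(prompt, return_tensors="pt").to(device))
for h in handles:
    h.remove()

for i in sorted(stats):
    in_max, out_max, row, col = stats[i]
    print(f"layer {i:2d}  max|in|={in_max:9.1f}  max|out|={out_max:9.1f}  "
          f"candidate [row={row}, col={col}]")

# Removing the layer-1 SW, as in the second plot:
with torch.no_grad():
    model.model.layers[1].mlp.down_proj.weight[2533, 7890] = 0.0
```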

Could you please provide the exact algorithm for identifying SW?

mengxiayu (Owner) commented
We had the same question and tried to identify SW in late layers. We considered two hypotheses: (1) there are late-layer SW; (2) early-layer SW and late-layer SW appear as a pair (i.e., removing both of them doesn't hurt model quality).

For Llama-7B, we were unable to identify a single weight, or a small set of weights, that eliminates the spikes you showed; the results suggest it might be a whole column of weights. For OLMo-7B, we were able to identify a few outlier weights that eliminate the spikes. However, they didn't have as significant an impact on model quality as the early-layer SW, so we decided not to include them as super weights.
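As a concrete illustration, the pairing test in hypothesis (2) could be run with something like the sketch below. This is a sketch under stated assumptions, not the exact procedure from the paper: `LATE_ROW` / `LATE_COL` are placeholders for coordinates you find yourself (e.g., the layer-30 argmax from the detection sketch above), and model quality would be compared separately, e.g. via perplexity on held-out text.

```python
import torch
from transformers import AutoModelForCausalLM

# assumption: same checkpoint as in the detection sketch above
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

def zero_down_proj_weight(model, layer_idx, row, col):
    """Zero a single entry of mlp.down_proj.weight in the given decoder layer."""
    with torch.no_grad():
        model.model.layers[layer_idx].mlp.down_proj.weight[row, col] = 0.0

# Hypothesis (1): zero only a late-layer candidate and re-check the late-layer spike.
# zero_down_proj_weight(model, 30, LATE_ROW, LATE_COL)   # placeholders

# Hypothesis (2): zero the early-layer SW together with the late-layer candidate,
# rerun the hooked forward pass, and compare model quality against the unmodified model.
zero_down_proj_weight(model, 1, 2533, 7890)               # early-layer SW (layer 1)
# zero_down_proj_weight(model, 30, LATE_ROW, LATE_COL)    # placeholders
```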
