Can not reproduce claim that instruction fine-tuning does not change position of super weight #4

fxmarty-amd · 2024-11-29T09:58:33Z

I tried the following script, which zeroes a single candidate super weight and then generates greedily:

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
import torch

# model_id = "huggyllama/llama-7B"
model_id = "mistralai/Mistral-7B-Instruct-v0.1"
# model_id = "meta-llama/Llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)

with torch.device("cuda"):
    model = AutoModelForCausalLM.from_pretrained(model_id)

model = model.eval()

inp = tokenizer("Summer is hot, winter is ", return_tensors="pt").to("cuda")

gen_config = GenerationConfig(
    max_new_tokens=100,
    min_new_tokens=100,
    use_cache=True,
    num_beams=1,
    do_sample=False,
)

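with torch.no_grad():
    # Zero the candidate super weight for the selected model.
    # Indices are [output row, input column] of down_proj.weight.
    # model.model.layers[1].mlp.down_proj.weight[2533, 7890] = 0  # llama 2 7B
    model.model.layers[1].mlp.down_proj.weight[2070, 7310] = 0  # mistral 7B v0.1
    # model.model.layers[2].mlp.down_proj.weight[3968, 7003] = 0  # llama 1 7B

    # Greedy decoding of exactly 100 new tokens.
    res = model.generate(**inp, generation_config=gen_config)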

print(tokenizer.batch_decode(res))

Zeroing the weight this way causes no qualitative deterioration of the output for mistralai/Mistral-7B-Instruct-v0.1 or meta-llama/Llama-2-7b-chat-hf.
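For a side-by-side comparison, it may help to generate once with the weight intact and once with it zeroed in the same session. A minimal sketch, assuming a hypothetical helper named `zeroed_weight` (not part of transformers or the paper's code) that zeroes one entry and restores it afterwards:

```python
from contextlib import contextmanager

import torch


@contextmanager
def zeroed_weight(weight: torch.Tensor, row: int, col: int):
    """Temporarily set weight[row, col] to 0, restoring it on exit.

    Hypothetical helper for illustration only.
    """
    with torch.no_grad():
        original = weight[row, col].clone()  # keep the value to restore later
        weight[row, col] = 0
    try:
        yield
    finally:
        with torch.no_grad():
            weight[row, col] = original
```

With the script above, `model.generate(**inp, generation_config=gen_config)` could be called once outside and once inside `with zeroed_weight(model.model.layers[1].mlp.down_proj.weight, 2070, 7310):`, and the two decoded strings compared directly.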

I did reproduce the paper's claim with the base models huggyllama/llama-7B and mistralai/Mistral-7B-v0.1 (I did not try meta-llama/Llama-2-7b).
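One possible way to check whether the coordinates differ between a base and an instruction-tuned checkpoint is to look for the largest-magnitude activation at the input and output of each `down_proj` on a single forward pass. The sketch below rests on an assumption, not a confirmed method from the paper's code: that a super weight at `weight[row, col]` shows up as a spike in output channel `row` and input channel `col`. The function name `find_down_proj_spikes` is hypothetical.

```python
import torch
from torch import nn


def find_down_proj_spikes(model: nn.Module, inputs) -> list[dict]:
    """For every module whose name ends in 'down_proj', record the channel
    with the largest absolute input and output activation (one forward pass).

    Assumption: the spike channels point at the candidate weight[row, col].
    """
    records = []
    handles = []

    def make_hook(name):
        def hook(module, args, output):
            x = args[0].detach().abs().float()
            y = output.detach().abs().float()
            # Reduce over every dimension except the last (channel) one.
            x_max = x.reshape(-1, x.shape[-1]).max(dim=0).values
            y_max = y.reshape(-1, y.shape[-1]).max(dim=0).values
            records.append({
                "name": name,
                "row": int(y_max.argmax()),  # output channel -> weight row
                "col": int(x_max.argmax()),  # input channel -> weight column
                "out_max": float(y_max.max()),
                "in_max": float(x_max.max()),
            })
        return hook

    for name, module in model.named_modules():
        if name.endswith("down_proj"):
            handles.append(module.register_forward_hook(make_hook(name)))
    try:
        with torch.no_grad():
            model(**inputs)
    finally:
        # Always detach the hooks so later calls are unaffected.
        for handle in handles:
            handle.remove()
    return records
```

Calling `find_down_proj_spikes(model, inp)` on both the base and the instruct checkpoint and comparing the entries with the largest `out_max` would show whether the candidate `(layer, row, col)` is the same in both.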

Do you have reference code or a specific model checkpoint that I could use to reproduce this?

Thank you!
