Issue load Llama tokenizer #20

Closed
NanoCode012 opened this issue May 8, 2023 · 5 comments

Comments

@NanoCode012
Collaborator

Hello, I'm getting a weird issue loading the tokenizer. I've checked that the line of code hasn't changed, even on my latest pull. The only difference could be that the transformers source changed something.

https://github.com/winglian/axolotl/blob/7576d85c735e307fa1dbbcb8e0cba8b53bb1fa48/src/axolotl/utils/models.py#L138-L139

Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00,  7.88it/s]
Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 4096, padding_idx=0)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaAttention(
          (q_proj): Linear8bitLt(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear8bitLt(in_features=4096, out_features=4096, bias=False)
          (v_proj): Linear8bitLt(in_features=4096, out_features=4096, bias=False)
          (o_proj): Linear8bitLt(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear8bitLt(in_features=4096, out_features=11008, bias=False)
          (down_proj): Linear8bitLt(in_features=11008, out_features=4096, bias=False)
          (up_proj): Linear8bitLt(in_features=4096, out_features=11008, bias=False)
          (act_fn): SiLUActivation()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
)'.
Traceback (most recent call last):
  File "/workspace/src/axolotl/utils/models.py", line 140, in load_model
    tokenizer = LlamaTokenizer.from_pretrained(model)
@NanoCode012
Collaborator Author

I can just use base_model_config, but I'm curious why this method is failing.
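Looking at the error, it seems from_pretrained is receiving the loaded model object rather than a string, so its repr ends up being validated as a repo id and rejected. Roughly what I mean (the repo id below is just a placeholder):

from transformers import LlamaTokenizer

# Failing pattern: `model` here is the already-loaded LlamaForCausalLM, so its
# repr gets validated as a repo id, producing the error shown above.
# tokenizer = LlamaTokenizer.from_pretrained(model)

# Workaround: pass a repo id or local path string instead, e.g. the value of
# base_model_config (placeholder repo id shown here).
tokenizer = LlamaTokenizer.from_pretrained("huggyllama/llama-7b")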

@winglian
Collaborator

winglian commented May 8, 2023

are you pointing to a locally downloaded model? I've seen this issue when that's the case.

@NanoCode012
Collaborator Author

NanoCode012 commented May 8, 2023

are you pointing to a locally downloaded model? I've seen this issue when that's the case.

No, I'm pointing to the HuggingFace repo, but I have it cached locally.

export HF_DATASETS_CACHE="/workspace/data/huggingface-cache/datasets"
export HUGGINGFACE_HUB_CACHE="/workspace/data/huggingface-cache/hub"

The thing is, it was working until I merged the latest pulls. Although there were over 31 commits in just a few days, none of them touched that line.

@winglian
Collaborator

winglian commented May 8, 2023

what are you using as your config for base_model and base_model_config?

@NanoCode012
Collaborator Author

what are you using as your config for base_model and base_model_config?

I pointed base_model to an HF repo and left base_model_config empty.

In the past, this worked. However, for some reason, it no longer does, despite that line not having changed. I have now changed the config to point both to the same path, which works.
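In other words, something like this in the config (placeholder repo id):

base_model: huggyllama/llama-7b
base_model_config: huggyllama/llama-7b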
