gguf: Fix special vocab handling when id < 0 #2984

Merged

KerfuffleV2 (Collaborator) commented Sep 3, 2023

The special vocab handling in gguf didn't cover the case where the id for a special token is set to -1 in config.json (presumably to disable that token).

This also handles the case where an entry in added_tokens in tokenizer.json looks like

    {
      "id": -1,
      "content": "<|PAD|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    }

That doesn't seem likely to ever actually occur, but we might as well check. This assumes -1 means "disable". I don't know if that's 100% correct but it seems reasonable?
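For illustration, here is a minimal sketch of the "negative id means disabled" rule described above. It is not the code from this PR: the helper names (`valid_token_id`, `special_ids_from_config`, `added_token_id`) and the list of token types are hypothetical, chosen only to show the check for both the config.json case and the `added_tokens` case.

```python
from typing import Any, Dict, List, Optional

# Hypothetical list of special token types, used only for this sketch.
SPECIAL_TOKEN_TYPES = ("bos", "eos", "unk", "sep", "pad")


def valid_token_id(value: Any) -> Optional[int]:
    """Return value if it is a non-negative int, otherwise None (disabled)."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return None
    return value if value >= 0 else None


def special_ids_from_config(config: Dict[str, Any]) -> Dict[str, int]:
    """Collect `<type>_token_id` entries from a config.json-style dict."""
    found: Dict[str, int] = {}
    for typ in SPECIAL_TOKEN_TYPES:
        token_id = valid_token_id(config.get(f"{typ}_token_id"))
        if token_id is not None:  # missing or negative ids are skipped
            found[typ] = token_id
    return found


def added_token_id(added_tokens: List[Dict[str, Any]], content: str) -> Optional[int]:
    """Look up a token by content in an added_tokens list, ignoring ids < 0."""
    for entry in added_tokens:
        if entry.get("content") == content:
            return valid_token_id(entry.get("id"))
    return None


if __name__ == "__main__":
    config = {"bos_token_id": 1, "eos_token_id": 2, "pad_token_id": -1}
    print(special_ids_from_config(config))  # {'bos': 1, 'eos': 2}

    added = [{"id": -1, "content": "<|PAD|>", "special": True}]
    print(added_token_id(added, "<|PAD|>"))  # None
```

With this rule, a `pad_token_id` of -1 is simply left out of the special vocab rather than being written as an invalid id.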

Closes #2981

KerfuffleV2 added the bug and script labels on Sep 3, 2023
KerfuffleV2 merged commit 6519e9c into ggerganov:master on Sep 3, 2023
KerfuffleV2 (Collaborator, Author) commented Sep 3, 2023

According to #2896, someone would also need to create a gguf-0.3.2 tag here in the main repo to get the Python package published. (I'm not sure I have that capability, and I'm scared to mess with it for fear of breaking something.)

cebtenzzre added a commit to cebtenzzre/llama.cpp that referenced this pull request Oct 12, 2023
KerfuffleV2 deleted the fix-gguf-neg-special-token-id branch on November 17, 2023 03:13
Labels
bug, script
Development

Successfully merging this pull request may close these issues.

pad_token_id of -1 is not understood by convert.py