gguf: Fix special vocab handling when id < 0 #2984

KerfuffleV2 · 2023-09-03T07:44:44Z

The special vocab handling in gguf didn't cover the case where the id for a special token was set to -1 in config.json (presumably to disable it).

This also handles the case where an entry in added_tokens in tokenizer_config.json looks like:

    {
      "id": -1,
      "content": "<|PAD|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    }

It doesn't seem likely that this will ever actually occur, but might as well check. This assumes -1 means "disable". I don't know if that's 100% correct, but it seems reasonable?
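To make the intended behaviour concrete, here is a minimal sketch of the check described above, assuming a negative id means "disabled". It is not the actual gguf-py code: the helper names (`valid_token_id`, `special_ids_from_config`, `id_from_added_tokens`) and the list of token types are hypothetical, chosen only for illustration.

```python
import json
from typing import Any, Optional

# Hypothetical list of special token types; not taken from gguf-py.
SPECIAL_TOKEN_TYPES = ("bos", "eos", "unk", "sep", "pad")


def valid_token_id(value: Any) -> Optional[int]:
    """Return value if it is a non-negative int, else None (treated as disabled)."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return value
    return None


def special_ids_from_config(config: dict) -> dict:
    """Collect special token ids from a config.json-style dict, skipping id < 0."""
    found = {}
    for typ in SPECIAL_TOKEN_TYPES:
        token_id = valid_token_id(config.get(f"{typ}_token_id"))
        if token_id is not None:
            found[typ] = token_id
    return found


def id_from_added_tokens(added_tokens: list, content: str) -> Optional[int]:
    """Look up a token by content in an added_tokens list, ignoring negative ids."""
    for entry in added_tokens:
        if isinstance(entry, dict) and entry.get("content") == content:
            return valid_token_id(entry.get("id"))
    return None


# Assumption: -1 means "no such token", so the pad token is simply left out.
config = json.loads('{"bos_token_id": 1, "eos_token_id": 2, "pad_token_id": -1}')
added = [{"id": -1, "content": "<|PAD|>", "special": True}]

print(special_ids_from_config(config))         # {'bos': 1, 'eos': 2}
print(id_from_added_tokens(added, "<|PAD|>"))  # None
```

Dropping the token rather than raising an error matches the "-1 means disable" reading; if -1 turns out to mean something else for a given model, this sketch would silently hide that.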

Closes #2981

KerfuffleV2 · 2023-09-03T10:44:19Z

According to #2896, someone would also need to create a gguf-0.3.2 tag here in the main repo to get the Python package published. (I'm not sure I have that capability, and I'm scared to mess with it for fear of breaking something.)

gguf: Fix special vocab handling when id < 0

c3d3ea9

KerfuffleV2 added the bug (Something isn't working) and script (Script related) labels Sep 3, 2023

klosax approved these changes Sep 3, 2023

KerfuffleV2 merged commit 6519e9c into ggerganov:master Sep 3, 2023

cebtenzzre added a commit to cebtenzzre/llama.cpp that referenced this pull request Oct 12, 2023

gguf-py : revert ggerganov#2984 handling of negative token IDs

11a8e3e

This reverts commit 6519e9c.

cebtenzzre mentioned this pull request Oct 12, 2023

pad_token_id of -1 is not understood by convert.py #2981

Closed

KerfuffleV2 deleted the fix-gguf-neg-special-token-id branch November 17, 2023 03:13
