This could be done with a stream output mode that returns a single JSONL line per output token. This would allow including detailed info about each token, and would make it much easier for front-ends to work with the output stream in more detailed ways. For example:

```
{
  "id": 413,
  "string": "turn",
  "top_picks": [
    { ... token id, string, and probability for each candidate ... }
  ]
}
```

This would allow a bunch of things. I think it'd be really interesting.
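A minimal sketch of what consuming such a stream could look like, assuming the hypothetical JSONL schema above (the `id`, `string`, and `top_picks` field names come from the example, not from any existing koboldcpp API):

```python
import json

def read_token_stream(lines):
    """Parse one hypothetical JSONL record per generated token."""
    for line in lines:
        record = json.loads(line)
        token_id = record["id"]      # sampled token id
        text = record["string"]      # decoded token text
        # top_picks: the top candidates considered at this step,
        # each with its own id, string, and probability
        alternatives = record.get("top_picks", [])
        yield token_id, text, alternatives

# Fabricated two-token stream matching the schema above:
stream = [
    '{"id": 413, "string": " turn", "top_picks": [{"id": 413, "string": " turn", "prob": 0.62}, {"id": 287, "string": " in", "prob": 0.21}]}',
    '{"id": 13, "string": ".", "top_picks": [{"id": 13, "string": ".", "prob": 0.88}]}',
]
for tid, text, picks in read_token_stream(stream):
    print(tid, repr(text), [(p["string"], p["prob"]) for p in picks])
```

A front-end consuming this could render the sampled token normally and show the alternatives on hover, similar to what the NovelAI screenshot below demonstrates.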
-
What you're asking for is basically returning token logprobs, which is a desired feature, but I haven't had the time to prioritize it: https://platform.openai.com/docs/api-reference/chat/create#chat-create-logprobs. It's on the low-priority backlog.
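For reference, the linked OpenAI parameter works roughly as sketched below; this shows the OpenAI request/response shape only, not anything koboldcpp currently exposes, and the model name and API key are placeholders:

```python
import math
import requests

# Request per-token logprobs plus the top 5 alternatives at each position.
resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Say hello"}],
        "logprobs": True,   # attach log probabilities to each output token
        "top_logprobs": 5,  # also return the 5 most likely alternatives
    },
)

# Each entry in logprobs.content covers one output token.
for entry in resp.json()["choices"][0]["logprobs"]["content"]:
    print(entry["token"], math.exp(entry["logprob"]))  # logprob -> probability
    for alt in entry["top_logprobs"]:
        print("   alt:", alt["token"], math.exp(alt["logprob"]))
```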
-
This may be too "blue sky", but would it be possible to return all the discarded tokens from a generation, in some form of multidimensional array, via the API?
The aim would be to let the user view, and possibly regenerate from, a discarded token.
An example of this from NovelAI:
https://github.com/LostRuins/koboldcpp/assets/84193813/884cbe47-e5ec-4373-bc50-59bb54d4f39c
My naive idea would be to store all the candidate tokens generated (`candidates->data`??) when running `sample_tokens`, and then expose that through a new `generate_fulltokens` endpoint.
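One way the storage side could look, sketched in Python rather than in the C++ sampler itself. Everything here is hypothetical: `CandidateStore`, its field names, and the `generate_fulltokens` payload shape are illustrations, and the per-step snapshot stands in for whatever `candidates->data` holds at the point the comment describes:

```python
from dataclasses import dataclass, field

@dataclass
class Candidate:
    token_id: int
    text: str
    prob: float

@dataclass
class CandidateStore:
    """Hypothetical per-generation record of every candidate considered."""
    steps: list[list[Candidate]] = field(default_factory=list)

    def record_step(self, candidates: list[Candidate]) -> None:
        # Called once per sampled token with a snapshot of the sampler's
        # candidate list, sorted most-likely first.
        self.steps.append(sorted(candidates, key=lambda c: c.prob, reverse=True))

    def to_payload(self) -> list[list[dict]]:
        # Shape a hypothetical "generate_fulltokens" response:
        # one inner list of alternatives per generated token.
        return [
            [{"id": c.token_id, "string": c.text, "prob": c.prob} for c in step]
            for step in self.steps
        ]

store = CandidateStore()
store.record_step([Candidate(413, " turn", 0.62), Candidate(287, " in", 0.21)])
print(store.to_payload())
```

Regenerating from a discarded token would then amount to truncating the context at that step and forcing the chosen alternative as the next token.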