
Unify Logits Processors, Ensure Tokenizers Have Identical Interfaces #676

Closed
wants to merge 2 commits into from

Conversation

lapp0
Contributor

@lapp0 lapp0 commented Feb 16, 2024

Implements Step 1 of #678

Changes:

  • Ensure LlamaCpp and vLLM use the same logits processors
  • Ensure LlamaCpp and vLLM both use a tokenizer subclassing outlines.models.tokenizer.Tokenizer
    • don't monkeypatch vLLM tokenizer in the logits processor itself
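The tokenizer change above could be sketched as an adapter that wraps an engine's tokenizer behind one shared interface instead of monkeypatching it in the logits processor. This is a hypothetical illustration; `outlines.models.tokenizer.Tokenizer`'s real interface may differ, and `TokenizerAdapter` is an invented name:

```python
class TokenizerAdapter:
    """Hypothetical sketch: wrap an engine tokenizer (e.g. the
    HuggingFace tokenizer vLLM exposes) behind one shared interface,
    rather than monkeypatching it inside the logits processor."""

    def __init__(self, hf_tokenizer):
        self._tok = hf_tokenizer
        # Expose the attributes the shared logits processors rely on.
        self.eos_token_id = hf_tokenizer.eos_token_id
        self.vocabulary = hf_tokenizer.get_vocab()

    def encode(self, text):
        return self._tok.encode(text)

    def decode(self, token_ids):
        return self._tok.decode(token_ids)
```

A LlamaCpp adapter would implement the same methods, so the logits processors never need to know which engine they are running under.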

Benefits:

  • No more bespoke logits processors: a single canonical logits processor handles each operation whose logic is identical across inference engines.
    • Enables a path forward for easily implementing a repetition penalty logits processor, among others.
  • Provides a path forward for implementing outlines.models.vllm.

TODO

  • Unify vLLM and LlamaCpp Logits Processors
  • outlines.models.llamacpp using subclass of outlines.models.tokenizer.Tokenizer
  • New outlines.models.vllm with only the Tokenizer
  • Test cases for all interfaces of outlines.models.tokenizer.Tokenizer using all inference engines
  • smoke test vLLM
  • document new vLLM logits processor instantiation logic

Provides a single implementation of logits processors usable by any model, and enables a path forward for a repetition penalty logits processor, among others.
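The engine-agnostic shape described here could be sketched as follows. This is a minimal illustration, not the actual outlines API: the class name, the plain-list logits, and the `allowed_token_ids` constraint are all assumptions standing in for the real FSM-driven masking:

```python
class UnifiedLogitsProcessor:
    """Hypothetical sketch of one logits processor shared by all
    engines: it only sees token ids and logits, so the same instance
    works with vLLM and LlamaCpp alike."""

    def __init__(self, tokenizer, allowed_token_ids):
        # `tokenizer` would be the shared Tokenizer subclass; here it is
        # unused, kept only to mirror the intended constructor shape.
        self.tokenizer = tokenizer
        self.allowed = set(allowed_token_ids)

    def __call__(self, input_ids, logits):
        # Mask disallowed tokens to -inf; engines then sample as usual.
        return [
            logit if i in self.allowed else float("-inf")
            for i, logit in enumerate(logits)
        ]
```

Because the processor depends only on this narrow interface, adding a new engine (or a repetition penalty variant) means writing an adapter, not another bespoke processor.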

@rlouf
Member

rlouf commented Feb 17, 2024

Please open an issue before opening a PR when the change is about the design of the library. Here I don't think it is worth generalising the processors at this point. Let's wait until we have another integration (TensorRT?).

@lapp0 lapp0 mentioned this pull request Feb 17, 2024
@lapp0
Contributor Author

lapp0 commented Feb 17, 2024

Please open an issue before opening a PR when the change is about the design of the library. Here I don't think it is worth generalising the processors at this point. Let's wait until we have another integration (TensorRT?).

Created and linked an issue.

I disagree: unifying their implementation will make it easier to integrate more inference engines while ensuring they function properly with the rest of the system and don't accumulate additional technical debt.

@lapp0 lapp0 changed the title Unified outlines.processors Unify Logits Processors, Ensure Tokenizers Have Identical Interfaces Feb 17, 2024
@rlouf
Member

rlouf commented Feb 20, 2024

Beam search is broken with ExLlamaV2, will address in Step 3.

Unless it has implications for the redesign please open a separate issue and remove this comment. Issues are actionable items, comments imply that this needs to be addressed in the current PR.

As the number of contributions increases we have to be more disciplined when it comes to issues, discussions and PRs. I will add contribution guidelines.

@lapp0
Contributor Author

lapp0 commented Feb 20, 2024

Sounds good, I'll open a separate issue.

@rlouf
Member

rlouf commented Feb 21, 2024

Closing for now, until the discussion in #678 has converged.
