
feat: speculative sampling #1013

Closed
mudler opened this issue Sep 5, 2023 · 1 comment · Fixed by #1052
Labels
enhancement New feature or request roadmap

Comments

mudler (Owner) commented Sep 5, 2023

Is your feature request related to a problem? Please describe.
llama.cpp supports speculative sampling: a small draft model proposes tokens that the big model then verifies, which makes it possible to run big models on constrained hardware.

Describe the solution you'd like
This is just a tracker
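For context, the decode loop behind speculative sampling can be sketched as follows. This is a minimal greedy variant with toy model functions; `target_next` and `draft_next` are hypothetical stand-ins for illustration, not the llama.cpp or LocalAI API:

```python
# Greedy speculative sampling sketch (illustrative only, not llama.cpp code).
# A cheap draft model proposes n_draft tokens; the expensive target model
# verifies them. Tokens are kept up to the first disagreement, where the
# target's own token is substituted.

def speculative_step(target_next, draft_next, prefix, n_draft):
    """Run one round of speculation; return the tokens to append to prefix."""
    # 1. Draft model proposes n_draft tokens autoregressively (cheap calls).
    draft = []
    ctx = list(prefix)
    for _ in range(n_draft):
        tok = draft_next(ctx)
        draft.append(tok)
        ctx.append(tok)

    # 2. Target model checks the proposed run. In a real backend this is a
    #    single batched forward pass; here it is simulated token by token.
    accepted = []
    ctx = list(prefix)
    for tok in draft:
        t = target_next(ctx)
        if t == tok:
            accepted.append(tok)  # draft agreed with target: keep the token
            ctx.append(tok)
        else:
            accepted.append(t)    # first mismatch: take the target's token
            return accepted
    return accepted
```

When the draft model often agrees with the target, several tokens are accepted per expensive target pass, which is where the speedup comes from.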

mudler (Owner, Author) commented Sep 9, 2023

What is left: exposing `draft_model` and speculative sampling in the LocalAI model config.

@mudler mudler added the roadmap label Sep 14, 2023
mudler added a commit that referenced this issue Sep 14, 2023
…el config (#1052)

**Description**

This PR fixes #1013.

It adds `draft_model` and `n_draft` to the model YAML config so that models
can be loaded with speculative sampling. This should also be compatible
with grammars.

example:

```yaml
backend: llama
context_size: 1024
name: my-model-name
parameters:
  model: foo-bar
n_draft: 16
draft_model: model-name
```

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>