Large Language Model (LLM) Support #1731

Merged
merged 18 commits into from
Sep 5, 2023

rylev
Collaborator

@rylev rylev commented Sep 4, 2023

This adds unstable support for a new Large Language Model ("llm") interface to Spin that allows users to perform inferencing requests and generate embeddings.

To use it, users first opt in by listing, in their spin.toml manifest, the models a component is allowed to use:

ai_models = ["llama2-chat","all-minilm-l6-v2"]

Note: We will support "llama2-chat" and "codellama-instruct" for inferencing and "all-minilm-l6-v2" for generating embeddings.

They must then supply the model files themselves in a well-known location: .spin/ai_models/$NAME_OF_MODEL.

Note: an embeddings model is expected to be a directory containing a tokenizer.json file and a model.safetensors file, while an inferencing model is just the model file itself.
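The layout described above can be sketched as a small pre-flight check that lists which model files are expected and which are missing. This is only an illustration using the Rust standard library, not code from this PR: `ModelKind`, `expected_model_files`, and `missing_model_files` are hypothetical names, and the sketch assumes `.spin/ai_models` is resolved relative to the application directory.

```rust
use std::path::{Path, PathBuf};

/// The two kinds of model described in the PR text (hypothetical type).
enum ModelKind {
    /// A single model file at `.spin/ai_models/<name>`.
    Inferencing,
    /// A directory at `.spin/ai_models/<name>` holding two files.
    Embeddings,
}

/// Returns the files expected on disk for the model `name`.
/// Assumption: `.spin/ai_models` lives directly under `app_dir`.
fn expected_model_files(app_dir: &Path, name: &str, kind: ModelKind) -> Vec<PathBuf> {
    let base = app_dir.join(".spin").join("ai_models").join(name);
    match kind {
        // Inferencing models are just the model file itself.
        ModelKind::Inferencing => vec![base],
        // Embeddings models are directories with a tokenizer and weights.
        ModelKind::Embeddings => vec![
            base.join("tokenizer.json"),
            base.join("model.safetensors"),
        ],
    }
}

/// Returns the expected files that do not currently exist as regular files.
fn missing_model_files(app_dir: &Path, name: &str, kind: ModelKind) -> Vec<PathBuf> {
    expected_model_files(app_dir, name, kind)
        .into_iter()
        .filter(|path| !path.is_file())
        .collect()
}
```

Calling `missing_model_files(Path::new("."), "all-minilm-l6-v2", ModelKind::Embeddings)` from the application directory would return an empty list once both files are in place.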

In their apps, they can use the llm interface through the Spin SDK:

let inference = llm::infer(llm::InferencingModel::Llama2Chat, "Tell me a story about Slats the cat.".into())?;

You can read more about this feature in the SIP: [here].

rylev added 13 commits September 4, 2023 17:08
Signed-off-by: Ryan Levick <ryan.levick@fermyon.com>
Member

@radu-matei radu-matei left a comment

This needs a rebase and to get CI to be happy, then LGTM!

Signed-off-by: Radu Matei <radu.matei@fermyon.com>