feat: multiple models #14

vansangpfiev · 2024-05-04T03:45:12Z

Issue link: menloresearch/cortex.cpp#467

Use std::unordered_map to store all llama_server_context (lsc) instances.
refactor: move the background thread into lsc, so each lsc has its own background thread.
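The two changes above can be pictured with a minimal sketch. It is not the code from this PR: `ServerContext` is a hypothetical stand-in for llama_server_context, `ModelRegistry` and its method names are invented for illustration, and the background loop only counts ticks where the real one would process inference tasks. The sketch also assumes a single caller thread, so the map itself is not guarded by a mutex.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <string>
#include <thread>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for llama_server_context (lsc).
// Each instance owns its own background thread.
struct ServerContext {
    std::string model_path;
    std::atomic<bool> running{true};
    std::atomic<int> ticks{0};
    std::thread worker;  // declared last so the flags above exist before it starts

    explicit ServerContext(std::string path) : model_path(std::move(path)) {
        // Placeholder loop; a real context would pull and run tasks here.
        worker = std::thread([this] {
            while (running.load()) {
                ticks.fetch_add(1);
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            }
        });
    }

    ~ServerContext() {
        // Stop and join the thread before the context goes away.
        running.store(false);
        if (worker.joinable()) worker.join();
    }

    ServerContext(const ServerContext&) = delete;
    ServerContext& operator=(const ServerContext&) = delete;
};

// Hypothetical registry: one context per unique model ID.
// Not thread-safe; callers are assumed to serialize access.
class ModelRegistry {
public:
    // Returns false if the ID is already in use (IDs must be unique).
    bool Load(const std::string& model_id, const std::string& path) {
        if (contexts_.count(model_id) != 0) return false;
        contexts_.emplace(model_id, std::make_unique<ServerContext>(path));
        return true;
    }

    // Erasing the entry destroys the context, which joins its thread.
    bool Unload(const std::string& model_id) {
        return contexts_.erase(model_id) > 0;
    }

    // Returns nullptr when no model is loaded under this ID.
    ServerContext* Find(const std::string& model_id) {
        auto it = contexts_.find(model_id);
        return it == contexts_.end() ? nullptr : it->second.get();
    }

    std::size_t Size() const { return contexts_.size(); }

private:
    std::unordered_map<std::string, std::unique_ptr<ServerContext>> contexts_;
};
```

Holding each context behind a `std::unique_ptr` keeps its address stable when the map rehashes, which matters here because the worker lambda captures `this`.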

We use model_id as the key to find a model, so each ID has to be unique. That requires some changes to the request parameters:
For completions:
Set model_alias in the inferences/llamaCPP/loadmodel and inferences/llamaCPP/unloadmodel requests; it has to match the model parameter in inferences/llamaCPP/chat_completion.
loadmodel/unloadmodel:

{
...
  "llama_model_path": "file_to_location",
  "model_alias": "model1"
...
}

chat_completion:

{
...
  "model": "model1"  
...
}

For embeddings:
The loadmodel/unloadmodel requests work the same way; the embedding request additionally needs a model parameter.
loadmodel/unloadmodel:

{
...
    "llama_model_path": "e:/workspace/model/nomic-embed-text-v1.5.f16.gguf",
    "model_alias": "model1",
    "model_type": "embedding"
...
}

embedding:

{
...  
  "input": "how are you",
  "model": "model1"
...
}

tikikun

LGTM!!!

vansangpfiev force-pushed the feat/load-multiple-models branch from fc39ddf to 2f281bb Compare May 7, 2024 02:56

vansangpfiev marked this pull request as ready for review May 7, 2024 08:43

vansangpfiev self-assigned this May 7, 2024

vansangpfiev requested a review from tikikun May 7, 2024 08:56

vansangpfiev mentioned this pull request May 8, 2024

feat: load multiple models menloresearch/cortex.cpp#495

Closed

feat: multiple models

bf49809

vansangpfiev force-pushed the feat/load-multiple-models branch from 0343c90 to bf49809 Compare May 11, 2024 09:22

tikikun approved these changes May 13, 2024

vansangpfiev merged commit 4ad76ba into main May 13, 2024

vansangpfiev deleted the feat/load-multiple-models branch December 31, 2024 12:35
