Issue link: https://github.com/janhq/cortex/issues/467
We use model_id as the key to look up a model, so it must be unique. This requires some changes to the request parameters:
For completions:
You have to set a model_alias value for inferences/llamaCPP/loadmodel and inferences/llamaCPP/unloadmodel; it must match the model parameter in inferences/llamaCPP/chat_completion.
loadmodel/unloadmodel:
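For illustration, the load request body might look like the sketch below. Only model_alias comes from the description above; llama_model_path and ctx_len are assumed parameter names used here as placeholders:

```json
{
  "llama_model_path": "/path/to/model.gguf",
  "model_alias": "my-model",
  "ctx_len": 2048
}
```

The same model_alias would then be sent to unloadmodel to release the model.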
chat_completion:
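A possible chat_completion request body, assuming an OpenAI-style messages array; the key point is that model matches the model_alias used at load time:

```json
{
  "model": "my-model",
  "messages": [
    { "role": "user", "content": "Hello" }
  ]
}
```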
For embeddings:
The same applies to loadmodel/unloadmodel; for the embedding request, we need to add a model parameter.
loadmodel/unloadmodel:
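Loading an embedding model would follow the same shape as above; embedding here is an assumed flag shown only to suggest how an embedding-capable load might be marked, and llama_model_path is likewise a placeholder name:

```json
{
  "llama_model_path": "/path/to/embedding-model.gguf",
  "model_alias": "my-embedding-model",
  "embedding": true
}
```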
embedding:
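The embedding request would then carry the new model parameter, matching the alias set at load time; the input field is an assumption based on the common OpenAI-style embeddings shape:

```json
{
  "model": "my-embedding-model",
  "input": "Hello world"
}
```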