
Create json api service #88

Closed
wizd opened this issue Mar 13, 2023 · 8 comments
Labels
need more info The OP should provide more details about the issue

Comments


wizd commented Mar 13, 2023

So we can integrate apps/UIs.

@ggerganov ggerganov added the need more info The OP should provide more details about the issue label Mar 13, 2023
wizd (Author) commented Mar 13, 2023

Emulate the OpenAI text completion API, so that tons of existing apps could support llama without changes.
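To make the idea concrete, here is a minimal sketch of what such an emulation layer could look like, using only the Python standard library. The `generate()` stub and the exact response fields are assumptions modeled on the OpenAI completions response shape; a real server would call into llama.cpp instead of echoing the prompt:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def generate(prompt: str, max_tokens: int) -> str:
    # Placeholder: a real server would run llama.cpp inference here.
    return f"(echo) {prompt}"[: max_tokens * 4]


class CompletionsHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/v1/completions":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        req = json.loads(self.rfile.read(length))
        text = generate(req.get("prompt", ""), req.get("max_tokens", 16))
        # Response shape loosely mirrors OpenAI's text_completion objects.
        body = json.dumps({
            "object": "text_completion",
            "model": req.get("model", "llama"),
            "choices": [{"text": text, "index": 0, "finish_reason": "length"}],
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

Serving it is one line: `HTTPServer(("127.0.0.1", 8080), CompletionsHandler).serve_forever()`. Any OpenAI-style client pointed at that base URL would then get back a plausible completion object.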

henk717 commented Mar 14, 2023

+1 on this; people would love to have this in KoboldAI, but we have no good way of implementing it at the moment. We already have OpenAI support, so that would work. We also have a different basic JSON API that just sends the desired values over JSON and handles the output string.

Whatever way works, but doing JSON over HTTP is going to be ideal for cross-language implementations such as Python or (in-browser) JavaScript.

MLTQ commented Mar 14, 2023

Sounds like the ideal structure would be to load the model into memory in interactive mode, listen for input on some port, wait for the initial prompt & reverse prompt, then post the JSON response to that same port.
Because it outputs word by word, maybe a WebSocket implementation?

This seems like a viable option too: #23 (comment)
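A WebSocket is one way to get word-by-word output to a client; chunked HTTP is another that keeps the cross-language story simple, since every HTTP client already decodes it. A stdlib-only sketch of the idea, where `fake_tokens()` is a stand-in for llama.cpp's token stream (not its real API):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def fake_tokens(prompt: str):
    # Stand-in for llama.cpp emitting tokens one at a time.
    for word in ("Hello", " from", " a", " streamed", " reply."):
        yield word


class StreamHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # chunked transfer requires HTTP/1.1

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Transfer-Encoding", "chunked")
        self.send_header("Connection", "close")
        self.end_headers()
        for tok in fake_tokens(self.path):
            chunk = tok.encode()
            # Chunk framing: hex length, CRLF, payload, CRLF.
            self.wfile.write(b"%x\r\n%s\r\n" % (len(chunk), chunk))
            self.wfile.flush()  # push each token out immediately
        self.wfile.write(b"0\r\n\r\n")  # terminating chunk
```

A browser `fetch()` reading the response body incrementally, or a Python client iterating over the socket, would see each token as it is flushed rather than waiting for the full generation.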

i-am-neo commented Mar 17, 2023

WebSocket is an option, but would you be willing to pay whoever will host the backend?

LostRuins (Collaborator) commented Mar 18, 2023

Hi @henk717, I've gone ahead and created https://github.com/LostRuins/llamacpp-for-kobold, which emulates a KoboldAI HTTP server, allowing it to be used as a custom API endpoint from within Kobold.

I wrote my own Python ctypes bindings, and it requires zero other dependencies (no Flask, no pybind11) except for llamalib.dll and Python itself. Windows binaries are included, but you can also rebuild the library from the makefile.
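For anyone curious what zero-dependency ctypes bindings look like in general, this is the pattern. It is illustrated here against the C standard library's `strlen`, since llamalib's actual exported symbols aren't shown in this thread; for llama you would load llamalib.dll / a libllama shared object instead and declare its functions the same way:

```python
import ctypes
import ctypes.util

# Load a shared library. For llama bindings this would be the path to
# llamalib.dll or libllama.so rather than libc.
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

# Declare the C signature up front so ctypes converts arguments and
# return values correctly instead of defaulting everything to int.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(s: str) -> int:
    """Thin Python wrapper over the raw C call."""
    return libc.strlen(s.encode())
```

The appeal of this approach is exactly what the comment above describes: no Flask, no pybind11, no build step on the Python side, just a compiled library and `ctypes`.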

I also went ahead and added left square brackets to the banned tokens.

Unfortunately, it's not ideal due to a fundamental flaw in llama.cpp where generation delay scales linearly with prompt length, unlike Huggingface Transformers. See this discussion for details.

avilum commented Mar 19, 2023

Hey guys, if anyone is seeking a working client/server implementation:
I wrote a minimal realtime Go server and Python client with live inference streaming, based on this awesome repo.
See https://github.com/avilum/llama-saas

thomasantony (Collaborator) commented

I have a proof of concept working with an existing web UI here:

oobabooga/text-generation-webui#447

It is very unpolished, but getting somewhere.

dranger003 (Contributor) commented

Hi there, I recently worked on C# bindings and a basic .NET Core project. There are two sample projects included (CLI/Web + API). It could easily be expanded with a more extensive JSON interface. Hope this is helpful.

https://github.com/dranger003/llama.cpp-dotnet

dmahurin pushed a commit to dmahurin/llama.cpp that referenced this issue May 31, 2023
dmahurin pushed a commit to dmahurin/llama.cpp that referenced this issue Jun 1, 2023
Deadsg pushed a commit to Deadsg/llama.cpp that referenced this issue Dec 19, 2023
9 participants