[Feature request] Add a simple HTTP API server like in llama.cpp, with an OpenAI-like API #1

Closed
pythops opened this issue Feb 21, 2024 · 11 comments
Labels: Feature (New feature or request) · type:support (Support issues)

@pythops

pythops commented Feb 21, 2024

For more info, see:
https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md
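
For concreteness, here is a minimal sketch of what such an endpoint might look like, mirroring llama.cpp's POST /completion route. cpp-httplib and nlohmann/json are assumed third-party dependencies, and RunGemma() is a hypothetical placeholder rather than gemma.cpp's actual generation API; an OpenAI-compatible /v1/chat/completions route would follow the same pattern with the chat-message schema.

```cpp
// Minimal sketch of a llama.cpp-style POST /completion endpoint around gemma.cpp.
// Assumptions: cpp-httplib (httplib.h) and nlohmann/json as third-party headers;
// RunGemma() is a hypothetical stand-in for gemma.cpp's real generation call.
#include <string>

#include "httplib.h"           // yhirose/cpp-httplib, single header
#include <nlohmann/json.hpp>   // nlohmann/json, single header

// Hypothetical wrapper: run the loaded Gemma model on a prompt and return the
// full generated text. The real model loading/tokenization lives in gemma.cpp.
std::string RunGemma(const std::string& prompt) {
  return "(generated text for: " + prompt + ")";  // mock output for the sketch
}

int main() {
  httplib::Server svr;

  // Request: {"prompt": "..."}  ->  Response: {"content": "..."}
  svr.Post("/completion", [](const httplib::Request& req, httplib::Response& res) {
    const auto body = nlohmann::json::parse(req.body, /*cb=*/nullptr,
                                            /*allow_exceptions=*/false);
    if (body.is_discarded() || !body.contains("prompt")) {
      res.status = 400;
      res.set_content(R"({"error":"missing prompt"})", "application/json");
      return;
    }
    const nlohmann::json reply = {
        {"content", RunGemma(body["prompt"].get<std::string>())}};
    res.set_content(reply.dump(), "application/json");
  });

  svr.listen("0.0.0.0", 8080);
  return 0;
}
```

A client would then POST {"prompt": "..."} to /completion and read back {"content": "..."}, matching the basic shape of the llama.cpp server linked above.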

@austinvhuang
Collaborator

Great suggestion! If others are interested, please add a +1 emoji reaction above and we'll prioritize this :)

@pythops
Author

pythops commented Feb 21, 2024

Just an update: llama.cpp added support for Gemma models:
ggerganov/llama.cpp#5631

@loretoparisi

loretoparisi commented Feb 21, 2024

Just an update: llama.cpp added support for Gemma models:

ggerganov/llama.cpp#5631

Also, with 💎Gemma in 🦙llama.cpp you get CUDA, NEON, and AMD GPU support!
And, in theory, it can run in the browser if you can compile to WASM.

@austinvhuang added the Feature (New feature or request) label on Feb 24, 2024
@omkar806

Adding API support like this would be great; these models can be used on CPU for smaller tasks.
+1 for this.

@zeerd
Contributor

zeerd commented Apr 24, 2024

I have a question: why use HTTP rather than WebSocket?

As far as I know, the answer tokens are generated one by one, and it seems HTTP has no way to send multiple responses for a single call. That means an HTTP server would need to gather the whole answer before sending it back.
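
One way plain HTTP can deliver multiple partial responses for a single call is chunked transfer encoding with server-sent events, which is also how the OpenAI API streams tokens. Below is a minimal sketch assuming cpp-httplib's chunked content provider; NextToken() is a hypothetical stand-in for pulling tokens out of gemma.cpp one at a time, replaying a canned sequence here so the sketch runs on its own.

```cpp
// Sketch: streaming tokens over plain HTTP with chunked transfer encoding,
// formatted as server-sent events. cpp-httplib is an assumed dependency;
// NextToken() is hypothetical and just replays a canned token sequence.
#include <string>
#include <vector>

#include "httplib.h"  // yhirose/cpp-httplib

// Hypothetical token source: returns false once generation is finished.
bool NextToken(std::string* token) {
  static const std::vector<std::string> demo = {"Hello", ", ", "world", "!"};
  static size_t i = 0;
  if (i >= demo.size()) return false;
  *token = demo[i++];
  return true;
}

int main() {
  httplib::Server svr;

  svr.Post("/completion", [](const httplib::Request&, httplib::Response& res) {
    // Each token is flushed to the client as its own SSE "data:" event, so the
    // caller sees partial output without waiting for the whole answer.
    res.set_chunked_content_provider(
        "text/event-stream",
        [](size_t /*offset*/, httplib::DataSink& sink) {
          std::string token;
          if (NextToken(&token)) {
            const std::string event = "data: " + token + "\n\n";
            sink.write(event.data(), event.size());
            return true;  // provider is called again for the next chunk
          }
          sink.done();    // close the stream
          return true;
        });
  });

  svr.listen("0.0.0.0", 8080);
  return 0;
}
```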

@ufownl
Contributor

ufownl commented Apr 26, 2024

I have a question: why use HTTP rather than WebSocket?

As far as I know, the answer tokens are generated one by one, and it seems HTTP has no way to send multiple responses for a single call. That means an HTTP server would need to gather the whole answer before sending it back.

WebSocket is more suitable for an instant-messenger-style UI but may not be ideal for other UI types. I also think it is better to integrate gemma.cpp as a module into a web backend framework than to implement the HTTP/WebSocket API directly.

Here is my WebSocket online demo solution; you can try it here or via this Kaggle notebook. In this solution, gemma.cpp is a module of OpenResty, which makes it easy to implement a WebSocket or HTTP API.

@Gopi-Uppari
Collaborator

Could you please confirm whether this issue is resolved for you by the above comment? If it is, please feel free to close the issue.

Thank you.

@leszko7

leszko7 commented Oct 16, 2024

ok app

@Zeenat30

Zeenat30 commented Oct 17, 2024 via email

@Zeenat30

Zeenat30 commented Oct 17, 2024 via email

@Gopi-Uppari
Collaborator

Closing this issue; please feel free to reopen if this is still a valid request. Thank you!
