Expose generation timings from server & update completions.js (ggml-org#2116)
* use JavaScript generators as a much cleaner API (see the usage sketch below)
Also add ways to access the completion as a promise and as an EventSource
* export llama_timings as a struct and expose them in the server
* update readme, update baked includes
* llama : uniform variable names + struct init
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
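As a rough usage sketch of the generator-based client and the timings described in the commit message above (the export name `llama`, the shape of the streamed chunks, and the `timings` field are assumptions for illustration, not definitions taken from this commit):

```js
// Hypothetical usage sketch only: the export name `llama`, the chunk shape,
// and the `timings` field are assumptions, not taken from this commit.
import { llama } from '/completion.js'

async function generate(prompt) {
  let output = ''
  let lastChunk = null

  // Consume the completion as an async generator: each iteration yields one
  // streamed piece of the response.
  for await (const chunk of llama(prompt)) {
    output += chunk.data?.content ?? ''
    lastChunk = chunk
  }

  // If the server attaches generation timings to the final chunk, as the
  // commit message suggests, they could be inspected here (field name assumed).
  if (lastChunk?.data?.timings) {
    console.log('generation timings:', lastChunk.data.timings)
  }
  return output
}

generate('Write one dad joke, one paragraph long.').then(console.log)
```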
Then you can use llama.cpp in place of OpenAI's **chat.completion** or **text_completion** API.
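As a minimal sketch of such a call, assuming the compatibility layer listens locally and exposes an OpenAI-style `/v1/chat/completions` endpoint (the address and path below are assumptions, not stated here):

```js
// Hypothetical sketch: the host, port, and endpoint path are assumptions for
// illustration; they are not taken from this commit.
async function chatCompletion() {
  const response = await fetch('http://127.0.0.1:8081/v1/chat/completions', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: 'Tell me a dad joke.' },
      ],
    }),
  })
  const data = await response.json()
  // The reply is expected to follow the OpenAI chat.completion response shape.
  return data.choices?.[0]?.message?.content
}

chatCompletion().then(console.log)
```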
Changes to the server README:

-### Extending the Web Front End
+### Extending or building alternative Web Front End

-The default location for the static files is `examples/server/public`. You can extend the front end by running the server binary with `--path` set to `./your-directory` and importing `/completion.js` to get access to the llamaComplete() method. A simple example is below:
+The default location for the static files is `examples/server/public`. You can extend the front end by running the server binary with `--path` set to `./your-directory` and importing `/completion.js` to get access to the llamaComplete() method.
+
+Read the documentation in `/completion.js` to see convenient ways to access llama.
+
+A simple example is below:

-```
+```html
 <html>
 <body>
 <pre>
 <script type="module">
-import { llamaComplete } from '/completion.js'
-
-llamaComplete({
-    prompt: "### Instruction:\nWrite dad jokes, each one paragraph. You can use html formatting if needed.\n\n### Response:",