Super Simple Whisper Server #1380
Thanks for sharing @felrock
Hey @ggerganov, I would like to add support for the OpenAI API request format, as you mentioned in the issue ticket. I've also added a few more of whisper's params as request parameters, for example duration and offset. I have some time during the weekend to implement this. Please add any comments if you think I should do something differently. Thanks!
@felrock Awesome - let's do this!
Maybe in the future.
json is fine. I guess adding at least a plain text format would also be nice, since in some applications parsing JSON is not always an option. And later we can extend it to support more formats if necessary.
Now the POST request to /inference should support the format that was specified by OpenAI, except for parameters such as model (since we are using a load method instead), and no auth tokens are needed. Output can be in JSON or in text format. It's now also possible to load a new model using the /load POST function, passing a file path to a model on the machine that runs the server. Edit: I forgot to mention I also added some simple error messages on the API responses. Request examples:
/load
Let me know what you think @ggerganov.
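For reference, this is roughly what the multipart/form-data body that curl builds for the /inference request looks like on the wire. A minimal sketch in C++, assuming the field names seen in the curl examples in this thread (`temperature`, `response-format`, `file`); the boundary string and filename are arbitrary placeholders, not anything the server mandates:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Build a multipart/form-data body carrying two text fields and one file
// field, matching the fields used by the curl examples in this thread.
std::string make_multipart(const std::string& boundary,
                           const std::string& temperature,
                           const std::string& response_format,
                           const std::string& wav_bytes) {
    std::ostringstream body;
    body << "--" << boundary << "\r\n"
         << "Content-Disposition: form-data; name=\"temperature\"\r\n\r\n"
         << temperature << "\r\n";
    body << "--" << boundary << "\r\n"
         << "Content-Disposition: form-data; name=\"response-format\"\r\n\r\n"
         << response_format << "\r\n";
    body << "--" << boundary << "\r\n"
         << "Content-Disposition: form-data; name=\"file\"; "
            "filename=\"jfk.wav\"\r\n"
         << "Content-Type: audio/wav\r\n\r\n"
         << wav_bytes << "\r\n";
    body << "--" << boundary << "--\r\n";  // closing boundary
    return body.str();
}
```

With `-F file=@./samples/jfk.wav`, curl fills the file part with the actual WAV bytes; without the `@`, the literal path string would be sent as the field value instead.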
I'm trying:
curl 127.0.0.1:8080/inference -H "Content-Type: multipart/form-data" -F file="./samples/jfk.wav" -F temperature="0.2" -F response-format="json"
And I get:
whisper server listening at http://127.0.0.1:8080
Received request:
error: failed to open '' as WAV file
error: failed to read WAV file ''
Any ideas? Is this expected? I'm on macOS. Edit: Ok got it. I have to use -F file=@./samples/jfk.wav
Nice!
Apologies for the force-push just now. I think we should be ready to merge.
No worries, sorry for not updating the Makefile!
Hi, I have tried this with the release artifacts and I keep getting the following error (including the working main.exe test as well). Running on Windows 11:
Not sure what I'm doing wrong. Thanks!! Also tried with curl 127.0.0.1:8080/inference -H "Content-Type: multipart/form-data" -F file=@./jfk.wav -F temperature="0.2" -F response-format="json"
I think it's because the temp file that is written is not being closed. I'll add a follow-up commit later today closing the ofstream, and I think it will solve the issue.
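The bug class described here can be shown in a few lines. A minimal sketch (not the PR's actual code, and the file path is a placeholder): data written through an `std::ofstream` may sit in the stream's buffer until the stream is flushed or closed, so a reader opening the same path immediately afterwards can see a short or empty file. Closing the stream, or letting it go out of scope, before handing the path on fixes it:

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Write data to a path, making sure the ofstream is closed (by leaving
// its scope) before anything reads the file back. If the inner scope were
// removed and the ofstream stayed open, the read below could observe a
// partially flushed file.
std::string write_then_read(const std::string& path, const std::string& data) {
    {
        std::ofstream out(path, std::ios::binary);
        out << data;
    }  // ofstream destructor runs here: buffer flushed, file closed
    std::ifstream in(path, std::ios::binary);
    return std::string((std::istreambuf_iterator<char>(in)),
                       std::istreambuf_iterator<char>());
}
```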
Thank you, also please don't rule out my incompetence. So if there's something I might be doing wrong as well, feel free to let me know. I don't run curl commands typically, but when my normal Python request code failed against the server I wanted to replicate the command above as closely as possible.
No worries! Try this out and please let me know if it works after; your curl syntax looks correct.
curl worked now correctly!!! Thank you!!!
ehhhh sorry. One new "problem." This version deletes the jfk.wav when it's done with it. Noticed when I ran the command again "just because" that the directory no longer has jfk.wav in it.
Great! Haha, well that isn't good. I took for granted that the file wouldn't exist in the current working directory of the server, but if it does, it will be removed. Maybe if we just give it a random temp name instead we wouldn't have this problem.
ohhhhh ok. Well see, sometimes you need an idiot testing to do something that no one else would do. I stuck the file in there because I was having issues with paths and such, and wanted to eliminate as many variables as possible of what might be going wrong. That's probably not something most users would do. :) Thanks again, this is such a game changer having this feature (hence why I'm testing it so quickly lol)! EDIT: Can 100% confirm that if you're a normal person and have the file in samples/file.wav all works well and doesn't delete the file, so I would not consider this a high priority at all.
Haha, it's great that you find bugs I would have never found! Added the changes here #1535
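The randomized-temp-name idea discussed above could look something like this. A sketch under stated assumptions (the `whisper_upload_` prefix and the use of the system temp directory are my choices for illustration, not what #1535 necessarily does): saving the upload under a unique generated name means an existing `jfk.wav` in the server's working directory is never overwritten or deleted.

```cpp
#include <filesystem>
#include <random>
#include <string>

// Produce a path in the system temp directory with a random component,
// so a saved upload can never collide with (and later delete) a file
// that already exists in the server's working directory.
std::string unique_temp_path(const std::string& suffix) {
    static std::mt19937_64 rng{std::random_device{}()};
    std::filesystem::path p = std::filesystem::temp_directory_path() /
        ("whisper_upload_" + std::to_string(rng()) + suffix);
    return p.string();
}
```

The server would write the incoming WAV to `unique_temp_path(".wav")`, transcribe it, then remove that path, leaving the client's original file untouched.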
You're quick, thank you! By the way, blown away by the performance of server + cuBLAS + distil-whisper large-v2 ggml right now. It's near instant.
Is there any support for streaming realtime speech to text?
Not by default, no, and it also depends on what you mean by realtime. If you add some logic on the client side you can make it realtime.
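The client-side logic mentioned here is usually just windowing: cut the incoming audio into fixed-length chunks and POST each one to /inference as it fills, so text appears every few seconds. A minimal sketch of the chunking step; the 16 kHz sample rate matches whisper's expected input, while the window length is an arbitrary choice:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Split a PCM buffer into fixed-length windows. A streaming client would
// fill one window from the microphone, send it to /inference, and move on
// to the next; the last window may be shorter than the rest.
std::vector<std::vector<float>> chunk_audio(const std::vector<float>& pcm,
                                            std::size_t sample_rate,
                                            std::size_t window_seconds) {
    std::vector<std::vector<float>> chunks;
    const std::size_t win = sample_rate * window_seconds;
    for (std::size_t i = 0; i < pcm.size(); i += win) {
        const std::size_t end = std::min(pcm.size(), i + win);
        chunks.emplace_back(pcm.begin() + i, pcm.begin() + end);
    }
    return chunks;
}
```

A real client would also want some overlap between windows (or cut at silences) so words straddling a boundary aren't lost; this sketch leaves that out.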
This server is very useful! Happy to see it added. Would it be possible to add an endpoint that returns the name of the currently loaded model? I'd like my client to be able to check the model without instructing the server which one to load.
…ov#1380)
* Add first draft of server
* Added json support and base funcs for server.cpp
* Add more user input via api-request, also some clean up
* Add request params and load post function, also some general clean up
* Remove unused function
* Add readme
* Add exception handlers
* Update examples/server/server.cpp
* make : add server target
* Add magic curl syntax
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Just built and ran the Whisper server, only to find it conflicts with the default port of a running Llama server (using both defaults at once is therefore blocked). Could Whisper's default port possibly be offset by a small amount to save the initial adjustment?
Both are good ideas! PRs welcome
I don't know if returning the model with every transcription is desired as default behavior or not, so I created an issue to discuss instead, which shows how I implemented it for my use case.
@felrock Works like a dream... I am not getting timestamps though. I see in ./server you can disable them, but how can I enable them? Example:
time curl 192.168.1.205:8080/inference -H "Content-Type: multipart/form-data" -F temperature="0.1" -F response-format="json" -F file=@/unique_audio/28d99065444b8f27d8a59372ae87232f.wav
Nice that you like it! Perhaps try out the
:-) CHEERS! |
I made this because I needed it for a project, but if people would also like to have it, I could extend it a bit and clean it up for you guys.