OpenAI-compatible API #20
Comments
With LLaMA 2 released, it even expects the whole "system, user, assistant" format now...
Hi @LuciferianInk, the format is not obligatory, but it does improve the quality of the model. We'll try moving to the official format to achieve that.
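For reference, a rough sketch (an illustration only, not Petals' actual prompt-building code) of how OpenAI-style system/user/assistant messages map onto the official Llama 2 chat format:

```python
# Illustrative sketch only: folding OpenAI-style messages into the official
# Llama 2 chat prompt. Petals' actual prompt handling may differ.
def build_llama2_prompt(messages):
    system = ""
    prompt = ""
    for msg in messages:
        if msg["role"] == "system":
            system = f"<<SYS>>\n{msg['content']}\n<</SYS>>\n\n"
        elif msg["role"] == "user":
            prompt += f"[INST] {system}{msg['content']} [/INST]"
            system = ""  # the system prompt only wraps the first user turn
        elif msg["role"] == "assistant":
            prompt += f" {msg['content']} "
    return prompt

print(build_llama2_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]))
```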
I support that. Quite a few tools use the OpenAI library, including with non-OpenAI backends (like Oobabooga's text-generation-webui, which has an OpenAI API extension). This would make the transition from OpenAI or other compatible backends to Petals a breeze.
@apcameron @Eclipse-Station I agree that this feature would be useful. We'll try to find time to implement it - and pull requests are always welcome!
Hey @borzunov @Eclipse-Station, I'm confused - why do you need to mimic the OpenAI I/O format in a local model? I don't think I saw this repo using OpenAI at all, so what's the advantage?
Hi @krrishdholakia, this repo doesn't use the OpenAI API in any sense, but exposing a similar interface would help with interoperability with existing software. E.g., one could take an existing chatbot/text generation UI supporting the OpenAI API, then replace the API URL to make it work via the Petals swarm without any code changes.
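For illustration, a minimal sketch of that idea with the openai Python client (pre-1.0 interface); the gateway URL and the lack of an API key are assumptions, not an existing Petals endpoint:

```python
# Hypothetical sketch: an app built on the openai client could be redirected to
# a Petals-backed, OpenAI-compatible endpoint just by changing the base URL.
import openai

openai.api_base = "https://petals-gateway.example.com/v1"  # hypothetical URL
openai.api_key = "unused"  # a Petals gateway would not need a real key

response = openai.ChatCompletion.create(
    model="petals-team/StableBeluga2",
    messages=[{"role": "user", "content": "Hello from Petals!"}],
)
print(response["choices"][0]["message"]["content"])
```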
Oh! I think we might be able to help - https://github.com/BerriAI/litellm. Just created an issue to track this. Hope to get it done today.
@apcameron @borzunov We added the ability to call petals.dev using litellm with OpenAI/ChatGPT-style input/output - check out this example notebook:
@krrishdholakia @ishaan-jaff Thanks for making the integration! I think @apcameron and @Eclipse-Station want an HTTP API in the OpenAI-compatible format (= one URL) that internally translates API calls to the Petals HTTP/WebSocket API or directly to the Petals swarm. Can litellm help with that? In any case, we appreciate your work on making Petals available for litellm users!
Yes, we do not want to have to change the code of the applications we are using. The OpenAI API we are looking for needs to be transparent to the caller.
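To make that request concrete, here is a rough sketch (an assumption, not the planned implementation) of a translation layer that exposes an OpenAI-style completions endpoint and forwards requests to the Petals client library:

```python
# Minimal sketch of an OpenAI-compatible /v1/completions endpoint backed by the
# Petals client library. Illustrative only: error handling, streaming, and the
# full OpenAI response schema are omitted.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

MODEL_NAME = "petals-team/StableBeluga2"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoDistributedModelForCausalLM.from_pretrained(MODEL_NAME)

app = FastAPI()

class CompletionRequest(BaseModel):
    model: str = MODEL_NAME
    prompt: str
    max_tokens: int = 64

@app.post("/v1/completions")
def completions(req: CompletionRequest):
    inputs = tokenizer(req.prompt, return_tensors="pt")["input_ids"]
    outputs = model.generate(inputs, max_new_tokens=req.max_tokens)
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Return a response roughly shaped like OpenAI's completions schema
    return {
        "object": "text_completion",
        "model": req.model,
        "choices": [{"text": text, "index": 0, "finish_reason": "stop"}],
    }
```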
@apcameron are you just using OpenAI + Petals? If you proxy OpenAI, then you also need to deal with all the other OpenAI requests (e.g., embeddings). But it seems like you just want to map the completion endpoint - correct? In that case, wouldn't you want to basically remap …
Yes, I think the completion endpoint would be a good start, along with the option to select the model.
Any updates on this so far? Would be great to be able to use Petals as a drop-in replacement for anything using OpenAI's API.
Hi @apcameron @Eclipse-Station @jontstaz, can you share a few examples of apps where an OpenAI-compatible API for Petals would be helpful? We hired a part-time dev who may work on this - it would be great to know some apps where we can test it. @krrishdholakia, it seems like most people requesting this feature can't change the app code (e.g., to remap …)
Hey @borzunov @jontstaz @apcameron, we actually put out a solution for this - https://docs.litellm.ai/docs/proxy_server. It's a 1-click local proxy that spins up a local server to map OpenAI completion calls to any litellm-supported API (Petals, Hugging Face TGI, TogetherAI, etc.). Here's the CLI command,
and it'll spin up an OpenAI-compatible proxy server on port 8000. Just set the OpenAI API base to this and it'll start making Petals calls.
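For example (a sketch, assuming the proxy is running locally on the default port 8000, with a hypothetical model name), the only application-side change would be the base URL:

```python
# Sketch: point an existing openai-based app at the local litellm proxy; the
# proxy translates the OpenAI-format calls into Petals calls.
import openai

openai.api_base = "http://localhost:8000"  # the local litellm proxy
openai.api_key = "anything"  # the local proxy does not require a real key

response = openai.ChatCompletion.create(
    model="petals-team/StableBeluga2",  # model name here is an assumption
    messages=[{"role": "user", "content": "Hello via the local proxy!"}],
)
print(response)
```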
@borzunov let me know if that covers the use case - if not, happy to iterate and land on something that works for the community!
Here are some examples where it would be nice to use an OpenAI-compatible API to point to Petals instead.
Thanks, I will try this out in the next few days when I get some time.
Added a tutorial for using the 1-click deploy with aider - https://docs.litellm.ai/docs/proxy_server#tutorial---using-with-aider. You can do this with Petals as well by running this instead of the hf command.
Perfect! This is exactly what I was looking for. Thanks. Also, FYI, there's a new model which apparently performs better than CodeLlama and all other previous code-focused models: https://huggingface.co/TheBloke/Phind-CodeLlama-34B-v2-GPTQ. Would be cool to see it on Petals. I'm planning on getting a couple of 4090s relatively soon and would then be able to contribute to Petals with some code-focused models.
Is it possible to provide an API that mimics the functionality of the OpenAI API?