Update the Dockerfile of the LiteLLM Proxy server and some refactorings #628
Conversation
RUN python proxy_cli.py --config -f /app/secrets_template.toml
RUN python proxy_cli.py
We shouldn't `RUN python proxy_cli.py` here, as it will block the Docker build at this step and the build will never finish.
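Roughly speaking, the CLI ends up in a blocking server loop, which is why a build-time `RUN` of it never returns. A minimal illustrative sketch (not the actual proxy_cli.py; the app, host, and port are assumptions):

```python
# Illustrative sketch of why `RUN python proxy_cli.py` hangs a Docker build:
# the CLI eventually calls uvicorn.run(), which blocks until the server stops.
import uvicorn
from fastapi import FastAPI

app = FastAPI()

if __name__ == "__main__":
    # Fine as the container's runtime command (CMD/ENTRYPOINT),
    # but a RUN step executing this will never complete.
    uvicorn.run(app, host="0.0.0.0", port=8000)
```

Starting the server at container start-up (via `CMD`/`ENTRYPOINT`) avoids the problem.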
@@ -7,7 +7,6 @@
import operator

config_filename = "litellm.secrets.toml"
pkg_config_filename = "template.secrets.toml"
The `pkg_config_filename` variable is not used anywhere, so it can be removed.
except Exception as e:
    pass
Errors that occur while loading the configuration should not be swallowed; they should be raised so the process fails fast.
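A minimal sketch of the fail-fast behaviour I mean; the `load_config` helper and its argument are illustrative, not the actual proxy code:

```python
import tomli  # TOML parser already listed in the proxy requirements

def load_config(path: str) -> dict:
    # No try/except here: a missing file or invalid TOML should crash the
    # process immediately instead of being silently ignored.
    with open(path, "rb") as f:
        return tomli.load(f)
```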
@router.post("/v1/completions") | ||
@router.post("/completions") |
The logic of the two routes is nearly identical; one passes default parameters to `litellm_completion` and the other does not, which is confusing. I've combined them into a single function here (see the sketch below).
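For reference, a minimal sketch of the pattern (not the exact PR code): both paths are served by one handler via stacked route decorators, with an illustrative stub standing in for the real `litellm_completion`:

```python
from fastapi import APIRouter, Request

router = APIRouter()

def litellm_completion(data: dict) -> dict:
    # Placeholder for the shared completion logic in proxy_server.py.
    return data

@router.post("/v1/completions")
@router.post("/completions")
async def completion(request: Request):
    # One handler serves both paths, so defaults are applied in a single place.
    data = await request.json()
    return litellm_completion(data)
```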
@router.post("/v1/chat/completions") | ||
async def v1_chat_completion(request: Request): | ||
@router.post("/chat/completions") |
Same as above: the /v1/chat/completions and /chat/completions routes are combined into one function.
backoff
boto3
uvicorn
fastapi
tomli
appdirs
tomli-w
This requirements file looks like it's for the Docker image, and I want our Docker images to contain as many packages as possible so that users don't need to rebuild the image just to install new dependencies.
# add_function_to_prompt = true # e.g: Ollama doesn't support functions, so add it to the prompt instead
# drop_params = true # drop any params not supported by the provider (e.g. Ollama)
In TOML, `True` is the wrong way to write a boolean; it must be lowercase (`true`/`false`).
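For example, using the `tomli` parser already in the requirements (the key name is just taken from the snippet above):

```python
import tomli

print(tomli.loads("drop_params = true"))  # {'drop_params': True}

try:
    tomli.loads("drop_params = True")  # capitalised booleans are not valid TOML
except tomli.TOMLDecodeError as err:
    print(f"parse error: {err}")
```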
@coconut49 lgtm!
Your changes to requirements.txt are interesting - presumably this is to deal with the pip install we run only for proxy-specific packages. @coconut49 You mentioned not wanting your users to deal with this. Can you chat for 10 minutes this week? I'm trying to learn about prod/proxy use-cases.
Hi @krrishdholakia
For example, TypingMind, which I'm using, allows me to add as many LLMs as I want that conform to the OpenAI API format.
Description
The purpose of this PR is to fix a number of issues I encountered while trying to deploy the LiteLLM Proxy in Docker, and to do some refactoring of the code.
Tests
I tested the chat completion feature with OpenAI, Azure OpenAI, Bedrock, and OpenRouter in my environment, and all of them worked.