Latest commit: ishaan-jaff · (fea) ui - see delete confirmation before deleting · 5e87932 · Feb 9, 2024 · 7,127 Commits

Name	Last commit message	Last commit date
.circleci	Merge pull request #1829 from BerriAI/litellm_add_semantic_cache	Feb 6, 2024
.github	Merge pull request #1602 from ShaunMaher/add_helm_chart	Feb 5, 2024
ci_cd	(fix) pre commit hook to sync backup context_window mapping	Feb 5, 2024
cookbook	(cookbook) load test litellm router	Feb 8, 2024
deploy/charts/litellm-helm	Authored a Helm chart for LiteLLM. Added GitHub workflows/actions to …	Jan 25, 2024
dist	fix: syncing changes	Jan 12, 2024
docker	Revert "build(Dockerfile): move prisma build to dockerfile"	Jan 6, 2024
docs/my-website	Add support for AWS credentials from profile file	Feb 8, 2024
litellm	Merge pull request #1898 from BerriAI/litellm_langfuse_error_logging	Feb 9, 2024
tests	fix(proxy_cli.py-&&-proxy_server.py): bump reset budget intervals and…	Feb 7, 2024
ui	(fea) ui - see delete confirmation before deleting	Feb 9, 2024
.env.example	feat: added support for OPENAI_API_BASE	Aug 28, 2023
.flake8	chore: list all ignored flake8 rules explicit	Dec 23, 2023
.gitattributes	ignore ipynbs	Aug 31, 2023
.gitignore	fix(utils.py): support together ai function calling	Feb 5, 2024
.pre-commit-config.yaml	(feat) add pre-commit hook to check model_prices_and_context_window.j…	Feb 5, 2024
Dockerfile	(fix) dockerfile for semantic caching	Feb 7, 2024
Dockerfile.alpine	(fix) alpine Docker image	Jan 10, 2024
Dockerfile.database	(fix) dockerfile for semantic caching	Feb 7, 2024
LICENSE	Initial commit	Jul 27, 2023
README.md	Update README.md	Feb 8, 2024
docker-compose.yml	(ci/cd) docker compose up with ui	Jan 26, 2024
entrypoint.sh	(ci/cd) set litellm as entrypoint	Jan 10, 2024
model_prices_and_context_window.json	(feat) add azure/gpt-4-0125-preview	Feb 8, 2024
mypy.ini	fix(google_kms.py): support enums for key management system	Dec 27, 2023
poetry.lock	(chore) bump poetry lock	Jan 26, 2024
proxy_server_config.yaml	feat(utils.py): support cost tracking for openai/azure image gen models	Feb 4, 2024
pyproject.toml	bump: version 1.23.2 → 1.23.3	Feb 8, 2024
requirements.txt	(fix) redisvl requirements.txt issue	Feb 6, 2024
retry_push.sh	build(Dockerfile): moves prisma logic to dockerfile	Jan 6, 2024
schema.prisma	build(schema.prisma): support direct url on prisma schema	Feb 9, 2024
template.yaml	Use -function for naming.	Nov 23, 2023

🚅 LiteLLM

Call all LLM APIs using the OpenAI format [Bedrock, Huggingface, VertexAI, TogetherAI, Azure, OpenAI, etc.]

OpenAI Proxy Server | Enterprise Support

LiteLLM manages:

Translating inputs to the provider's completion, embedding, and image_generation endpoints
Consistent output: text responses are always available at ['choices'][0]['message']['content']
Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - Router
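
The retry/fallback idea in the last item can be sketched without any provider SDK. This is a minimal illustration and not LiteLLM's Router implementation: `call_with_fallback`, `flaky_azure`, and `stable_openai` are hypothetical names, and the two stub functions stand in for real provider calls.

```python
from typing import Callable, List


def call_with_fallback(
    deployments: List[Callable[[list], dict]],
    messages: list,
    retries_per_deployment: int = 2,
) -> dict:
    """Try each deployment in order, retrying it before moving to the next."""
    last_error = None
    for deployment in deployments:
        for _ in range(retries_per_deployment):
            try:
                return deployment(messages)
            except Exception as err:  # a real client would catch narrower errors
                last_error = err
    raise RuntimeError("all deployments failed") from last_error


# Hypothetical stand-ins for provider calls (no network access)
def flaky_azure(messages: list) -> dict:
    raise TimeoutError("azure deployment timed out")


def stable_openai(messages: list) -> dict:
    return {"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}


messages = [{"role": "user", "content": "Hello, how are you?"}]
response = call_with_fallback([flaky_azure, stable_openai], messages)
print(response["choices"][0]["message"]["content"])
```

Because every deployment returns the same OpenAI-style shape, the caller reads the text the same way regardless of which deployment answered.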

Jump to OpenAI Proxy Docs
Jump to Supported LLM Providers

Usage (Docs)

Important

LiteLLM v1.0.0 now requires openai>=1.0.0. Migration guide here

pip install litellm

from litellm import completion
import os

## set ENV variables 
os.environ["OPENAI_API_KEY"] = "your-openai-key" 
os.environ["COHERE_API_KEY"] = "your-cohere-key" 

messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion(model="command-nightly", messages=messages)
print(response)
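
Since text responses are available at `['choices'][0]['message']['content']` whichever provider served the call, one accessor can cover both calls above. A minimal sketch using hand-written dictionaries in place of live responses; `extract_text` and the sample payloads are hypothetical, not real model output.

```python
def extract_text(response) -> str:
    """Return the assistant text from an OpenAI-format response."""
    return response["choices"][0]["message"]["content"]


# Hand-written stand-ins shaped like OpenAI-format responses (not real output)
openai_style = {"choices": [{"message": {"role": "assistant", "content": "I'm well, thanks!"}}]}
cohere_style = {"choices": [{"message": {"role": "assistant", "content": "Doing great."}}]}

for resp in (openai_style, cohere_style):
    print(extract_text(resp))
```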

Async (Docs)

from litellm import acompletion
import asyncio

async def test_get_response():
    user_message = "Hello, how are you?"
    messages = [{"content": user_message, "role": "user"}]
    response = await acompletion(model="gpt-3.5-turbo", messages=messages)
    return response

response = asyncio.run(test_get_response())
print(response)
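
One reason to use the async interface is to issue several requests concurrently. The sketch below uses `asyncio.gather`; `fake_acompletion` is a hypothetical offline stand-in for `acompletion` so the example runs without API keys, and `ask_many` is likewise an illustrative name.

```python
import asyncio


async def fake_acompletion(model: str, messages: list) -> dict:
    """Hypothetical stub: mimics the shape of an OpenAI-format response."""
    await asyncio.sleep(0.01)  # simulate network latency
    text = f"{model} received: {messages[-1]['content']}"
    return {"choices": [{"message": {"role": "assistant", "content": text}}]}


async def ask_many(prompts: list) -> list:
    tasks = [
        fake_acompletion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": p}])
        for p in prompts
    ]
    # gather runs the coroutines concurrently and preserves input order
    responses = await asyncio.gather(*tasks)
    return [r["choices"][0]["message"]["content"] for r in responses]


answers = asyncio.run(ask_many(["Hello", "How are you?"]))
print(answers)
```

Swapping the stub for `acompletion` should leave the structure of `ask_many` unchanged, assuming valid credentials are set.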

Streaming (Docs)

LiteLLM supports streaming the model response back. Pass stream=True to get a streaming iterator in response.
Streaming is supported for all models (Bedrock, Huggingface, TogetherAI, Azure, OpenAI, etc.).

from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for part in response:
    print(part.choices[0].delta.content or "")

# claude 2
response = completion('claude-2', messages, stream=True)
for part in response:
    print(part.choices[0].delta.content or "")
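
The loops above print each delta as it arrives. To rebuild the full message, the deltas can be concatenated. A minimal sketch, assuming chunks expose `choices[0].delta.content` as in the loops above; `make_chunk` and `collect_stream` are hypothetical helpers, and the fake stream replaces a live response.

```python
from types import SimpleNamespace


def make_chunk(text):
    """Hypothetical stub shaped like a streaming chunk (choices[0].delta.content)."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


def collect_stream(stream) -> str:
    """Concatenate the text deltas of a streaming response."""
    parts = []
    for part in stream:
        # content can be None (for example on the final chunk), so fall back to ""
        parts.append(part.choices[0].delta.content or "")
    return "".join(parts)


fake_stream = [make_chunk("Hel"), make_chunk("lo"), make_chunk(None)]
full_text = collect_stream(fake_stream)
print(full_text)
```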

Logging Observability (Docs)

LiteLLM exposes predefined callbacks to send data to Langfuse, DynamoDB, S3 buckets, LLMonitor, Helicone, Promptlayer, Traceloop, and Slack.

import os

import litellm
from litellm import completion

## set env variables for logging tools
os.environ["LANGFUSE_PUBLIC_KEY"] = ""
os.environ["LANGFUSE_SECRET_KEY"] = ""
os.environ["LLMONITOR_APP_ID"] = "your-llmonitor-app-id"

os.environ["OPENAI_API_KEY"] = "your-openai-key"

# set callbacks
litellm.success_callback = ["langfuse", "llmonitor"] # log input/output to langfuse, llmonitor

# openai call
response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])

OpenAI Proxy - (Docs)

Track spend across multiple projects/people

The proxy provides:

📖 Proxy Endpoints - Swagger Docs

Quick Start Proxy - CLI

pip install 'litellm[proxy]'

Step 1: Start litellm proxy

$ litellm --model huggingface/bigcode/starcoder

#INFO: Proxy running on http://0.0.0.0:8000

Step 2: Make ChatCompletions Request to Proxy

import openai # openai v1.0.0+
client = openai.OpenAI(api_key="anything",base_url="http://0.0.0.0:8000") # set proxy to base_url
# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="gpt-3.5-turbo", messages = [
    {
        "role": "user",
        "content": "this is a test request, write a short poem"
    }
])

print(response)
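
The same request can be built without the OpenAI client. A sketch with the standard library, under the assumption that the proxy serves the `/chat/completions` path the OpenAI client appends to `base_url`; `build_chat_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
import urllib.request

PROXY_URL = "http://0.0.0.0:8000"  # address from the proxy startup log above


def build_chat_request(content: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-format chat request for the proxy."""
    payload = {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        url=f"{PROXY_URL}/chat/completions",  # assumed path, see lead-in
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": "Bearer anything"},
        method="POST",
    )


req = build_chat_request("this is a test request, write a short poem")
print(req.get_method(), req.full_url)
# With the proxy running, send it with: urllib.request.urlopen(req)
```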

Proxy Key Management (Docs)

Track spend, set budgets, and create virtual keys for the proxy with POST /key/generate.

Request

curl 'http://0.0.0.0:8000/key/generate' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["gpt-3.5-turbo", "gpt-4", "claude-2"], "duration": "20m","metadata": {"user": "ishaan@berri.ai", "team": "core-infra"}}'

Expected Response

{
    "key": "sk-kdEXbIqZRwEeEiHwdg7sFA", # Bearer token
    "expires": "2023-11-19T01:38:25.838000+00:00" # datetime object
}
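
For callers who prefer Python over curl, the request and response above might be handled as follows. This is a sketch: `build_key_request` and `parse_key_response` are hypothetical helpers, the request is only constructed and not sent, and the sample response is copied from the expected response above with the comments removed so that it is valid JSON.

```python
import json
import urllib.request
from datetime import datetime


def build_key_request(master_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST /key/generate request shown above."""
    body = {
        "models": ["gpt-3.5-turbo", "gpt-4", "claude-2"],
        "duration": "20m",
        "metadata": {"user": "ishaan@berri.ai", "team": "core-infra"},
    }
    return urllib.request.Request(
        url="http://0.0.0.0:8000/key/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {master_key}", "Content-Type": "application/json"},
        method="POST",
    )


def parse_key_response(raw: str):
    """Return the virtual key and its expiry as a timezone-aware datetime."""
    data = json.loads(raw)
    return data["key"], datetime.fromisoformat(data["expires"])


req = build_key_request("sk-1234")
sample = '{"key": "sk-kdEXbIqZRwEeEiHwdg7sFA", "expires": "2023-11-19T01:38:25.838000+00:00"}'
key, expires = parse_key_response(sample)
print(key, expires.isoformat())
```

The returned key is then used as the Bearer token on later requests to the proxy.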

[Beta] Proxy UI (Docs)

A UI to create keys, track spend per key

Code: https://github.com/BerriAI/litellm/tree/main/ui

Supported Providers (Docs)

Provider	Completion	Streaming	Async Completion	Async Streaming	Async Embedding	Async Image Generation
openai	✅	✅	✅	✅	✅	✅
azure	✅	✅	✅	✅	✅	✅
aws - sagemaker	✅	✅	✅	✅	✅
aws - bedrock	✅	✅	✅	✅	✅
google - vertex_ai [Gemini]	✅	✅	✅	✅
google - palm	✅	✅	✅	✅
google AI Studio - gemini	✅		✅
mistral ai api	✅	✅	✅	✅	✅
cloudflare AI Workers	✅	✅	✅	✅
cohere	✅	✅	✅	✅	✅
anthropic	✅	✅	✅	✅
huggingface	✅	✅	✅	✅	✅
replicate	✅	✅	✅	✅
together_ai	✅	✅	✅	✅
openrouter	✅	✅	✅	✅
ai21	✅	✅	✅	✅
baseten	✅	✅	✅	✅
vllm	✅	✅	✅	✅
nlp_cloud	✅	✅	✅	✅
aleph alpha	✅	✅	✅	✅
petals	✅	✅	✅	✅
ollama	✅	✅	✅	✅
deepinfra	✅	✅	✅	✅
perplexity-ai	✅	✅	✅	✅
anyscale	✅	✅	✅	✅
voyage ai					✅
xinference [Xorbits Inference]					✅

Read the Docs

Contributing

To contribute: Clone the repo locally -> Make a change -> Submit a PR with the change.

Here's how to modify the repo locally:

Step 1: Clone the repo

git clone https://github.com/BerriAI/litellm.git

Step 2: Navigate into the project, and install dependencies:

cd litellm
poetry install

Step 3: Test your change:

cd litellm/tests # pwd: Documents/litellm/litellm/tests
poetry run flake8
poetry run pytest .

Step 4: Submit a PR with your changes! 🚀

Push your fork to your GitHub repo
Submit a PR from there

Support / talk with founders

Schedule Demo 👋
Community Discord 💭
Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
Our emails ✉️ ishaan@berri.ai / krrish@berri.ai

Why did we build this

Need for simplicity: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI and Cohere.
