Releases: BerriAI/litellm

v1.30.6

09 Mar 22:16
e10991e

What's Changed

  • [Docs] Deploying litellm - litellm, litellm-database, litellm with redis by @ishaan-jaff in #2423
  • feat(helm-chart): redis as cache managed by chart by @debdutdeb in #2420

Full Changelog: v1.30.5...v1.30.6

v1.30.5

09 Mar 08:05

Full Changelog: v1.30.4...v1.30.5

v1.30.4

09 Mar 06:29

1. Incognito Requests - Don't log anything - docs: https://docs.litellm.ai/docs/proxy/enterprise#incognito-requests---dont-log-anything

When no-log=True is set, the request is not logged to any callbacks and no server logs are written by litellm.

import openai
client = openai.OpenAI(
    api_key="anything",            # proxy api-key
    base_url="http://0.0.0.0:8000" # litellm proxy 
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages = [
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
    extra_body={
        "no-log": True
    }
)

print(response)

2. Allow users to pass messages.name for claude-3 and perplexity

Note: before this PR, both providers raised errors when the name param was passed.

LiteLLM SDK

import litellm

response = litellm.completion(
    model="claude-3-opus-20240229",
    messages=[
        {"role": "user", "content": "Hi gm!", "name": "ishaan"},
    ],
)

LiteLLM Proxy Server

import openai
client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:8000"
)

response = client.chat.completions.create(
    model="claude-3-opus-20240229",
    messages=[
        {"role": "user", "content": "Hi gm!", "name": "ishaan"},
    ],
)

print(response)

3. When running with run_gunicorn, use cpu_count to select the optimal num_workers
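
A rough sketch of the idea, assuming the common gunicorn heuristic of (2 * cores) + 1 workers (the exact formula run_gunicorn uses may differ):

import multiprocessing

# Hedged sketch: derive a gunicorn worker count from the machine's CPU count.
# (2 * cores) + 1 is the usual gunicorn recommendation; litellm's exact
# formula may differ.
def optimal_num_workers() -> int:
    cores = multiprocessing.cpu_count()
    return (2 * cores) + 1

print(optimal_num_workers())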

4. AzureOpenAI - Pass api_version to litellm proxy per request

Usage - sending a request to litellm proxy

from openai import AzureOpenAI

client = AzureOpenAI(
    api_key="dummy",
    # I want to use a specific api_version, other than the default 2023-07-01-preview
    api_version="2023-05-15",
    # OpenAI Proxy Endpoint
    azure_endpoint="https://openai-proxy.domain.com",
)

response = client.chat.completions.create(
    model="gpt-35-turbo-16k-qt",
    messages=[
        {"role": "user", "content": "Some content"}
    ],
)


Full Changelog: v1.30.3...v1.30.4

v1.30.3

08 Mar 16:41

Full Changelog: v1.30.2...v1.30.3

v1.30.2

08 Mar 05:14

🚀 LiteLLM Proxy - Proxy 100+ LLMs, Set Budgets and Auto-Scale with the LiteLLM CloudFormation Stack 👉 Start here: https://docs.litellm.ai/docs/proxy/deploy#aws-cloud-formation-stack

⚡️ Load Balancing - View Metrics about selected deployments in server logs

🔎 Proxy - better debug logs for Prisma / Slack alerts

📖 Docs: setting load balancing config https://docs.litellm.ai/docs/proxy/configs (a rough sketch follows below)

⭐️ PR for using cross account ARN with Bedrock, Sagemaker: #2179
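
For illustration, a rough sketch of load balancing two deployments behind one model name with the litellm Router in the Python SDK; the proxy reads the equivalent model_list from its YAML config as described in the docs linked above. All keys, endpoints, and deployment names here are placeholders.

from litellm import Router

# Hedged sketch: two deployments share the "gpt-3.5-turbo" alias, and the
# Router load balances requests between them. Credentials are placeholders.
model_list = [
    {
        "model_name": "gpt-3.5-turbo",  # alias callers use
        "litellm_params": {
            "model": "azure/gpt-35-turbo",  # placeholder Azure deployment
            "api_key": "placeholder-azure-key",
            "api_base": "https://example-endpoint.openai.azure.com",
            "api_version": "2023-07-01-preview",
        },
    },
    {
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {
            "model": "gpt-3.5-turbo",  # placeholder OpenAI deployment
            "api_key": "placeholder-openai-key",
        },
    },
]

router = Router(model_list=model_list)

response = router.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hi gm!"}],
)
print(response)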

https://github.com/BerriAI/litellm/releases/tag/v1.30.2


Full Changelog: v1.30.1...v1.30.2

v1.30.1

07 Mar 17:07

  • docs(team_based_routing.md): add docs on team based routing by @krrishdholakia
  • fix(proxy_server.py): fix model alias map + add back testing by @krrishdholakia

Full Changelog: v1.30.0...v1.30.1

v1.30.0

07 Mar 06:31

Full Changelog: v1.29.7...v1.30.0

v1.29.7

07 Mar 05:02

⚡️ LiteLLM Proxy - 100+ LLMs, Track Number of Requests, Avg Latency Per Model Deployment


🛠️ High Traffic Fixes - Fix for DB connection limit hits when model fallbacks occur

🚀 High Traffic Fixes - /embedding - fix for the "Dictionary changed size during iteration" bug

⚡️ High Traffic Fixes - Switched off --detailed_debug in the default Dockerfile. Users now need to opt in to viewing --detailed_debug logs. (This led to a 5% decrease in avg latency across 1K concurrent calls.)

📖 Docs - Fixes for /user/new on the LiteLLM Proxy Swagger, showing how to set tpm/rpm limits per user (see the example below) https://docs.litellm.ai/docs/proxy/virtual_keys#usernew

⭐️ Admin UI - separate latency, num requests graphs for model deployments https://docs.litellm.ai/docs/proxy/ui
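
For the /user/new docs above, a hedged example of creating a user with per-user tpm/rpm limits on the proxy. The endpoint path and Authorization header follow the proxy docs; the exact field names ("tpm_limit", "rpm_limit") and the master key value are assumptions for illustration.

import requests

# Hedged sketch: POST to the proxy's /user/new endpoint with per-user limits.
# Field names are taken from the linked docs and should be treated as
# assumptions; "sk-1234" stands in for your proxy master key.
response = requests.post(
    "http://0.0.0.0:8000/user/new",
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "user_id": "new-user@example.com",  # hypothetical user id
        "tpm_limit": 100000,                # tokens per minute for this user
        "rpm_limit": 1000,                  # requests per minute for this user
    },
)
print(response.json())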

Full Changelog: v1.29.4...v1.29.7

v1.29.5

06 Mar 05:23

Full Changelog: v1.29.3...v1.29.5

v1.29.4

06 Mar 04:56

Full Changelog: v1.29.3...v1.29.4