Releases: BerriAI/litellm
v1.32.4
Full Changelog: v1.32.3...v1.32.4
🚨 Nightly Build - We noticed testing was flaky on this release
- fix(proxy/utils.py): move to batch writing db updates by @krrishdholakia in #2561
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 88 | 94.56 | 1.60 | 0.0 | 480 | 0 | 83.97 | 1119.28 |
/health/liveliness | Passed ✅ | 66 | 68.92 | 14.87 | 0.0 | 4452 | 0 | 63.54 | 1227.71 |
/health/readiness | Passed ✅ | 66 | 68.07 | 15.44 | 0.0 | 4622 | 0 | 63.49 | 1315.27 |
Aggregated | Passed ✅ | 66 | 69.80 | 31.91 | 0.0 | 9554 | 0 | 63.49 | 1315.27 |
v1.32.3
🚨 Nightly Build - We noticed testing was flaky on this release
LiteLLM v1.32.3 📈 Proxy 100+ LLMs in one format + Send logs to Datadog. Start here: https://docs.litellm.ai/docs/proxy/logging
👉 Admin UI: Bug Fix for viewing total spend on LiteLLM Proxy https://docs.litellm.ai/docs/proxy/ui
💵 New /global/spend endpoint -> returns total spend on the proxy, plus the total proxy budget if one is set
📖 Docs fix - view Pre Call hooks https://docs.litellm.ai/docs/proxy/call_hooks
📖 Docs - Using LiteLLM Proxy + Datadog for sending LLM logs (see the sketch after this list):
https://docs.litellm.ai/docs/proxy/logging#logging-proxy-inputoutput---datadog
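A minimal sketch of the new Datadog logging provider at the SDK level (the proxy sets the same callback in its config file); the `DD_API_KEY`/`DD_SITE` environment variables are assumptions based on standard Datadog client setup — see the linked docs for exact proxy-side configuration:

```python
# Hedged sketch: ship LiteLLM success logs to Datadog via the new callback.
import os
import litellm

os.environ["DD_API_KEY"] = "your-datadog-api-key"  # placeholder
os.environ["DD_SITE"] = "us5.datadoghq.com"        # placeholder Datadog site

litellm.success_callback = ["datadog"]  # log successful LLM calls to Datadog

response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hi 👋"}],
)
```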
What's Changed
- fix(caching.py): pass redis kwargs to connection pool init by @krrishdholakia in #2573
- Support multiple system message translation for bedrock claude-3 by @garfeildma in #2570
- [FEAT] DataDog Logging Provider by @ishaan-jaff in #2578
- [Docs] datadog docs + LiteLLM Proxy by @ishaan-jaff in #2579
- fix bug: registered custom prompt templates were never applied to the vllm provider by @zdsfwy in #2548
- improving non-openai tool_call prompt by @TheDiscoMole in #1527
- feat(vertex_ai.py): support gemini (vertex ai) function calling when streaming by @krrishdholakia in #2577 (see the sketch after this list)
- [FIX] UI show global spend - when proxy budget not set by @ishaan-jaff in #2581
- fix(proxy/utils.py): move to batch writing db updates by @krrishdholakia in #2561
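A hedged sketch of what the streaming Gemini function-calling change looks like through LiteLLM's OpenAI-style interface; the tool definition and model id are illustrative, not taken from the PR:

```python
# Hedged sketch: streaming function calling with Gemini on Vertex AI.
import litellm

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",  # illustrative tool
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = litellm.completion(
    model="vertex_ai/gemini-pro",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    stream=True,
)

for chunk in response:
    delta = chunk.choices[0].delta
    if getattr(delta, "tool_calls", None):  # streamed tool-call fragments
        print(delta.tool_calls)
```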
New Contributors
- @garfeildma made their first contribution in #2570
- @zdsfwy made their first contribution in #2548
- @TheDiscoMole made their first contribution in #1527
Full Changelog: v1.32.1...v1.32.3
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 87 | 95.22 | 1.48 | 0.0 | 442 | 0 | 81.41 | 1162.78 |
/health/liveliness | Passed ✅ | 62 | 65.23 | 15.60 | 0.0067 | 4669 | 2 | 59.52 | 1170.44 |
/health/readiness | Passed ✅ | 62 | 65.01 | 15.11 | 0.0 | 4523 | 0 | 59.88 | 1250.39 |
Aggregated | Passed ✅ | 62 | 66.50 | 32.18 | 0.0067 | 9634 | 2 | 59.52 | 1250.39 |
v1.32.1
What's Changed
- fix(utils.py): initial commit for aws secret manager support by @krrishdholakia in #2556
- Add functions and tools to input for langfuse by @webkonstantin in #2460 (see the sketch after this list)
- Admin UI - view team based spend by @ishaan-jaff in #2560
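A hedged sketch of the Langfuse change: with Langfuse logging enabled, the `functions` definitions passed to a call are now part of the logged input. The function definition below is illustrative; Langfuse credentials are read from the standard environment variables:

```python
# Hedged sketch: functions passed to a call now show up in Langfuse logs.
# Requires LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY in the environment.
import litellm

litellm.success_callback = ["langfuse"]

response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What's the weather in SF?"}],
    functions=[{  # illustrative function definition, now visible in Langfuse
        "name": "get_weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
        },
    }],
)
```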
New Contributors
- @webkonstantin made their first contribution in #2460
Full Changelog: v1.31.17...v1.32.1
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 100 | 107.19 | 1.44 | 0.0 | 430 | 0 | 94.88 | 798.04 |
/health/liveliness | Passed ✅ | 78 | 79.60 | 15.44 | 0.0 | 4623 | 0 | 73.79 | 1082.18 |
/health/readiness | Passed ✅ | 78 | 80.91 | 15.15 | 0.0033 | 4536 | 1 | 73.92 | 1344.06 |
Aggregated | Passed ✅ | 78 | 81.46 | 32.03 | 0.0033 | 9589 | 1 | 73.79 | 1344.06 |
v1.31.17
💵 Track LLM Spend per Team, Start here: https://docs.litellm.ai/docs/simple_proxy
🐳 New LiteLLM Helm Chart hosted on OCI GHCR: https://github.com/BerriAI/litellm/pkgs/container/litellm-helm
🚀 /health/readiness avg response time is now 93% faster
🛠️ Fix for /health/readiness returning large JSON as part of the success callback output
📖 Docs on deploying LiteLLM with Helm Chart https://docs.litellm.ai/docs/proxy/deploy#quick-start
What's Changed
- Add function call result submission support for Claude 3 models by @lazyhope in #2527 (see the sketch after this list)
- Update helm chart to accommodate recent project changes by @ShaunMaher in #2145
- [FEAT] LiteLLM Helm Chart hosted on ghcr + docs by @ishaan-jaff in #2551
- fix(proxy_server.py): write blocked user list to a db table by @krrishdholakia in #2552
- Litellm end user opt out v2 db by @krrishdholakia in #2554
- (feat) Proxy - improve health readiness perf (93% faster) by @ishaan-jaff in #2553
- (fix) /health/readiness return success callback names as (str) by @ishaan-jaff in #2557
- (fix) admin ui - order spend by date by @ishaan-jaff in #2559
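A hedged sketch of the Claude 3 function-result change: submitting a tool result back to the model using the OpenAI-style message format that LiteLLM translates for Anthropic. The tool name, call id, and arguments are illustrative:

```python
# Hedged sketch: submit a function/tool result back to a Claude 3 model.
import litellm

messages = [
    {"role": "user", "content": "What is 2134 * 4829?"},
    {
        # assistant turn that requested the tool call (id is illustrative)
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "multiply", "arguments": '{"a": 2134, "b": 4829}'},
        }],
    },
    {
        # the tool result being submitted back
        "role": "tool",
        "tool_call_id": "call_1",
        "content": "10305086",
    },
]

response = litellm.completion(model="claude-3-opus-20240229", messages=messages)
print(response.choices[0].message.content)
```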
Full Changelog: v1.31.16...v1.31.17
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 100 | 109.53 | 1.61 | 0.0 | 483 | 0 | 94.74 | 693.42 |
/health/liveliness | Passed ✅ | 78 | 80.09 | 15.03 | 0.0067 | 4499 | 2 | 74.07 | 1331.05 |
/health/readiness | Passed ✅ | 78 | 79.43 | 15.41 | 0.0 | 4612 | 0 | 73.88 | 877.44 |
Aggregated | Passed ✅ | 78 | 81.25 | 32.05 | 0.0067 | 9594 | 2 | 73.88 | 1331.05 |
v1.31.16
What's Changed
- [Docs+Fixes] Litellm helm chart use k8s 1.21 by @ishaan-jaff in #2544
- docs(langfuse): add chatlitellm section by @udit-001 in #2541
- 89% Caching improvement - Async Redis completion calls + batch redis GET requests for a given key + call type by @krrishdholakia in #2542
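For the caching item above, a minimal sketch of the Redis cache path using LiteLLM's documented `Cache` class; connection details are placeholders:

```python
# Hedged sketch: enable LiteLLM's Redis cache so repeated identical calls
# are served from Redis instead of hitting the provider again.
import litellm
from litellm.caching import Cache

litellm.cache = Cache(type="redis", host="localhost", port=6379, password="...")

# the second identical call should be answered from the Redis cache
for _ in range(2):
    response = litellm.completion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "hello"}],
    )
```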
Full Changelog: v1.31.15...v1.31.16
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 89 | 99.66 | 1.53 | 0.0 | 457 | 0 | 81.24 | 1340.20 |
/health/liveliness | Passed ✅ | 62 | 65.56 | 15.10 | 0.0033 | 4521 | 1 | 59.56 | 1363.86 |
/health/readiness | Passed ✅ | 190 | 185.87 | 15.38 | 0.0100 | 4603 | 3 | 124.74 | 1249.50 |
Aggregated | Passed ✅ | 100 | 124.98 | 32.00 | 0.0134 | 9581 | 4 | 59.56 | 1363.86 |
v1.31.15
What's Changed
- (fix) - update user error by @ishaan-jaff in #2524
- Exclude /docs to reduce Docker image size by @Hitro147 in #2534
- feat(utils.py): add native fireworks ai support by @krrishdholakia in #2535 (see the sketch after this list)
- (fix) Proxy - fix error message raised on passing invalid tokens by @ishaan-jaff in #2540
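A hedged sketch of the new native Fireworks AI route; the env-var name and model path are assumptions — check the LiteLLM provider docs and the Fireworks model catalog for exact values:

```python
# Hedged sketch: calling a Fireworks AI model through the new native route.
import os
import litellm

os.environ["FIREWORKS_AI_API_KEY"] = "your-fireworks-key"  # placeholder

response = litellm.completion(
    model="fireworks_ai/accounts/fireworks/models/llama-v2-7b-chat",
    messages=[{"role": "user", "content": "hello from fireworks"}],
)
```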
Full Changelog: v1.31.14...v1.31.15
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 51 | 56.56 | 1.59 | 0.0 | 477 | 0 | 45.27 | 622.42 |
/health/liveliness | Passed ✅ | 25 | 27.46 | 15.54 | 0.0 | 4652 | 0 | 23.39 | 1113.61 |
/health/readiness | Passed ✅ | 87 | 93.05 | 15.28 | 0.0 | 4576 | 0 | 82.33 | 1043.95 |
Aggregated | Passed ✅ | 52 | 59.82 | 32.41 | 0.0 | 9705 | 0 | 23.39 | 1113.61 |
v1.31.14
What's Changed
- (feat) add groq/gemma-7b-it by @snekkenull in #2529
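A quick usage sketch for the newly added Groq model id; the API key is read from Groq's standard environment variable:

```python
# Hedged sketch: call the newly added groq/gemma-7b-it model.
import os
import litellm

os.environ["GROQ_API_KEY"] = "your-groq-key"  # placeholder

response = litellm.completion(
    model="groq/gemma-7b-it",
    messages=[{"role": "user", "content": "hi"}],
)
print(response.choices[0].message.content)
```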
New Contributors
- @snekkenull made their first contribution in #2529
Full Changelog: v1.31.13...v1.31.14
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 88 | 97.87 | 1.52 | 0.0 | 456 | 0 | 80.48 | 1042.19 |
/health/liveliness | Passed ✅ | 62 | 64.17 | 15.18 | 0.0 | 4542 | 0 | 59.45 | 1284.67 |
/health/readiness | Passed ✅ | 180 | 180.33 | 14.97 | 0.0 | 4479 | 0 | 123.18 | 1374.43 |
Aggregated | Passed ✅ | 90 | 120.69 | 31.67 | 0.0 | 9477 | 0 | 59.45 | 1374.43 |
v1.31.13
What's Changed
- fix(utils.py): move to using `litellm.modify_params` to enable max output token trimming fix by @krrishdholakia in #2520 (see the sketch after this list)
- (fix) multiplatform docker db builds by @ishaan-jaff in #2525
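A hedged sketch of the `litellm.modify_params` flag referenced above: when enabled, LiteLLM may adjust request parameters (for example, trimming an oversized `max_tokens`) to fit provider limits instead of erroring. The model id is illustrative:

```python
# Hedged sketch: allow LiteLLM to modify provider params when needed.
import litellm

litellm.modify_params = True

response = litellm.completion(
    model="claude-3-opus-20240229",  # illustrative model
    messages=[{"role": "user", "content": "hello"}],
    max_tokens=1_000_000,  # deliberately oversized; may be trimmed when modify_params is on
)
```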
Full Changelog: v1.31.12...v1.31.13
v1.31.12
What's Changed
- (fix) run prisma generate in default dockerfile by @ishaan-jaff in #2519
Full Changelog: v1.31.10...v1.31.12
v1.31.10
Full Changelog: v1.31.9...v1.31.10