Releases: BerriAI/litellm
v1.16.15
litellm 1.16.15
What's Changed
- Use S3 buckets for caching /chat/completion and embedding responses
  - Proxy caching: https://docs.litellm.ai/docs/proxy/caching
  - Caching with litellm.completion (client-side sketch below): https://docs.litellm.ai/docs/caching/redis_cache
- litellm.completion_cost() - support for cost calculation on embedding responses (Azure embeddings and text-embedding-ada-002-v2) @jeromeroussin
import asyncio
import litellm

async def _test():
    response = await litellm.aembedding(
        model="azure/azure-embedding-model",
        input=["good morning from litellm", "gm"],
    )
    print(response)
    return response

response = asyncio.run(_test())
cost = litellm.completion_cost(completion_response=response)
- litellm.completion_cost() now raises exceptions instead of swallowing them @jeromeroussin
- Improved token counting for Azure streaming responses @langgg0511 #1304
- set os.environ/ variables for litellm proxy cache @Manouchehri
model_list:
- model_name: gpt-3.5-turbo
litellm_params:
model: gpt-3.5-turbo
- model_name: text-embedding-ada-002
litellm_params:
model: text-embedding-ada-002
litellm_settings:
set_verbose: True
cache: True # set cache responses to True
cache_params: # set cache params for s3
type: s3
s3_bucket_name: cache-bucket-litellm # AWS Bucket Name for S3
s3_region_name: us-west-2 # AWS Region Name for S3
      s3_aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID # use os.environ/<variable name> to pass environment variables. This is the AWS Access Key ID for S3
s3_aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY # AWS Secret Access Key for S3
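The same S3 cache can also be enabled client-side for litellm.completion calls (the caching bullet above). A minimal sketch, assuming the Cache interface from the caching docs; the bucket name and region are placeholders:

import litellm
from litellm.caching import Cache

# enable S3-backed response caching for completion/embedding calls
litellm.cache = Cache(
    type="s3",
    s3_bucket_name="cache-bucket-litellm",  # placeholder bucket
    s3_region_name="us-west-2",
)

# repeated identical requests should now be served from the cache
response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "good morning from litellm"}],
)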
- build(Dockerfile): moves prisma logic to dockerfile by @krrishdholakia in #1342
Full Changelog: 1.16.14...v1.16.15
1.16.14
Full Changelog: v1.16.13...1.16.14
1.16.12
New providers
Xinference Embeddings: https://docs.litellm.ai/docs/providers/xinference
Voyage AI: https://docs.litellm.ai/docs/providers/voyage
Cloudflare AI workers: https://docs.litellm.ai/docs/providers/cloudflare_workers
Fixes:
AWS region name error when passing user bedrock client: #1292
Azure OpenAI models - use correct context window in model_prices_and_context_window.json
Fixes for Azure OpenAI + Streaming - counting prompt tokens correctly: #1264
What's Changed
- Update mistral.md by @nanowell in #1157
- Use current Git folder for building Dockerfile by @Manouchehri in #1076
- (fix) curl example on proxy/quick_start by @ku-suke in #1163
- clarify the need to set an exporter by @nirga in #1162
- Update mistral.md by @ericmjl in #1174
- doc: updated langfuse ver 1.14 in pip install cmd by @speedyankur in #1178
- Fix for issue that occurred when proxying to ollama by @clevcode in #1168
- Sample code to prevent logging API key in callback to Slack by @navidre in #1185
- Add partial support of VertexAI safety settings by @neubig in #1190
- docker build and push on release by @Rested in #1197
- Add a default for safety settings in vertex AI by @neubig in #1199
- Make vertex ai work with generate_content by @neubig in #1213
- fix: success_callback logic for cost_tracking by @sihyeonn in #1211
- Add aws_bedrock_runtime_endpoint support by @Manouchehri in #1203
- fix least_busy router by updating min_traffic by @AllentDan in #1195
- Improve langfuse integration by @maxdeichmann in #1183
- usage_based_routing_fix by @sumanth13131 in #1182
- add some GitHub workflows for flake8 and add black dependency by @bufferoverflow in #1223
- fix: helicone logging by @evantancy in #1249
- update anyscale price link by @prateeksachan in #1246
- Bump aiohttp from 3.8.4 to 3.9.0 by @dependabot in #1180
- updated oobabooga to new api and support for embeddings by @danikhan632 in #1248
- add support for mistral json mode via anyscale by @marmikcfc in #1275
- Adds support for Vertex AI Unicorn by @AshGreh in #1277
- fix typos & add missing names for azure models by @fcakyon in #1290
- fix(proxy_server.py) Check when '_hidden_params' is None by @asedmammad in #1300
New Contributors
- @nanowell made their first contribution in #1157
- @ku-suke made their first contribution in #1163
- @ericmjl made their first contribution in #1174
- @speedyankur made their first contribution in #1178
- @clevcode made their first contribution in #1168
- @navidre made their first contribution in #1185
- @neubig made their first contribution in #1190
- @Rested made their first contribution in #1197
- @sihyeonn made their first contribution in #1211
- @AllentDan made their first contribution in #1195
- @sumanth13131 made their first contribution in #1182
- @bufferoverflow made their first contribution in #1223
- @evantancy made their first contribution in #1249
- @prateeksachan made their first contribution in #1246
- @danikhan632 made their first contribution in #1248
- @marmikcfc made their first contribution in #1275
- @AshGreh made their first contribution in #1277
- @fcakyon made their first contribution in #1290
- @asedmammad made their first contribution in #1300
Full Changelog: v1.15.0...1.16.12
v1.15.0
What's Changed
LiteLLM Proxy now maps exceptions for 100+ LLMs to the OpenAI format https://docs.litellm.ai/docs/proxy/quick_start
🧨 Log all LLM input/output to @dynamodb - set litellm.success_callback = ["dynamodb"] (sketch below)
https://docs.litellm.ai/docs/proxy/logging#logging-proxy-inputoutput---dynamodb
⭐️ Support for @MistralAI API, Gemini PRO
🔎 Set Aliases for model groups on LiteLLM Proxy
🔎 Exception mapping for openai.NotFoundError live now + testing for exception mapping on proxy added to LiteLLM ci/cd https://docs.litellm.ai/docs/exception_mapping
⚙️ Fixes for async + streaming caching https://docs.litellm.ai/docs/proxy/caching
👉 Support for using Async logging with @langfuse live on proxy
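A minimal sketch of the DynamoDB logging hook called out above; assumes AWS credentials for DynamoDB are already set in the environment:

import litellm

# log all LLM input/output to DynamoDB via the built-in success callback
litellm.success_callback = ["dynamodb"]

response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hi"}],
)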
AI Generated Release Notes
- Enable setting default model value for LiteLLM, Chat, Completions by @estill01 in #985
- fix replicate system prompt: forgot to add **optional_params to input data by @nbaldwin98 in #1080
- Update factory.py to fix issue when calling from write-the -> langchain -> litellm served ollama by @James4Ever0 in #1054
- Update Dockerfile to preinstall Prisma CLI by @Manouchehri in #1039
- build(deps): bump aiohttp from 3.8.6 to 3.9.0 by @dependabot in #937
- multistage docker build by @wallies in #995
- fix: traceloop links by @nirga in #1123
- refactor: add CustomStreamWrapper return type for completion by @Undertone0809 in #1112
- fix langfuse tests by @maxdeichmann in #1097
- Fix #1119, no content when streaming. by @emsi in #1122
- docs(projects): add Docq to 'projects built on..' section by @janaka in #1142
- docs(projects): add Docq.AI to sidebar nav by @janaka in #1143
New Contributors
- @James4Ever0 made their first contribution in #1054
- @wallies made their first contribution in #995
- @maxdeichmann made their first contribution in #1097
- @emsi made their first contribution in #1122
- @janaka made their first contribution in #1142
Full Changelog: v1.11.1...v1.15.0
v1.11.1
Proxy
- Bug fix for non-OpenAI LLMs on the proxy
- Major stability improvements and fixes + added test cases for the proxy
- Async success/failure loggers
- Support for using custom loggers with aembedding() (sketch below)
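A minimal sketch of an async custom logger used with aembedding(), assuming the (kwargs, response_obj, start_time, end_time) callback signature from the custom-callback docs; the logger body is illustrative:

import asyncio
import litellm

# async success logger - litellm awaits this after each successful call
async def log_success(kwargs, response_obj, start_time, end_time):
    print(f"model={kwargs.get('model')} latency={end_time - start_time}")

litellm.success_callback = [log_success]

async def main():
    await litellm.aembedding(
        model="text-embedding-ada-002",
        input=["hello from litellm"],
    )

asyncio.run(main())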
What's Changed
- feat: add docker compose file and running guide by @geekyayush in #993
- (feat) Speedup health endpoint by @PSU3D0 in #1023
- (pricing) Add Claude v2.1 for Bedrock by @Manouchehri in #1042
New Contributors
- @geekyayush made their first contribution in #993
Full Changelog: v1.10.4...v1.11.1
v1.10.4
Note: the proxy server on v1.10.4 has a bug for non-OpenAI LLMs - fixed in v1.10.11
Updates
Proxy Server
- Use custom callbacks on the proxy https://docs.litellm.ai/docs/proxy/logging
- Set timeout and stream_timeout per model (config sketch after this list): https://docs.litellm.ai/docs/proxy/load_balancing#custom-timeouts-stream-timeouts---per-model
- Stability: added testing for reading config.yaml on the proxy
- NEW /model/new + /model/info endpoints - add new models and get model info without restarting the proxy
- Custom user auth - #898 (comment)
- Key security -> keys are now stored only as hashes in the DB
- user id accepted + passed to OpenAI/Azure
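A minimal config.yaml sketch for the per-model timeouts above, following the load-balancing docs; the values are placeholders:

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
      timeout: 120         # hard timeout (seconds) for requests to this deployment
      stream_timeout: 60   # separate timeout (seconds) for streaming requests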
litellm Package
- Specify kwargs for the Redis cache (sketch after this list) 9ba1765
- Fixes for SageMaker + PaLM streaming
- Support for async custom callbacks - https://docs.litellm.ai/docs/observability/custom_callback#async-callback-functions
- Major improvements to stream chunk builder - support for parallel tool calling, system fingerprints, etc.
- Fixes for azure / openai streaming (return complete response object)
- Support for loading keys from azure key vault - https://docs.litellm.ai/docs/secret#azure-key-vault
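A minimal sketch of passing kwargs through to the Redis cache (the Redis bullet above), assuming the Cache interface from the caching docs; host, port, and password are placeholders:

import litellm
from litellm.caching import Cache

# kwargs here (host, port, password, ssl, ...) are forwarded to the redis client
litellm.cache = Cache(
    type="redis",
    host="localhost",
    port=6379,
    password="my-password",
)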
What's Changed
- docs: adds gpt-3.5-turbo-1106 in supported models by @rishabgit in #958
- (feat) Allow installing proxy dependencies explicitly with pip install litellm[proxy] by @PSU3D0 in #966
- Mention Neon as a database option in docs by @Manouchehri in #977
- fix system prompts for replicate by @nbaldwin98 in #970
New Contributors
- @rishabgit made their first contribution in #958
- @PSU3D0 made their first contribution in #966
- @nbaldwin98 made their first contribution in #970
Full Changelog: v1.7.11...v1.10.4
v1.7.11
💥 LiteLLM Router + Proxy handles 500+ requests/second
💥 LiteLLM Proxy now handles 500+ requests/second: load balance Azure + OpenAI deployments, track spend per user 💥
Try it here: https://docs.litellm.ai/docs/simple_proxy
🔑 Support for AZURE_OPENAI_API_KEY on Azure (h/t @solyarisoftware): https://docs.litellm.ai/docs/providers/azure
⚡️ LiteLLM Router can now handle 20% more throughput https://docs.litellm.ai/docs/routing
📖 Improvement to litellm debugging docs (h/t @solyarisoftware): https://docs.litellm.ai/docs/debugging/local_debugging
Full Changelog: v1.7.1...v1.7.11
v1.7.1
What's Changed
- 🚨 LiteLLM Proxy uses async completion/embedding calls from this release onwards - this led to 30x more throughput for embedding/completion calls
New Contributors
- @guspan-tanadi made their first contribution in #851
- @Manouchehri made their first contribution in #880
- @maqsoodshaik made their first contribution in #884
- @okotek made their first contribution in #885
- @kumaranvpl made their first contribution in #902
Full Changelog: v1.1.0...v1.7.1
v1.1.0
What's Changed
🚨 Breaking change: v1.1.0 is only compatible with OpenAI Python 1.1.0+
Migration Guide: https://docs.litellm.ai/docs/migration
Key changes in v1.1.0
- Requires openai>=1.0.0
- openai.InvalidRequestError → openai.BadRequestError
- openai.ServiceUnavailableError → openai.APIStatusError (handling sketch below)
- NEW litellm client, allowing users to pass api_key: litellm.Litellm(api_key="sk-123")
- Response objects now inherit from BaseModel (prev. OpenAIObject)
- NEW default exception - APIConnectionError (prev. APIError)
- litellm.get_max_tokens() now returns an int, not a dict
max_tokens = litellm.get_max_tokens("gpt-3.5-turbo")  # returns an int, not a dict
assert max_tokens == 4097
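A sketch of handling the renamed exceptions after migrating, assuming litellm raises the mapped OpenAI v1 exception types per the migration guide:

import openai
import litellm

try:
    litellm.completion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "hi"}],
    )
except openai.BadRequestError as e:  # was openai.InvalidRequestError pre-v1
    print(f"bad request: {e}")
except openai.APIConnectionError as e:  # the new default exception (prev. APIError)
    print(f"connection error: {e}")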
Other updates
- Update function calling docs by @kevinjyee in #673
- Fix data being overwritten by @mc-marcocheng in #679
- Updating the docker image builder for GitHub Action by @coconut49 in #678
- fix: bugs in traceloop integration by @nirga in #647
- Router aembedding by @mc-marcocheng in #691
- support release and debug params for langfuse client by @SlapDrone in #695
- docs error ==> openai.error instead of openai.errors by @josearangos in #700
- refactor Contributing to documentation steps by @josearangos in #713
- Fix Router.set_model_list & Avoid overwriting litellm_params by @mc-marcocheng in #706
- Update Together AI pricing by @dedeswim in #724
- Update README.md by @chinmay7016 in #727
- Router.get_available_deployment: Handle empty input edge case by @mc-marcocheng in #729
- Fix caching for Router by @karvetskiy in #722
- support for custom bedrock runtime endpoint by @canada4663 in #717
- Use supplied headers by @stanfea in #741
- Docker Hub image is built for ARM64 only by @morgendigital in #734
- doc name change by @kylehh in #764
- fix: fix bug for the case --model is not specified by @clalanliu in #781
- add custom open ai models to asyncio call by @PrathamSoni in #789
- Fix bad returns in get_available_deployment by @nathankim7 in #790
- Improve message trimming by @duc-phamh in #787
- build(deps): bump postcss from 8.4.27 to 8.4.31 in /docs/my-website by @dependabot in #804
- build(deps): bump urllib3 from 2.0.5 to 2.0.7 by @dependabot in #805
- build(deps): bump @babel/traverse from 7.22.10 to 7.23.3 in /docs/my-website by @dependabot in #806
- Fix ServiceUnavailableError super.init error by @jackmpcollins in #813
- Update Together prices by @dedeswim in #814
- need to re-attempt backoff and yaml imports if the first import attempt fails by @kfsone in #820
- Fix typo for initial_prompt_value and too many values to unpack error by @rodneyxr in #826
- Bedrock llama by @dchristian3188 in #811
- build(deps): bump sharp from 0.32.5 to 0.32.6 in /docs/my-website by @dependabot in #832
New Contributors
- @kevinjyee made their first contribution in #673
- @mc-marcocheng made their first contribution in #679
- @SlapDrone made their first contribution in #695
- @josearangos made their first contribution in #700
- @dedeswim made their first contribution in #724
- @chinmay7016 made their first contribution in #727
- @karvetskiy made their first contribution in #722
- @stanfea made their first contribution in #741
- @morgendigital made their first contribution in #734
- @clalanliu made their first contribution in #781
- @PrathamSoni made their first contribution in #789
- @nathankim7 made their first contribution in #790
- @duc-phamh made their first contribution in #787
- @dependabot made their first contribution in #804
- @jackmpcollins made their first contribution in #813
- @kfsone made their first contribution in #820
- @rodneyxr made their first contribution in #826
- @dchristian3188 made their first contribution in #811
Full Changelog: v0.11.1...v1.1.0
v0.11.1
What's Changed
- Update __init__.py model_list to include bedrock models by @canada4663 in #609
- proxy /models endpoint with the results of get_valid_models() by @canada4663 in #611
- fix: llm_provider add openai finetune compatibility by @Undertone0809 in #618
- Update README.md by @Shivam250702 in #620
- Verbose warning by @toniengelhardt in #625
- Update the Dockerfile of the LiteLLM Proxy server and some refactorings by @coconut49 in #628
- fix: updates to traceloop docs by @nirga in #639
- docs: fixed typo in Traceloop docs by @nirga in #640
- fix: disabled batch by default for Traceloop by @nirga in #643
- Create GitHub Action to automatically build docker images by @coconut49 in #634
- Tutorial for using LiteLLM within Gradio Chatbot Application by @dcruiz01 in #645
- proxy server: fix langroid part by @pchalasani in #652
- Create GitHub Action to automatically build docker images by @coconut49 in #655
- deepinfra: Add supported models by @ichernev in #638
- Update index.md by @Pratikdate in #663
- Add perplexity namespace to model pricing dict by @toniengelhardt in #665
- Incorrect boto3 parameter name by @shrikant14 in #671
New Contributors
- @Undertone0809 made their first contribution in #618
- @Shivam250702 made their first contribution in #620
- @dcruiz01 made their first contribution in #645
- @ichernev made their first contribution in #638
- @Pratikdate made their first contribution in #663
- @shrikant14 made their first contribution in #671
Full Changelog: v0.8.4...v0.11.1