Litellm 12 02 2024 (#6994)
* add the logprobs param for fireworks ai (#6915)

* add the logprobs param for fireworks ai

* (feat) pass through llm endpoints - add `PATCH` support (vertex context caching requires it for update ops)  (#6924)

* add PATCH for pass through endpoints

* test_pass_through_routes_support_all_methods

* sonnet supports pdf, haiku does not (#6928)

* (feat) DataDog Logger - Add Failure logging + use Standard Logging payload (#6929)

* add async_log_failure_event for dd

* use standard logging payload for DD logging

* use standard logging payload for DD

* fix use SLP status

* allow opting into _create_v0_logging_payload

* add unit tests for DD logging payload

* fix dd logging tests

* (feat) log proxy auth errors on datadog  (#6931)

* add new dd type for auth errors

* add async_log_proxy_authentication_errors

* fix comment

* use async_log_proxy_authentication_errors

* test_datadog_post_call_failure_hook

* test_async_log_proxy_authentication_errors

* (feat) Allow using include to include external YAML files in a config.yaml (#6922)

* add helper to process includes directive on yaml

* add doc on config management

* unit tests for `include` on config.yaml

* bump: version 1.52.16 → 1.53.0

* (feat) dd logger - set tags according to the values set by those env vars  (#6933)

* dd logger, inherit from .envs

* test_datadog_payload_environment_variables

* fix _get_datadog_service

* build(ui/): update ui build

* bump: version 1.53.0 → 1.53.1

* Revert "(feat) Allow using include to include external YAML files in a config.yaml (#6922)"

This reverts commit 68e5982.

* LiteLLM Minor Fixes & Improvements (11/26/2024)  (#6913)

* docs(config_settings.md): document all router_settings

* ci(config.yml): add router_settings doc test to ci/cd

* test: debug test on ci/cd

* test: debug ci/cd test

* test: fix test

* fix(team_endpoints.py): skip invalid team object. don't fail `/team/list` call

Otherwise the UI fails to load the team list, causing downstream errors

* test(base_llm_unit_tests.py): add 'response_format={"type": "text"}' test to base_llm_unit_tests

adds complete coverage for all 'response_format' values to ci/cd

* feat(router.py): support wildcard routes in `get_router_model_info()`

Addresses #6914

* build(model_prices_and_context_window.json): add tpm/rpm limits for all gemini models

Allows for ratelimit tracking for gemini models even with wildcard routing enabled

Addresses #6914

* feat(router.py): add tpm/rpm tracking on success/failure to global_router

Addresses #6914

* feat(router.py): support wildcard routes on router.get_model_group_usage()

* fix(router.py): fix linting error

* fix(router.py): implement get_remaining_tokens_and_requests

Addresses #6914

* fix(router.py): fix linting errors

* test: fix test

* test: fix tests

* docs(config_settings.md): add missing dd env vars to docs

* fix(router.py): check if hidden params is dict

* LiteLLM Minor Fixes & Improvements (11/27/2024) (#6943)

* fix(http_parsing_utils.py): remove `ast.literal_eval()` from http utils

Security fix - https://huntr.com/bounties/96a32812-213c-4819-ba4e-36143d35e95b?token=bf414bbd77f8b346556e64ab2dd9301ea44339910877ea50401c76f977e36cdd78272f5fb4ca852a88a7e832828aae1192df98680544ee24aa98f3cf6980d8bab641a66b7ccbc02c0e7d4ddba2db4dbe7318889dc0098d8db2d639f345f574159814627bb084563bad472e2f990f825bff0878a9e281e72c88b4bc5884d637d186c0d67c9987c57c3f0caf395aff07b89ad2b7220d1dd7d1b427fd2260b5f01090efce5250f8b56ea2c0ec19916c24b23825d85ce119911275944c840a1340d69e23ca6a462da610

* fix(converse/transformation.py): support bedrock apac cross region inference

Fixes #6905

* fix(user_api_key_auth.py): add auth check for websocket endpoint

Fixes #6926

* fix(user_api_key_auth.py): use `model` from query param

* fix: fix linting error

* test: run flaky tests first

* docs: update the docs (#6923)

* (bug fix) /key/update was not storing `budget_duration` in the DB  (#6941)

* fix - store budget_duration for keys

* test_generate_and_update_key

* test_update_user_unit_test

* fix user update

* (fix) handle json decode errors for DD exception logging (#6934)

* fix JSONDecodeError

* handle async_log_proxy_authentication_errors

* fix test_async_log_proxy_authentication_errors_get_request

* Revert "Revert "(feat) Allow using include to include external YAML files in a config.yaml (#6922)""

This reverts commit 5d13302.

* (docs + fix) Add docs on Moderations endpoint, Text Completion  (#6947)

* fix _pass_through_moderation_endpoint_factory

* fix route_llm_request

* doc moderations api

* docs on /moderations

* add e2e tests for moderations api

* docs moderations api

* test_pass_through_moderation_endpoint_factory

* docs text completion

* (feat) add enforcement for unique key aliases on /key/update and /key/generate  (#6944)

* add enforcement for unique key aliases

* fix _enforce_unique_key_alias

* fix _enforce_unique_key_alias

* fix _enforce_unique_key_alias

* test_enforce_unique_key_alias

* (fix) tag merging / aggregation logic   (#6932)

* use 1 helper to merge tags + ensure uniqueness

* test_add_litellm_data_to_request_duplicate_tags

* fix _merge_tags

* fix proxy utils test

* fix doc string

* (feat) Allow disabling ErrorLogs written to the DB  (#6940)

* fix - allow disabling logging error logs

* docs on disabling error logs

* doc string for _PROXY_failure_handler

* test_disable_error_logs

* rename file

* fix rename file

* increase test coverage for test_enable_error_logs

* fix(key_management_endpoints.py): support 'tags' param on `/key/update` (#6945)

* LiteLLM Minor Fixes & Improvements (11/29/2024)  (#6965)

* fix(factory.py): ensure tool call converts image url

Fixes #6953

* fix(transformation.py): support mp4 + pdf urls for vertex ai

Fixes #6936

* fix(http_handler.py): mask gemini api key in error logs

Fixes #6963

* docs(prometheus.md): update prometheus FAQs

* feat(auth_checks.py): ensure specific model access > wildcard model access

if wildcard model is in access group, but specific model is not - deny access

* fix(auth_checks.py): handle auth checks for team based model access groups

handles scenario where model access group used for wildcard models

* fix(internal_user_endpoints.py): support adding guardrails on `/user/update`

Fixes #6942

* fix(key_management_endpoints.py): fix prepare_metadata_fields helper

* fix: fix tests

* build(requirements.txt): bump openai dep version

fixes proxies argument

* test: fix tests

* fix(http_handler.py): fix error message masking

* fix(bedrock_guardrails.py): pass in prepped data

* test: fix test

* test: fix nvidia nim test

* fix(http_handler.py): return original response headers

* fix: revert maskedhttpstatuserror

* test: update tests

* test: cleanup test

* fix(key_management_endpoints.py): fix metadata field update logic

* fix(key_management_endpoints.py): maintain initial order of guardrails in key update

* fix(key_management_endpoints.py): handle prepare metadata

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix: fix key management errors

* fix(key_management_endpoints.py): update metadata

* test: update test

* refactor: add more debug statements

* test: skip flaky test

* test: fix test

* fix: fix test

* fix: fix update metadata logic

* fix: fix test

* ci(config.yml): change db url for e2e ui testing

* bump: version 1.53.1 → 1.53.2

* Updated config.yml

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com>
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Sara Han <127759186+sdiazlor@users.noreply.github.com>

* fix(exceptions.py): ensure ratelimit error code == 429, type == "throttling_error"

Fixes #6973

* fix(utils.py): add jina ai dimensions embedding param support

Fixes #6591

* fix(exception_mapping_utils.py): add bedrock 'prompt is too long' exception to context window exceeded error exception mapping

Fixes #6629

Closes #6975

* fix(litellm_logging.py): strip trailing slash for api base

Closes #6859

* test: skip timeout issue

---------

Co-authored-by: ershang-dou <erlie.shang@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com>
Co-authored-by: Sara Han <127759186+sdiazlor@users.noreply.github.com>
5 people authored Dec 3, 2024
1 parent 1c8438d commit 89fcd7b
Showing 11 changed files with 106 additions and 26 deletions.
2 changes: 2 additions & 0 deletions litellm/exceptions.py
@@ -306,6 +306,8 @@ def __init__(
super().__init__(
self.message, response=self.response, body=None
) # Call the base class constructor with the parameters it needs
self.code = "429"
self.type = "throttling_error"

def __str__(self):
_message = self.message
1 change: 1 addition & 0 deletions litellm/litellm_core_utils/exception_mapping_utils.py
@@ -732,6 +732,7 @@ def exception_type( # type: ignore # noqa: PLR0915
"too many tokens" in error_str
or "expected maxLength:" in error_str
or "Input is too long" in error_str
or "prompt is too long" in error_str
or "prompt: length: 1.." in error_str
or "Too many input tokens" in error_str
):
3 changes: 3 additions & 0 deletions litellm/litellm_core_utils/get_supported_openai_params.py
@@ -160,6 +160,9 @@ def get_supported_openai_params( # noqa: PLR0915
]
elif custom_llm_provider == "huggingface":
return litellm.HuggingfaceConfig().get_supported_openai_params()
elif custom_llm_provider == "jina_ai":
if request_type == "embeddings":
return litellm.JinaAIEmbeddingConfig().get_supported_openai_params()
elif custom_llm_provider == "together_ai":
return litellm.TogetherAIConfig().get_supported_openai_params(model=model)
elif custom_llm_provider == "ai21":
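
For illustration, a minimal sketch (not part of this commit) of exercising the new jina_ai branch; it assumes get_supported_openai_params is importable from the top-level litellm package:

from litellm import get_supported_openai_params

# Ask which OpenAI-compatible params litellm will accept for Jina AI embeddings.
supported = get_supported_openai_params(
    model="jina-embeddings-v3",  # illustrative model name
    custom_llm_provider="jina_ai",
    request_type="embeddings",
)
print(supported)  # expected to include "dimensions" via JinaAIEmbeddingConfig
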
11 changes: 10 additions & 1 deletion litellm/litellm_core_utils/litellm_logging.py
@@ -2680,6 +2680,12 @@ def get_hidden_params(
clean_hidden_params[key] = hidden_params[key] # type: ignore
return clean_hidden_params

@staticmethod
def strip_trailing_slash(api_base: Optional[str]) -> Optional[str]:
if api_base:
return api_base.rstrip("/")
return api_base


def get_standard_logging_object_payload(
kwargs: Optional[dict],
@@ -2811,7 +2817,10 @@ def get_standard_logging_object_payload(
completion_tokens=usage.completion_tokens,
request_tags=request_tags,
end_user=end_user_id or "",
api_base=litellm_params.get("api_base", ""),
api_base=StandardLoggingPayloadSetup.strip_trailing_slash(
litellm_params.get("api_base", "")
)
or "",
model_group=_model_group,
model_id=_model_id,
requester_ip_address=clean_metadata.get("requester_ip_address", None),
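
A tiny usage sketch of the new helper (the unit test further down checks the same behavior); it assumes StandardLoggingPayloadSetup is importable from the module this diff touches:

from litellm.litellm_core_utils.litellm_logging import StandardLoggingPayloadSetup

# Trailing slashes are stripped so "https://api.test.com/" and "https://api.test.com"
# log the same api_base; a None/empty api_base passes through unchanged.
assert StandardLoggingPayloadSetup.strip_trailing_slash("https://api.test.com/") == "https://api.test.com"
assert StandardLoggingPayloadSetup.strip_trailing_slash(None) is None
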
3 changes: 3 additions & 0 deletions litellm/llms/fireworks_ai/chat/fireworks_ai_transformation.py
@@ -25,6 +25,7 @@ class FireworksAIConfig:
stop: Optional[Union[str, list]] = None
response_format: Optional[dict] = None
user: Optional[str] = None
logprobs: Optional[int] = None

# Non OpenAI parameters - Fireworks AI only params
prompt_truncate_length: Optional[int] = None
@@ -44,6 +45,7 @@ def __init__(
stop: Optional[Union[str, list]] = None,
response_format: Optional[dict] = None,
user: Optional[str] = None,
logprobs: Optional[int] = None,
prompt_truncate_length: Optional[int] = None,
context_length_exceeded_behavior: Optional[Literal["error", "truncate"]] = None,
) -> None:
Expand Down Expand Up @@ -86,6 +88,7 @@ def get_supported_openai_params(self):
"stop",
"response_format",
"user",
"logprobs",
"prompt_truncate_length",
"context_length_exceeded_behavior",
]
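
A hedged usage sketch (not in the diff) of the newly supported param; the model id is an assumption, and any Fireworks AI chat model should behave the same way:

import litellm

response = litellm.completion(
    model="fireworks_ai/accounts/fireworks/models/llama-v3p1-8b-instruct",  # assumed model id
    messages=[{"role": "user", "content": "Hello"}],
    logprobs=1,  # now listed in the supported params, so it is forwarded instead of rejected
)
print(response.choices[0].message.content)
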
14 changes: 13 additions & 1 deletion litellm/utils.py
@@ -2436,6 +2436,18 @@ def _check_valid_arg(supported_params: Optional[list]):
)
final_params = {**optional_params, **kwargs}
return final_params
elif custom_llm_provider == "jina_ai":
supported_params = get_supported_openai_params(
model=model,
custom_llm_provider="jina_ai",
request_type="embeddings",
)
_check_valid_arg(supported_params=supported_params)
optional_params = litellm.JinaAIEmbeddingConfig().map_openai_params(
non_default_params=non_default_params, optional_params={}
)
final_params = {**optional_params, **kwargs}
return final_params
elif custom_llm_provider == "fireworks_ai":
supported_params = get_supported_openai_params(
model=model,
@@ -2464,7 +2476,7 @@ def _check_valid_arg(supported_params: Optional[list]):
else:
raise UnsupportedParamsError(
status_code=500,
message=f"Setting user/encoding format is not supported by {custom_llm_provider}. To drop it from the call, set `litellm.drop_params = True`.",
message=f"Setting {non_default_params} is not supported by {custom_llm_provider}. To drop it from the call, set `litellm.drop_params = True`.",
)
final_params = {**non_default_params, **kwargs}
return final_params
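
A rough sketch of what the new jina_ai branch does with an OpenAI-style embedding param; JinaAIEmbeddingConfig is used here because the diff itself calls it, and the printed result is an assumption:

import litellm

# Map OpenAI-style embedding params into Jina AI params, as the new branch does
# before merging them with any passthrough kwargs (e.g. `task`).
optional_params = litellm.JinaAIEmbeddingConfig().map_openai_params(
    non_default_params={"dimensions": 1024},
    optional_params={},
)
print(optional_params)  # expected: {"dimensions": 1024}
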
9 changes: 9 additions & 0 deletions tests/llm_translation/test_bedrock_completion.py
@@ -1934,3 +1934,12 @@ def test_bedrock_completion_test_4(modify_params):
with pytest.raises(Exception) as e:
litellm.completion(**data)
assert "litellm.modify_params" in str(e.value)


def test_bedrock_context_window_error():
with pytest.raises(litellm.ContextWindowExceededError) as e:
litellm.completion(
model="bedrock/claude-3-5-sonnet-20240620",
messages=[{"role": "user", "content": "Hello, world!"}],
mock_response=Exception("prompt is too long"),
)
9 changes: 9 additions & 0 deletions tests/llm_translation/test_jina_ai.py
@@ -21,3 +21,12 @@ def get_base_rerank_call_args(self) -> dict:
return {
"model": "jina_ai/jina-reranker-v2-base-multilingual",
}


def test_jina_ai_embedding():
litellm.embedding(
model="jina_ai/jina-embeddings-v3",
input=["a"],
task="separation",
dimensions=1024,
)
13 changes: 13 additions & 0 deletions tests/local_testing/test_exceptions.py
@@ -1176,3 +1176,16 @@ async def test_bad_request_error_contains_httpx_response(model):
print("e.response", e.response)
print("vars(e.response)", vars(e.response))
assert e.response is not None


def test_exceptions_base_class():
try:
raise litellm.RateLimitError(
message="BedrockException: Rate Limit Error",
model="model",
llm_provider="bedrock",
)
except litellm.RateLimitError as e:
assert isinstance(e, litellm.RateLimitError)
assert e.code == "429"
assert e.type == "throttling_error"
55 changes: 31 additions & 24 deletions tests/local_testing/test_function_calling.py
@@ -624,6 +624,7 @@ def test_passing_tool_result_as_list(model):

@pytest.mark.parametrize("sync_mode", [True, False])
@pytest.mark.asyncio
@pytest.mark.flaky(retries=6, delay=1)
async def test_watsonx_tool_choice(sync_mode):
from litellm.llms.custom_httpx.http_handler import HTTPHandler, AsyncHTTPHandler
import json
@@ -654,28 +655,34 @@ async def test_watsonx_tool_choice(sync_mode):

client = HTTPHandler() if sync_mode else AsyncHTTPHandler()
with patch.object(client, "post", return_value=MagicMock()) as mock_completion:
-        if sync_mode:
-            resp = completion(
-                model="watsonx/meta-llama/llama-3-1-8b-instruct",
-                messages=messages,
-                tools=tools,
-                tool_choice="auto",
-                client=client,
-            )
-        else:
-            resp = await acompletion(
-                model="watsonx/meta-llama/llama-3-1-8b-instruct",
-                messages=messages,
-                tools=tools,
-                tool_choice="auto",
-                client=client,
-                stream=True,
-            )
-
-        print(resp)
-
-        mock_completion.assert_called_once()
-        print(mock_completion.call_args.kwargs)
-        json_data = json.loads(mock_completion.call_args.kwargs["data"])
-        json_data["tool_choice_options"] == "auto"
+        try:
+            if sync_mode:
+                resp = completion(
+                    model="watsonx/meta-llama/llama-3-1-8b-instruct",
+                    messages=messages,
+                    tools=tools,
+                    tool_choice="auto",
+                    client=client,
+                )
+            else:
+                resp = await acompletion(
+                    model="watsonx/meta-llama/llama-3-1-8b-instruct",
+                    messages=messages,
+                    tools=tools,
+                    tool_choice="auto",
+                    client=client,
+                    stream=True,
+                )
+
+            print(resp)
+
+            mock_completion.assert_called_once()
+            print(mock_completion.call_args.kwargs)
+            json_data = json.loads(mock_completion.call_args.kwargs["data"])
+            json_data["tool_choice_options"] == "auto"
+        except Exception as e:
+            print(e)
+            if "The read operation timed out" in str(e):
+                pytest.skip("Skipping test due to timeout")
+            else:
+                raise e
12 changes: 12 additions & 0 deletions tests/logging_callback_tests/test_standard_logging_payload.py
@@ -319,3 +319,15 @@ def test_get_final_response_obj():
finally:
# Reset litellm.turn_off_message_logging to its original value
litellm.turn_off_message_logging = False


def test_strip_trailing_slash():
common_api_base = "https://api.test.com"
assert (
StandardLoggingPayloadSetup.strip_trailing_slash(common_api_base + "/")
== common_api_base
)
assert (
StandardLoggingPayloadSetup.strip_trailing_slash(common_api_base)
== common_api_base
)
