
[Bug]: Spend logs are not saved correctly when using stream=true with stream_options: { include_usage: true } #6633

Open
DamnDaniel7 opened this issue Nov 7, 2024 · 0 comments
Labels
bug Something isn't working

What happened?

Description

When making requests with stream=true and stream_options: { include_usage: true }, the request_id column in the spend logs is written as an empty string. Because request_id is the primary key, only the first such request is saved; every subsequent insert collides with the duplicate empty key and fails to be logged. Requests with only stream=true (without stream_options: { include_usage: true }) are logged correctly, with request_id populated as expected.

Steps to Reproduce

  1. Enable the spend tracking feature that saves spend logs to a database.
  2. Make a request with the following parameters:
    • stream=true
    • stream_options: { include_usage: true }
  3. Observe the spend logs saved in the database.
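The steps above can be sketched as the following request body; the proxy URL and API key below are placeholders I've made up for illustration, not values from this report:

```python
# Builds the chat-completions request body from the reproduction steps.
# PROXY_URL and API_KEY are hypothetical placeholders, not real values.
import json

PROXY_URL = "http://localhost:4000/v1/chat/completions"  # assumed proxy address
API_KEY = "sk-placeholder"                               # assumed virtual key

payload = {
    "model": "gpt4o-mini",
    "user": "user123",
    "stream": True,
    # Requests a final stream chunk carrying token usage; this is the
    # option that coincides with the empty request_id in the spend logs.
    "stream_options": {"include_usage": True},
    "messages": [
        {"role": "system", "content": "You are a snarky assistant."},
        {"role": "user", "content": "tell me a joke"},
    ],
}
body = json.dumps(payload)
# POST `body` to PROXY_URL with an "Authorization: Bearer <API_KEY>" header,
# then inspect the request_id column in the spend logs table.
```

Sending the same body without the stream_options key is the control case that logs correctly.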

Expected Result:

  • Each successful request should have a unique request_id populated and logged in the spend logs.

Actual Result:

  • Only the first request is saved in the spend logs. For subsequent requests, request_id is empty, and the duplicate empty primary-key value prevents additional rows from being inserted.
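A minimal sketch of the failure mode, using sqlite3 and a simplified two-column table purely for illustration (the real deployment is PostgreSQL with LiteLLM's actual schema):

```python
# Demonstrates why an empty request_id primary key loses all but the
# first row: every later insert collides on the same key value ''.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spend_logs (request_id TEXT PRIMARY KEY, model TEXT)")

saved, failed = 0, 0
for model in ("gpt4o", "gpt4o-mini", "gpt4o"):
    try:
        # request_id arrives empty when include_usage is set, so all rows
        # share the primary-key value ''.
        conn.execute("INSERT INTO spend_logs VALUES (?, ?)", ("", model))
        saved += 1
    except sqlite3.IntegrityError:
        failed += 1

print(saved, failed)  # → 1 2: only the first insert succeeds
```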

Environment

  • Litellm version: v1.50.2-stable
  • Database type and version: PostgreSQL 14

config.yaml

Example of config.yaml used:

model_list:
  - model_name: gpt4o
    litellm_params:
      model: azure/gpt-4o
      api_base: <api_base>
      api_key: <api_key>
      api_version: 2024-09-01-preview
    model_info:
      base_model: azure/gpt-4o
  - model_name: gpt4o-mini
    litellm_params:
      model: azure/gpt-4o-mini
      api_base: <api_base>
      api_key: <api_key>
      api_version: 2024-09-01-preview
    model_info:
      base_model: azure/gpt-4o-mini

Relevant log output

22:51:02 - LiteLLM Proxy:DEBUG: proxy_server.py:3273 - Request received by LiteLLM:
{
    "model": "gpt4o-mini",
    "user": "user123",
    "stream": true,
    "stream_options": {
        "include_usage": true
    },
    "messages": [
        {
            "role": "system",
            "content": "You are a snarky assistant."
        },
        {
            "role": "user",
            "content": "tell me a joke"
        }
    ]
}
...
22:51:04 - LiteLLM Proxy:DEBUG: proxy_server.py:757 - INSIDE _PROXY_track_cost_callback
22:51:04 - LiteLLM Proxy:DEBUG: proxy_server.py:761 - Proxy: In track_cost_callback for: kwargs={'model': 'gpt-4o-mini', 'messages': [{'role': 'system', 'content': 'You are a snarky assistant.'}, {'role': 'user', 'content': 'tell me a joke'}], 'optional_params': {'stream': True, 'stream_options': {'include_usage': True}, 'user': '[REDACTED_USER]', 'extra_body': {}}, 'litellm_params': {'acompletion': True, 'api_key': '[REDACTED_API_KEY]', 'force_timeout': 600, 'logger_fn': None, 'verbose': False, 'custom_llm_provider': 'azure', 'api_base': '[REDACTED_API_BASE]', 'litellm_call_id': '[REDACTED_CALL_ID]', 'model_alias_map': {}, 'completion_call_id': None, 'metadata': {'requester_metadata': {}, 'user_api_key_hash': '[REDACTED]', 'user_api_key_alias': None, 'user_api_key_team_id': None, 'user_api_key_user_id': '[REDACTED_USER_ID]', 'user_api_key_org_id': None, 'user_api_key_team_alias': None, 'user_api_key': '[REDACTED]', 'user_api_end_user_max_budget': None, 'litellm_api_version': '1.51.3', 'global_max_parallel_requests': None, 'user_api_key_team_max_budget': None, 'user_api_key_team_spend': None, 'user_api_key_spend': 0.0, 'user_api_key_max_budget': None, 'user_api_key_metadata': {}, 'headers': {'content-type': 'application/json', 'user-agent': '[REDACTED]', 'accept': '*/*', 'cache-control': 'no-cache', 'host': '[REDACTED_HOST]', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '348'}, 'endpoint': '[REDACTED_ENDPOINT]', 'litellm_parent_otel_span': None, 'requester_ip_address': '', 'model_group': 'gpt4o-mini', 'model_group_size': 1, 'deployment': 'azure/gpt-4o-mini', 'model_info': {'id': '[REDACTED_MODEL_ID]', 'db_model': False, 'base_model': 'azure/gpt-4o-mini'}, 'api_base': '[REDACTED_API_BASE]', 'caching_groups': None}, 'model_info': {'id': '[REDACTED_MODEL_ID]', 'db_model': False, 'base_model': 'azure/gpt-4o-mini'}, 'proxy_server_request': {'url': '[REDACTED_ENDPOINT]', 'method': 'POST', 'headers': {'content-type': 
'application/json', 'user-agent': '[REDACTED]', 'accept': '*/*', 'cache-control': 'no-cache', 'host': '[REDACTED_HOST]', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '348'}, 'body': {'model': 'gpt4o-mini', 'user': '[REDACTED_USER]', 'stream': True, 'stream_options': {'include_usage': True}, 'messages': [{'role': 'system', 'content': 'You are a snarky assistant.'}, {'role': 'user', 'content': 'tell me a joke'}]}}, 'preset_cache_key': '[REDACTED_CACHE_KEY]', 'no-log': False, 'stream_response': {}, 'input_cost_per_token': None, 'input_cost_per_second': None, 'output_cost_per_token': None, 'output_cost_per_second': None, 'cooldown_time': None, 'text_completion': None, 'azure_ad_token_provider': None, 'user_continue_message': None, 'base_model': 'azure/gpt-4o-mini'}, 'start_time': datetime.datetime(2024, 11, 6, 22, 51, 2, 861890), 'stream': True, 'user': '[REDACTED_USER]', 'call_type': 'acompletion', 'litellm_call_id': '[REDACTED_CALL_ID]', 'completion_start_time': datetime.datetime(2024, 11, 6, 22, 51, 3, 691795), 'standard_callback_dynamic_params': {}, 'stream_options': {'include_usage': True}, 'extra_body': {}, 'custom_llm_provider': 'azure', 'input': [{'role': 'system', 'content': 'You are a snarky assistant.'}, {'role': 'user', 'content': 'tell me a joke'}], 'api_key': '[REDACTED_API_KEY]', 'original_response': <coroutine object AzureChatCompletion.async_streaming at [REDACTED_MEMORY_ADDRESS]>, 'additional_args': {'headers': {'api_key': '[REDACTED_API_KEY]', 'azure_ad_token': None}, 'api_base': ParseResult(scheme='https', userinfo='', host='[REDACTED_HOST]', port=None, path='/openai/', query=None, fragment=None), 'acompletion': True, 'complete_input_dict': {'model': 'gpt4o-mini', 'messages': [{'role': 'system', 'content': 'You are a snarky assistant.'}, {'role': 'user', 'content': 'tell me a joke'}], 'stream': True, 'stream_options': {'include_usage': True}, 'user': '[REDACTED_USER]', 'extra_body': {}}}, 'log_event_type': 
'post_api_call', 'api_call_start_time': datetime.datetime(2024, 11, 6, 22, 51, 2, 871645), 'response_headers': {'transfer-encoding': 'chunked', 'content-type': 'text/event-stream; charset=utf-8', 'apim-request-id': '[REDACTED_REQUEST_ID]', 'strict-transport-security': 'max-age=31536000; includeSubDomains; preload', 'x-ratelimit-remaining-requests': '9998', 'x-accel-buffering': 'no', 'x-ms-rai-invoked': 'true', 'x-request-id': '[REDACTED_REQUEST_ID]', 'x-content-type-options': 'nosniff', 'azureml-model-session': '[REDACTED_SESSION]', 'x-ms-region': 'East US', 'x-envoy-upstream-service-time': '109', 'x-ms-client-request-id': '[REDACTED_CLIENT_REQUEST_ID]', 'x-ratelimit-remaining-tokens': '998699', 'date': 'Wed, 06 Nov 2024 22:51:02 GMT'}, 'end_time': datetime.datetime(2024, 11, 6, 22, 51, 4, 463244), 'cache_hit': False, 'response_cost': [REDACTED_COST], 'complete_streaming_response': ModelResponse(id='', choices=[Choices(finish_reason='stop', index=0, message=Message(content='Ah, ...?', role='assistant', tool_calls=None, function_call=None))], created=[REDACTED_TIMESTAMP], model='', object='chat.completion', system_fingerprint=None, usage=Usage(completion_tokens=37, prompt_tokens=24, total_tokens=61, completion_tokens_details=None, prompt_tokens_details=None))}
22:51:04 - LiteLLM Proxy:DEBUG: proxy_server.py:766 - kwargs stream: True + complete streaming response: ModelResponse(id='', choices=[Choices(finish_reason='stop', index=0, message=Message(content='Ah, ...', tool_calls=None, function_call=None))], created=[REDACTED_TIMESTAMP], model='', object='chat.completion', system_fingerprint=None, usage=Usage(completion_tokens=37, prompt_tokens=24, total_tokens=61, completion_tokens_details=None, prompt_tokens_details=None))}
22:51:04 - LiteLLM Proxy:DEBUG: proxy_server.py:788 - user_api_key [REDACTED], prisma_client: <litellm.proxy.utils.PrismaClient object at [REDACTED_MEMORY_ADDRESS]>
22:51:04 - LiteLLM Proxy:DEBUG: proxy_server.py:893 - Enters prisma db call, response_cost: [REDACTED_COST], token: [REDACTED]; user_id: [REDACTED_USER_ID]; team_id: None
22:51:04 - LiteLLM Proxy:DEBUG: spend_tracking_utils.py:40 - SpendTable: get_logging_payload - kwargs: {'model': 'gpt-4o-mini', 'messages': [{'role': 'system', 'content': 'You are a snarky assistant.'}, {'role': 'user', 'content': 'tell me a joke'}], 'optional_params': {'stream': True, 'stream_options': {'include_usage': True}, 'user': '[REDACTED_USER]', 'extra_body': {}}, 'litellm_params': {'acompletion': True, 'api_key': '[REDACTED_API_KEY]', 'force_timeout': 600, 'logger_fn': None, 'verbose': False, 'custom_llm_provider': 'azure', 'api_base': '[REDACTED_API_BASE]', 'litellm_call_id': '[REDACTED_CALL_ID]', 'model_alias_map': {}, 'completion_call_id': None, 'metadata': {'requester_metadata': {}, 'user_api_key_hash': '[REDACTED]', 'user_api_key_alias': None, 'user_api_key_team_id': None, 'user_api_key_user_id': '[REDACTED_USER_ID]', 'user_api_key_org_id': None, 'user_api_key_team_alias': None, 'user_api_key': '[REDACTED]', 'user_api_end_user_max_budget': None, 'litellm_api_version': '1.51.3', 'global_max_parallel_requests': None, 'user_api_key_team_max_budget': None, 'user_api_key_team_spend': None, 'user_api_key_spend': 0.0, 'user_api_key_max_budget': None, 'user_api_key_metadata': {}, 'headers': {'content-type': 'application/json', 'user-agent': '[REDACTED]', 'accept': '*/*', 'cache-control': 'no-cache', 'host': '[REDACTED_HOST]', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '348'}, 'endpoint': '[REDACTED_ENDPOINT]', 'litellm_parent_otel_span': None, 'requester_ip_address': '', 'model_group': 'gpt4o-mini', 'model_group_size': 1, 'deployment': 'azure/gpt-4o-mini', 'model_info': {'id': '[REDACTED_MODEL_ID]', 'db_model': False, 'base_model': 'azure/gpt-4o-mini'}, 'api_base': '[REDACTED_API_BASE]', 'caching_groups': None}, 'model_info': {'id': '[REDACTED_MODEL_ID]', 'db_model': False, 'base_model': 'azure/gpt-4o-mini'}, 'proxy_server_request': {'url': '[REDACTED_ENDPOINT]', 'method': 'POST', 'headers': {'content-type': 
'application/json', 'user-agent': '[REDACTED]', 'accept': '*/*', 'cache-control': 'no-cache', 'host': '[REDACTED_HOST]', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '348'}, 'body': {'model': 'gpt4o-mini', 'user': '[REDACTED_USER]', 'stream': True, 'stream_options': {'include_usage': True}, 'messages': [{'role': 'system', 'content': 'You are a snarky assistant.'}, {'role': 'user', 'content': 'tell me a joke'}]}}, 'preset_cache_key': '[REDACTED_CACHE_KEY]', 'no-log': False, 'stream_response': {}, 'input_cost_per_token': None, 'input_cost_per_second': None, 'output_cost_per_token': None, 'output_cost_per_second': None, 'cooldown_time': None, 'text_completion': None, 'azure_ad_token_provider': None, 'user_continue_message': None, 'base_model': 'azure/gpt-4o-mini'}}
22:51:04 - LiteLLM Proxy:DEBUG: spend_tracking_utils.py:48 - SpendTable: get_logging_payload - user_id: [REDACTED_USER_ID], team_id: None
22:51:04 - LiteLLM Proxy:DEBUG: spend_tracking_utils.py:84 - SpendTable: PrismaSession: User[REDACTED_USER_ID] - Prisma Tracking Table Creation Attempt.

