Track usage for OpenAI models even when streaming #591

Closed
simonw opened this issue Oct 29, 2024 · 2 comments
Labels: enhancement (New feature or request), openai

simonw commented Oct 29, 2024

OpenAI previously did not return usage information for streamed responses, but it now does when the request sets the stream_options parameter:

https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options

[Screenshot: CleanShot 2024-10-28 at 17 35 12@2x]
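
A minimal sketch of how a client might opt in to usage reporting only when streaming. The helper name build_request_kwargs is hypothetical and is not taken from the llm codebase. The stream_options value matches the include_usage setting in the linked API reference. Leaving the key out of non-streaming requests is an assumption about how the option should be used, not a description of the actual fix.

```python
from typing import Any, Dict


def build_request_kwargs(model: str, prompt: str, stream: bool) -> Dict[str, Any]:
    """Build keyword arguments for a chat completion request.

    Hypothetical helper for illustration only.
    """
    kwargs: Dict[str, Any] = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    if stream:
        # Ask the API to send one extra final chunk that carries token usage.
        # Assumption: the option is only meaningful for streaming requests,
        # so it is left out otherwise.
        kwargs["stream_options"] = {"include_usage": True}
    return kwargs
```

The resulting dictionary could then be passed as keyword arguments to whichever client call sends the request.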

simonw added the enhancement and openai labels on Oct 29, 2024

simonw commented Oct 29, 2024

Testing my implementation manually:

llm -m gpt-4o-mini hi
llm logs -c --json
[
  {
    "id": "01jbav6h7k7gg9f486p9n0nw58",
    "model": "gpt-4o-mini",
    "prompt": "hi",
    "system": null,
    "prompt_json": {
      "messages": [
        {
          "role": "user",
          "content": "hi"
        }
      ]
    },
    "options_json": {},
    "response": "Hello! How can I assist you today?",
    "response_json": {
      "content": "Hello! How can I assist you today?",
      "finish_reason": "stop",
      "usage": {
        "completion_tokens": 9,
        "prompt_tokens": 8,
        "total_tokens": 17,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0
        }
      },
      "id": "chatcmpl-ANUV2zARjPvOzBEJoeJAHMZztmyMA",
      "object": "chat.completion.chunk",
      "model": "gpt-4o-mini-2024-07-18",
      "created": 1730162148
    },
    "conversation_id": "01jbav6h7h5ek2fyykq19kb56y",
    "duration_ms": 1120,
    "datetime_utc": "2024-10-29T00:35:47.471634",
    "conversation_name": "hi",
    "conversation_model": "gpt-4o-mini",
    "attachments": []
  }
]

With --no-stream:

llm -m gpt-4o-mini hi --no-stream
[
  {
    "id": "01jbav9cg9ecktc06rtvmdm0gx",
    "model": "gpt-4o-mini",
    "prompt": "hi",
    "system": null,
    "prompt_json": {
      "messages": [
        {
          "role": "user",
          "content": "hi"
        }
      ]
    },
    "options_json": {},
    "response": "Hello! How can I assist you today?",
    "response_json": {
      "id": "chatcmpl-ANUWXkIeaElMNKblDP83pT9qk9t0m",
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "message": {
            "content": "Hello! How can I assist you today?",
            "role": "assistant"
          }
        }
      ],
      "created": 1730162241,
      "model": "gpt-4o-mini-2024-07-18",
      "object": "chat.completion",
      "system_fingerprint": "fp_f59a81427f",
      "usage": {
        "completion_tokens": 9,
        "prompt_tokens": 8,
        "total_tokens": 17,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0
        }
      }
    },
    "conversation_id": "01jbav9cg7qntkka2hvca72yhd",
    "duration_ms": 603,
    "datetime_utc": "2024-10-29T00:37:21.450264",
    "conversation_name": "hi",
    "conversation_model": "gpt-4o-mini",
    "attachments": []
  }
]

For a completion model:

llm -m gpt-3.5-turbo-instruct 'capital of france is '
[
  {
    "id": "01jbavbrwsrtwgt5snjz6t7fgd",
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "capital of france is ",
    "system": null,
    "prompt_json": {
      "messages": [
        "capital of france is "
      ]
    },
    "options_json": {},
    "response": " Paris\n\n",
    "response_json": {
      "content": " Paris\n\n",
      "usage": {
        "completion_tokens": 2,
        "prompt_tokens": 5,
        "total_tokens": 7
      },
      "id": "cmpl-ANUXorptC5yuMOMLF1Z4JL2pCz2DF",
      "object": "text_completion",
      "model": "gpt-3.5-turbo-instruct",
      "created": 1730162320
    },
    "conversation_id": "01jbavbrwqfp0b0t22x9tk71at",
    "duration_ms": 633,
    "datetime_utc": "2024-10-29T00:38:39.644058",
    "conversation_name": "capital of france is ",
    "conversation_model": "gpt-3.5-turbo-instruct",
    "attachments": []
  }
]

And the same completion model with --no-stream:

[
  {
    "id": "01jbavcgc4f40zy4ynqnvbmv24",
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "capital of france is ",
    "system": null,
    "prompt_json": {
      "messages": [
        "capital of france is "
      ]
    },
    "options_json": {},
    "response": " paris\n\n\nThe capital of France is indeed Paris. It is located in the northern part of the country and is known for its historic landmarks, art, fashion, and cuisine. It is also a major global city, with a population of over 2 million people. The Eiffel Tower, Notre-Dame Cathedral, and the Louvre Museum are some of its most famous attractions. Paris is also known as the \"City of Love\" and is a popular tourist destination.",
    "response_json": {
      "id": "cmpl-ANUYBosUNuPgLZJZF20dxAHKUhN8b",
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "text": " paris\n\n\nThe capital of France is indeed Paris. It is located in the northern part of the country and is known for its historic landmarks, art, fashion, and cuisine. It is also a major global city, with a population of over 2 million people. The Eiffel Tower, Notre-Dame Cathedral, and the Louvre Museum are some of its most famous attractions. Paris is also known as the \"City of Love\" and is a popular tourist destination."
        }
      ],
      "created": 1730162343,
      "model": "gpt-3.5-turbo-instruct",
      "object": "text_completion",
      "usage": {
        "completion_tokens": 96,
        "prompt_tokens": 5,
        "total_tokens": 101
      }
    },
    "conversation_id": "01jbavcgc2y0vm77r63q0pz9x1",
    "duration_ms": 1643,
    "datetime_utc": "2024-10-29T00:39:02.677491",
    "conversation_name": "capital of france is ",
    "conversation_model": "gpt-3.5-turbo-instruct",
    "attachments": []
  }
]
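
In all four logged records above, the token counts sit under response_json.usage, whether the response was streamed or not. A rough sketch of reading them back out of the logs follows. The function names are hypothetical, and the code assumes only the record layout visible in the JSON output shown here.

```python
from typing import Any, Dict, Iterable, Optional

USAGE_KEYS = ("prompt_tokens", "completion_tokens", "total_tokens")


def usage_from_log_record(record: Dict[str, Any]) -> Optional[Dict[str, int]]:
    """Return the token counts for one logged response, or None if absent."""
    response_json = record.get("response_json") or {}
    usage = response_json.get("usage")
    if not usage:
        return None
    # Missing counts default to 0; the *_details sub-objects are ignored.
    return {key: usage.get(key, 0) for key in USAGE_KEYS}


def total_usage(records: Iterable[Dict[str, Any]]) -> Dict[str, int]:
    """Sum token counts across records, skipping those without usage."""
    totals = {key: 0 for key in USAGE_KEYS}
    for record in records:
        usage = usage_from_log_record(record)
        if usage is None:
            continue
        for key in USAGE_KEYS:
            totals[key] += usage[key]
    return totals
```

One way to use this is to save the output of llm logs -c --json to a file, load it with json.load, and pass the resulting list to total_usage.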

simonw commented Oct 29, 2024

Setting the LLM_OPENAI_SHOW_RESPONSES=1 environment variable was useful here too, since it prints the raw request and response:

LLM_OPENAI_SHOW_RESPONSES=1 llm -m gpt-3.5-turbo-instruct 'capital of france is '
Request: POST https://api.openai.com/v1/completions
  Headers:
    host: api.openai.com
    connection: keep-alive
    accept: application/json
    content-type: application/json
    user-agent: OpenAI/Python 1.37.0
    x-stainless-lang: python
    x-stainless-package-version: 1.37.0
    x-stainless-os: MacOS
    x-stainless-arch: arm64
    x-stainless-runtime: CPython
    x-stainless-runtime-version: 3.10.4
    authorization: [...]
    x-stainless-async: false
    content-length: 148
  Body:
    {
      "model": "gpt-3.5-turbo-instruct",
      "prompt": "capital of france is ",
      "max_tokens": 256,
      "stream": true,
      "stream_options": {
        "include_usage": true
      }
    }
Response: status_code=200
  Headers:
    date: Tue, 29 Oct 2024 00:39:39 GMT
    content-type: text/event-stream
    transfer-encoding: chunked
    connection: keep-alive
    access-control-allow-origin: *
    access-control-expose-headers: X-Request-ID
    cache-control: no-cache, must-revalidate
    openai-model: gpt-3.5-turbo-instruct
    openai-organization: user-r3e61fpak04cbaokp5buoae4
    openai-processing-ms: 281
    openai-version: 2020-10-01
    strict-transport-security: max-age=31536000; includeSubDomains; preload
    x-ratelimit-limit-requests: 3500
    x-ratelimit-limit-tokens: 90000
    x-ratelimit-remaining-requests: 3499
    x-ratelimit-remaining-tokens: 89739
    x-ratelimit-reset-requests: 17ms
    x-ratelimit-reset-tokens: 174ms
    x-request-id: req_8765615cd073205516a763fe3c4ffc0f
    cf-cache-status: DYNAMIC
    set-cookie: __cf_bm=...
    x-content-type-options: nosniff
    server: cloudflare
    cf-ray: 8d9f1c1639c8cf4d-SJC
    alt-svc: h3=":443"; ma=86400
  Body:
data: {"id":"cmpl-ANUYlvpi4kFs3nOhNDuyO14WST2ww","object":"text_completion","created":1730162379,"choices":[{"text":"","index":0,"logprobs":null,"finish_reason":"stop"}],"model":"gpt-3.5-turbo-instruct","usage":null}

data: {"id":"cmpl-ANUYlvpi4kFs3nOhNDuyO14WST2ww","object":"text_completion","created":1730162379,"model":"gpt-3.5-turbo-instruct","usage":{"prompt_tokens":5,"total_tokens":5},"choices":[]}

data: [DONE]
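
The response body above shows the shape a streaming client has to handle: ordinary chunks carry "usage": null, and one extra chunk with an empty choices list carries the usage object just before data: [DONE]. A minimal sketch of pulling usage out of such a body follows. The function name usage_from_sse_lines is hypothetical, and the parser handles only the simple "data: ..." lines shown in this output, not the full server-sent events format.

```python
import json
from typing import Any, Dict, Iterable, Optional


def usage_from_sse_lines(lines: Iterable[str]) -> Optional[Dict[str, Any]]:
    """Return the last non-null usage object found in a streamed body."""
    usage: Optional[Dict[str, Any]] = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and anything that is not data
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage


# The two data lines from the response body above, copied verbatim.
SAMPLE_BODY = [
    'data: {"id":"cmpl-ANUYlvpi4kFs3nOhNDuyO14WST2ww","object":"text_completion","created":1730162379,"choices":[{"text":"","index":0,"logprobs":null,"finish_reason":"stop"}],"model":"gpt-3.5-turbo-instruct","usage":null}',
    "",
    'data: {"id":"cmpl-ANUYlvpi4kFs3nOhNDuyO14WST2ww","object":"text_completion","created":1730162379,"model":"gpt-3.5-turbo-instruct","usage":{"prompt_tokens":5,"total_tokens":5},"choices":[]}',
    "",
    "data: [DONE]",
]
```

Calling usage_from_sse_lines(SAMPLE_BODY) returns the usage object from the second chunk.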

simonw closed this as completed in 389acdf on Oct 29, 2024
simonw added a commit that referenced this issue Oct 29, 2024