Description
The OpenAI Responses API is a new stateful interface that adds a few features on top of the existing stateless APIs (e.g. /chat/completions).
OpenAI's guidance is to build new projects against this API instead of chat/completions, though they commit (for now) to supporting chat/completions indefinitely. They are sunsetting the Assistants API, which the new Responses API is intended to replace.
The main benefit of the new endpoint is simplifying workflows that involve tool use, code execution, and state management. As with the Assistants API, this happens server-side, unlike the Chat Completions API, where the client has to maintain conversation state and resend it with every request.
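For contrast, here is a minimal sketch of client-managed state with the Chat Completions API (model name and prompts are placeholders):

```python
from openai import OpenAI

client = OpenAI()
model = "gpt-4o-mini"  # placeholder model name

# With /chat/completions the client owns the conversation history
# and must resend it in full on every turn.
messages = [{"role": "user", "content": "What is the capital of France?"}]
first = client.chat.completions.create(model=model, messages=messages)

messages.append({"role": "assistant", "content": first.choices[0].message.content})
messages.append({"role": "user", "content": "And what is its population?"})
second = client.chat.completions.create(model=model, messages=messages)

print(second.choices[0].message.content)
```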
By setting "store": true in the new API (and including "previous_response_id": response_id in follow-up requests), conversations become stateful.
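A minimal sketch of the stateful flow with the Responses API (assumes an OpenAI-compatible server exposing /v1/responses; model name and prompts are placeholders):

```python
from openai import OpenAI

client = OpenAI()
model = "gpt-4o-mini"  # placeholder model name

# First turn: store=True asks the server to persist this response.
first = client.responses.create(
    model=model,
    input="What is the capital of France?",
    store=True,
)
print(first.output_text)

# Follow-up turn: reference the previous response by id instead of
# resending the whole conversation history.
second = client.responses.create(
    model=model,
    input="And what is its population?",
    previous_response_id=first.id,
    store=True,
)
print(second.output_text)
```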
Current State
- Some initial support for OpenAI Responses API has been merged into VLLM
- Ongoing work on fully supporting Responses on the VLLM side
- VLLM discussion thread about /responses and the issues with supporting a stateful API
- Migration guide from chat/completions -> responses with code examples from the OpenAI blogpost
- Context on Harmony Response Format (OpenAI OSS)
- Blog on the main differences by Simon Willison