[Serve.llm] Support OpenAI Responses (Stateful) API #55631

@nrghosh

Description

The OpenAI Responses API is a new stateful interface that adds several features on top of the previous stateless APIs (e.g. /chat/completions).

Guidance from OpenAI is to build new projects against this API instead of chat/completions - though they promise (for now) indefinite support for chat/completions. They are sunsetting the Assistants API, which the new Responses API purportedly replaces.

The main benefit of this new endpoint is simplifying workflows that involve tool use, code execution, and state management. As with the Assistants API, this happens server-side - unlike the Chat Completions API, where the client has to maintain state and send it back and forth in each request/prompt.

By using "store": true with the new API (and including "previous_response_id": response_id in follow-up requests), conversations become stateful.
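As a minimal sketch of that flow, the two request payloads to /v1/responses might look like the following. The field names ("model", "input", "store", "previous_response_id") come from the Responses API; the model name and response id used here are placeholders, not values from this issue.

```python
# Sketch of the stateful two-turn flow against /v1/responses.
# Payloads are built as plain dicts to show the relevant fields;
# the model name and "resp_abc123" id are illustrative placeholders.

def build_initial_request(model: str, prompt: str) -> dict:
    """First turn: ask the server to persist the response state."""
    return {
        "model": model,
        "input": prompt,
        "store": True,  # server keeps the conversation state
    }

def build_followup_request(model: str, prompt: str, previous_response_id: str) -> dict:
    """Later turns reference the stored response instead of
    resending the full message history, as chat/completions requires."""
    return {
        "model": model,
        "input": prompt,
        "previous_response_id": previous_response_id,
    }

first = build_initial_request("gpt-4.1", "Summarize this repository.")
# ... POST `first` to /v1/responses; the reply carries an id such as "resp_abc123" ...
followup = build_followup_request("gpt-4.1", "Now shorten it.", "resp_abc123")
```

The key contrast with chat/completions is that the follow-up payload carries only the new prompt plus a response id; the prior conversation lives server-side.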

Current State

  • Some initial support for the OpenAI Responses API has been merged into vLLM
  • Ongoing work on fully supporting Responses on the vLLM side
  • vLLM discussion thread about /responses and issues with supporting a stateful API
  • Migration guide from chat/completions -> responses with code examples from OpenAI blogpost
  • Context on Harmony Response Format (OpenAI OSS)
  • Blog on the main differences by Simon Willison
