Skip to content

Conversation

@WoosukKwon
Copy link
Collaborator

@WoosukKwon WoosukKwon commented Aug 3, 2025

This PR disables the response store for OpenAI Responses API.

While the response store is required for various features in Responses API (including background, cancel, and prev_response_id), it is currently not implemented properly in vllm. The current store causes a memory leak and is not persistent (i.e., the stored responses are lost when the server shuts down). Therefore, until we have a proper implementation, we disable this feature and only enable it under the VLLM_ENABLE_RESPONSES_API_STORE=1 env flag.

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
@WoosukKwon WoosukKwon requested a review from aarnphm as a code owner August 3, 2025 04:48
@github-actions
Copy link

github-actions bot commented Aug 3, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly disables the response store by default to prevent memory leaks, controlled by a new environment variable. The changes are well-implemented across the codebase, including tests and configuration. I have one suggestion to improve API consistency by adjusting an HTTP status code for a new error response.

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
@WoosukKwon WoosukKwon added the ready ONLY add when PR is ready to merge/full CI is needed label Aug 3, 2025
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) August 3, 2025 08:22
@vllm-bot vllm-bot merged commit 6d98843 into main Aug 3, 2025
43 of 45 checks passed
@vllm-bot vllm-bot deleted the woosuk/disable-responses-store branch August 3, 2025 11:04
npanpaliya pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Aug 6, 2025
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
jinzhen-lin pushed a commit to jinzhen-lin/vllm that referenced this pull request Aug 9, 2025
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
noamgat pushed a commit to noamgat/vllm that referenced this pull request Aug 9, 2025
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Noam Gat <noamgat@gmail.com>
paulpak58 pushed a commit to paulpak58/vllm that referenced this pull request Aug 13, 2025
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Paul Pak <paulpak58@gmail.com>
diegocastanibm pushed a commit to diegocastanibm/vllm that referenced this pull request Aug 15, 2025
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Diego-Castan <diego.castan@ibm.com>
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

frontend ready ONLY add when PR is ready to merge/full CI is needed v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants