-
Notifications
You must be signed in to change notification settings - Fork 334
mcp: allow re-using connections in more cases #709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
findleyr
merged 1 commit into
modelcontextprotocol:main
from
howardjohn:perf/allow-reuse
Dec 10, 2025
Merged
mcp: allow re-using connections in more cases #709
findleyr
merged 1 commit into
modelcontextprotocol:main
from
howardjohn:perf/allow-reuse
Dec 10, 2025
+5
−1
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
I am seeing a bug where connections with a Streamable HTTP client are not reused at all. This makes performance terrible, since each request establishes a new TCP connection. I was able to root cause this down to behavior of the `processStream` function. This sets up a bufio.NewScanner, and waits to get data. Once we get a message, we abort and *close the connection*. At this point, there are two paths: * The bufio reader read the *full* response body to EOF, and then we closed. This case is fine; we can re-use the idle connection. * The bufio reader *did not* real the full response body. It may have read the entire actual data that *would be sent*, but since its streaming happened to not yet be delivered. In this case, we call `body.Close()` before an EOF, which causes the HTTP client to not attempt to re-use the connection. In my case, the server I am using reliable sends the response over 2 TCP packets (one with the event, and the other with the closure of the `chunked` encoding (`0`)). This causes the body to never be fully read. This PR fixes that case by unconditionally reading the full body before we close the connection. With this change, I am able to reliably re-use connections for future requests.
Contributor
|
Thanks for lending your expertise to this problem, and for the clear description of the issue! Reviewing. |
findleyr
approved these changes
Dec 10, 2025
howardjohn
added a commit
to howardjohn/ai-gateway
that referenced
this pull request
Dec 10, 2025
This fixes an issue in the benchmark where the context is cancelled before the request processing has finished. This causes Go to consider the connection as un-reusable, leading to 0% connection re-use. This only impacts servers that return data across multiple TCP packets (which is 100% legal thing to do), hence why it is not always showing up. Additionally, pulls in modelcontextprotocol/go-sdk#709. Signed-off-by: John Howard <john.howard@solo.io>
nacx
added a commit
to envoyproxy/ai-gateway
that referenced
this pull request
Dec 15, 2025
**Description** This fixes an issue in the benchmark where the context is cancelled before the request processing has finished. This causes Go to consider the connection as un-reusable, leading to 0% connection re-use. This only impacts servers that return data across multiple TCP packets (which is 100% legal thing to do), hence why it is not always showing up. With this change: ``` name time/op MCP/Baseline_NoProxy-32 44.5µs ± 3% MCP/Agent_Gateway-32 84.4µs ± 1% MCP/EAIGW_EAIGW_Default 21.3ms ± 0% MCP/EAIGW_Config_100-32 257µs ± 0% ``` Before this change: ``` name time/op MCP/Agent_Gateway-32 183µs ± 3% ``` Additionally, pulls in modelcontextprotocol/go-sdk#709 which is also required to fix this. :heart: Signed-off-by: John Howard <john.howard@solo.io> Co-authored-by: Ignasi Barrera <ignasi@tetrate.io>
Aias00
pushed a commit
to Aias00/ai-gateway
that referenced
this pull request
Dec 15, 2025
**Description** This fixes an issue in the benchmark where the context is cancelled before the request processing has finished. This causes Go to consider the connection as un-reusable, leading to 0% connection re-use. This only impacts servers that return data across multiple TCP packets (which is 100% legal thing to do), hence why it is not always showing up. With this change: ``` name time/op MCP/Baseline_NoProxy-32 44.5µs ± 3% MCP/Agent_Gateway-32 84.4µs ± 1% MCP/EAIGW_EAIGW_Default 21.3ms ± 0% MCP/EAIGW_Config_100-32 257µs ± 0% ``` Before this change: ``` name time/op MCP/Agent_Gateway-32 183µs ± 3% ``` Additionally, pulls in modelcontextprotocol/go-sdk#709 which is also required to fix this. :heart: Signed-off-by: John Howard <john.howard@solo.io> Co-authored-by: Ignasi Barrera <ignasi@tetrate.io>
Aias00
pushed a commit
to Aias00/ai-gateway
that referenced
this pull request
Dec 15, 2025
**Description** This fixes an issue in the benchmark where the context is cancelled before the request processing has finished. This causes Go to consider the connection as un-reusable, leading to 0% connection re-use. This only impacts servers that return data across multiple TCP packets (which is 100% legal thing to do), hence why it is not always showing up. With this change: ``` name time/op MCP/Baseline_NoProxy-32 44.5µs ± 3% MCP/Agent_Gateway-32 84.4µs ± 1% MCP/EAIGW_EAIGW_Default 21.3ms ± 0% MCP/EAIGW_Config_100-32 257µs ± 0% ``` Before this change: ``` name time/op MCP/Agent_Gateway-32 183µs ± 3% ``` Additionally, pulls in modelcontextprotocol/go-sdk#709 which is also required to fix this. :heart: Signed-off-by: John Howard <john.howard@solo.io> Co-authored-by: Ignasi Barrera <ignasi@tetrate.io> Signed-off-by: liuhy <liuhongyu@apache.org>
missBerg
pushed a commit
to missBerg/ai-gateway
that referenced
this pull request
Dec 20, 2025
**Description** This fixes an issue in the benchmark where the context is cancelled before the request processing has finished. This causes Go to consider the connection as un-reusable, leading to 0% connection re-use. This only impacts servers that return data across multiple TCP packets (which is 100% legal thing to do), hence why it is not always showing up. With this change: ``` name time/op MCP/Baseline_NoProxy-32 44.5µs ± 3% MCP/Agent_Gateway-32 84.4µs ± 1% MCP/EAIGW_EAIGW_Default 21.3ms ± 0% MCP/EAIGW_Config_100-32 257µs ± 0% ``` Before this change: ``` name time/op MCP/Agent_Gateway-32 183µs ± 3% ``` Additionally, pulls in modelcontextprotocol/go-sdk#709 which is also required to fix this. :heart: Signed-off-by: John Howard <john.howard@solo.io> Co-authored-by: Ignasi Barrera <ignasi@tetrate.io> Signed-off-by: Erica Hughberg <erica.sundberg.90@gmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
I am seeing a bug where connections with a Streamable HTTP client are not reused at all. This makes performance terrible, since each request establishes a new TCP connection.
I was able to root cause this down to behavior of the
processStreamfunction. This sets up a bufio.NewScanner, and waits to get data. Once we get a message, we abort and close the connection.At this point, there are two paths:
body.Close()before an EOF, which causes the HTTP client to not attempt to re-use the connection.In my case, the server I am using reliable sends the response over 2 TCP packets (one with the event, and the other with the closure of the
chunkedencoding (0)). This causes the body to never be fully read.This PR fixes that case by unconditionally reading the full body before we close the connection. With this change, I am able to reliably re-use connections for future requests.