@howardjohn
I am seeing a bug where connections made by a Streamable HTTP client are never reused. This makes performance terrible, since each request establishes a new TCP connection.

I was able to root-cause this to the behavior of the processStream function. It sets up a bufio.NewScanner and waits for data. Once we get a message, we abort and close the connection.

At this point, there are two paths:

  • The bufio reader read the full response body to EOF, and then we closed. This case is fine; we can re-use the idle connection.
  • The bufio reader did not read the full response body. It may have read all of the actual data that will be sent, but because the response is streamed, the end of the body had not yet been delivered. In this case, we call body.Close() before EOF, which causes the HTTP client not to re-use the connection.

In my case, the server I am using reliably sends the response over 2 TCP packets (one with the event, and the other with the terminating chunk of the chunked encoding (0)). This causes the body to never be fully read.

This PR fixes that case by unconditionally reading the full body before we close the connection. With this change, I am able to reliably re-use connections for future requests.
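
For readers who want to reproduce the two paths, here is a minimal, self-contained sketch. It is not the SDK's processStream code. The helper names (newEventServer, readFirstEvent, secondRequestReused) and the 50ms delay are assumptions made for illustration. The test server flushes one event and only sends the terminating chunk later, and the client uses net/http/httptrace to report whether a request was served on a reused connection.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"time"
)

// newEventServer (hypothetical helper) flushes one event immediately and
// finishes the handler a little later, so the terminating chunk of the
// chunked encoding reaches the client separately from the event itself.
func newEventServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/event-stream")
		fmt.Fprint(w, "data: hello\n\n")
		if f, ok := w.(http.Flusher); ok {
			f.Flush() // send the event now; the final chunk follows on return
		}
		time.Sleep(50 * time.Millisecond) // assumed delay before the body ends
	}))
}

// readFirstEvent reads a single line from the response and then closes the
// body. When drain is true it first reads the rest of the body to EOF.
// It reports whether the request ran on a reused connection.
func readFirstEvent(client *http.Client, url string, drain bool) (reused bool, err error) {
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
	}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return false, err
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()

	if !bufio.NewScanner(resp.Body).Scan() {
		return reused, fmt.Errorf("no event received")
	}
	if drain {
		// Reading to EOF lets the transport return the connection to its idle pool.
		if _, err := io.Copy(io.Discard, resp.Body); err != nil {
			return reused, err
		}
	}
	return reused, nil
}

// secondRequestReused issues two requests with a fresh transport and reports
// whether the second one reused the connection opened by the first.
func secondRequestReused(drain bool) bool {
	srv := newEventServer()
	defer srv.Close()
	tr := &http.Transport{}
	defer tr.CloseIdleConnections()
	client := &http.Client{Transport: tr}

	var reused bool
	for i := 0; i < 2; i++ {
		var err error
		if reused, err = readFirstEvent(client, srv.URL, drain); err != nil {
			panic(err)
		}
	}
	return reused
}

func main() {
	fmt.Println("close without draining, second request reused:", secondRequestReused(false))
	fmt.Println("drain before closing, second request reused:", secondRequestReused(true))
}
```

With the drain in place, the second request should report a reused connection. Without it, the outcome depends on whether the terminating chunk happened to arrive before Close, which is the timing sensitivity described above.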

@findleyr

Thanks for lending your expertise to this problem, and for the clear description of the issue!

Reviewing.

@findleyr findleyr merged commit b012d08 into modelcontextprotocol:main Dec 10, 2025
5 checks passed
howardjohn added a commit to howardjohn/ai-gateway that referenced this pull request Dec 10, 2025
This fixes an issue in the benchmark where the context is cancelled
before the request processing has finished. This causes Go to consider
the connection un-reusable, leading to 0% connection re-use. This
only impacts servers that return data across multiple TCP packets (which
is a 100% legal thing to do), which is why it does not always show up.

Additionally, pulls in
modelcontextprotocol/go-sdk#709.

Signed-off-by: John Howard <john.howard@solo.io>
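
The context-cancellation effect described in this commit message can be sketched in isolation. This is an illustrative program, not the benchmark code from ai-gateway. The helper names (newEventServer, call, reusedOnSecondCall) and the 50ms delay are assumptions. It compares cancelling the request context before the body has reached EOF with cancelling it only after the body has been fully read.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"time"
)

// newEventServer (hypothetical helper) sends one event, then ends the
// response slightly later so the end of the body arrives separately.
func newEventServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "data: hello\n\n")
		if f, ok := w.(http.Flusher); ok {
			f.Flush()
		}
		time.Sleep(50 * time.Millisecond) // assumed delay before the body ends
	}))
}

// call performs one request. With cancelEarly it cancels the request context
// right after the first event; otherwise it reads the body to EOF and lets
// the deferred cancel run afterwards. It reports whether the connection used
// for this request was a reused one.
func call(client *http.Client, url string, cancelEarly bool) (reused bool, err error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	ctx = httptrace.WithClientTrace(ctx, &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
	})
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return false, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()

	if !bufio.NewScanner(resp.Body).Scan() {
		return reused, fmt.Errorf("no event received")
	}
	if cancelEarly {
		cancel() // request still in flight from the transport's point of view
		return reused, nil
	}
	// Finish the response first; the deferred cancel then happens after EOF.
	_, err = io.Copy(io.Discard, resp.Body)
	return reused, err
}

// reusedOnSecondCall makes two calls on a fresh transport and reports
// whether the second call reused the first call's connection.
func reusedOnSecondCall(cancelEarly bool) bool {
	srv := newEventServer()
	defer srv.Close()
	tr := &http.Transport{}
	defer tr.CloseIdleConnections()
	client := &http.Client{Transport: tr}

	var reused bool
	for i := 0; i < 2; i++ {
		var err error
		if reused, err = call(client, srv.URL, cancelEarly); err != nil {
			panic(err)
		}
	}
	return reused
}

func main() {
	fmt.Println("cancel before EOF, second call reused:", reusedOnSecondCall(true))
	fmt.Println("cancel after EOF, second call reused:", reusedOnSecondCall(false))
}
```

When the context is cancelled only after the body has been read to EOF, the second call should report a reused connection. Cancelling earlier makes reuse depend on whether the end of the body had already arrived.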
nacx added a commit to envoyproxy/ai-gateway that referenced this pull request Dec 15, 2025
**Description**

This fixes an issue in the benchmark where the context is cancelled
before the request processing has finished. This causes Go to consider
the connection un-reusable, leading to 0% connection re-use. This
only impacts servers that return data across multiple TCP packets (which
is a 100% legal thing to do), which is why it does not always show up.

With this change:
```
name                     time/op
MCP/Baseline_NoProxy-32  44.5µs ± 3%
MCP/Agent_Gateway-32     84.4µs ± 1%
MCP/EAIGW_EAIGW_Default  21.3ms ± 0%
MCP/EAIGW_Config_100-32   257µs ± 0%
```

Before this change:
```
name                  time/op
MCP/Agent_Gateway-32   183µs ± 3%
```

Additionally, pulls in
modelcontextprotocol/go-sdk#709 which is also
required to fix this.

:heart:

Signed-off-by: John Howard <john.howard@solo.io>
Co-authored-by: Ignasi Barrera <ignasi@tetrate.io>
Aias00 pushed a commit to Aias00/ai-gateway that referenced this pull request Dec 15, 2025
Signed-off-by: liuhy <liuhongyu@apache.org>
missBerg pushed a commit to missBerg/ai-gateway that referenced this pull request Dec 20, 2025
Signed-off-by: Erica Hughberg <erica.sundberg.90@gmail.com>