Fix memory leak when double invoking RestChannel.sendResponse #89873
When using the resource-handling channel, we must make sure that if we invoke it a second time after having already sent (or tried to send) a response, which IMO is a bug in the caller, we at least release the memory in the channel's outbound buffer. Otherwise we leak any memory from that buffer that was used to build the `RestResponse` that now fails to send.
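To make the failure mode concrete, here is a minimal, self-contained sketch of the idea. It is not the actual Elasticsearch code: `PooledBuffer`, `Response`, `Channel`, and `ResourceHandlingChannel` are hypothetical stand-ins, and the sketch assumes the delegate takes ownership of the buffer only when the send succeeds.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class DoubleSendSketch {

    /** Hypothetical stand-in for pooled memory that must be released exactly once. */
    static final class PooledBuffer {
        private final AtomicBoolean released = new AtomicBoolean();

        void release() {
            if (released.compareAndSet(false, true) == false) {
                throw new IllegalStateException("buffer already released");
            }
        }

        boolean isReleased() {
            return released.get();
        }
    }

    /** Hypothetical stand-in for a response whose body lives in a pooled buffer. */
    static final class Response {
        final PooledBuffer buffer = new PooledBuffer();
    }

    interface Channel {
        void sendResponse(Response response);
    }

    /** Accepts one response; a later one is rejected, but its buffer is still released. */
    static final class ResourceHandlingChannel implements Channel {
        private final Channel delegate;
        private final AtomicBoolean closed = new AtomicBoolean();

        ResourceHandlingChannel(Channel delegate) {
            this.delegate = delegate;
        }

        @Override
        public void sendResponse(Response response) {
            boolean success = false;
            try {
                if (closed.compareAndSet(false, true) == false) {
                    throw new IllegalStateException("channel is already closed");
                }
                // Assumption: on success the delegate owns the buffer and releases it after writing.
                delegate.sendResponse(response);
                success = true;
            } finally {
                if (success == false) {
                    // Rejected or failed send: nobody else will free this memory.
                    response.buffer.release();
                }
            }
        }
    }

    /** Sends twice on one channel and reports whether the rejected response's buffer was released. */
    static boolean secondBufferReleasedAfterDoubleSend() {
        Channel channel = new ResourceHandlingChannel(r -> r.buffer.release());
        channel.sendResponse(new Response());
        Response second = new Response();
        try {
            channel.sendResponse(second);
        } catch (IllegalStateException expected) {
            // The double invocation is still reported to the caller.
        }
        return second.buffer.isReleased();
    }

    public static void main(String[] args) {
        System.out.println(secondBufferReleasedAfterDoubleSend());
    }
}
```

Without the `finally` block, the second `Response` in this sketch would keep its buffer allocated after the exception, which is the kind of leak described above.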
Pinging @elastic/es-distributed (Team:Distributed)
Hi @original-brownbear, I've created a changelog YAML for you.
LGTM - was this happening somewhere?
Thanks Tim!
Yup, there's been some logging for this in Cloud logs ever since we moved to the Netty allocator for REST responses and got leak detection (mostly from EQL, where there's an obvious bug behind it that I'll open a fix for in a bit).
💔 Backport failed
You can use sqren/backport to manually backport by running |
…c#89873) When using the resource handling channel we must make sure that if we (by what is IMO a bug) try to double invoke it after having already sent a response (or tried to do so) we at least release the memory in the channel's outbound buffer. Otherwise we will leak any memory from it that was used to create the now failing to send `RestResponse`.
#89881) When using the resource handling channel we must make sure that if we (by what is IMO a bug) try to double invoke it after having already sent a response (or tried to do so) we at least release the memory in the channel's outbound buffer. Otherwise we will leak any memory from it that was used to create the now failing to send `RestResponse`.
#89885) When using the resource handling channel we must make sure that if we (by what is IMO a bug) try to double invoke it after having already sent a response (or tried to do so) we at least release the memory in the channel's outbound buffer. Otherwise we will leak any memory from it that was used to create the now failing to send `RestResponse`.
LGTM2, and yes, this sounds like a bug to me too. My only question is whether we could assert that this doesn't happen (ofc fixing any cases where it does first).
It's not entirely trivial, unfortunately, because of the current tests we have, but yes, we should try to move towards that assertion.
Ok, I opened #89902 to track that.
…ticationAction This fixes an obvious bug where the listener was resolved twice if any of the first two failure conditions in the changed method were met. Prior to elastic#89873 this would lead to a memory leak.
…ticationAction (elastic#89930) This fixes an obvious bug where the listener was resolved twice if any of the first two failure conditions in the changed method were met. Prior to elastic#89873 this would lead to a memory leak.
…eAuthenticationAction (#89930) (#89953) * Fix double sending of response in TransportOpenIdConnectPrepareAuthenticationAction (#89930) This fixes an obvious bug where the listener was resolved twice if any of the first two failure conditions in the changed method were met. Prior to #89873 this would lead to a memory leak. * fix compile
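The double resolution described in these commit messages usually has a simple shape: a failure branch completes the listener and then falls through to the normal path. The sketch below illustrates that shape only. It is not the code of the action being fixed, and `Listener`, `CountingListener`, `prepareBuggy`, and `prepareFixed` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ListenerOnceSketch {

    /** Hypothetical minimal listener; it should be completed exactly once. */
    interface Listener<T> {
        void onResponse(T value);

        void onFailure(Exception e);
    }

    /** Counts how many times it was completed, by either path. */
    static final class CountingListener implements Listener<String> {
        final AtomicInteger completions = new AtomicInteger();

        @Override
        public void onResponse(String value) {
            completions.incrementAndGet();
        }

        @Override
        public void onFailure(Exception e) {
            completions.incrementAndGet();
        }
    }

    /** Buggy shape: the failure branch does not return, so the listener is resolved twice. */
    static void prepareBuggy(String realmName, Listener<String> listener) {
        if (realmName == null) {
            listener.onFailure(new IllegalArgumentException("realm name is required"));
            // Missing return: execution falls through to onResponse below.
        }
        listener.onResponse("prepared for " + realmName);
    }

    /** Fixed shape: return right after reporting the failure. */
    static void prepareFixed(String realmName, Listener<String> listener) {
        if (realmName == null) {
            listener.onFailure(new IllegalArgumentException("realm name is required"));
            return;
        }
        listener.onResponse("prepared for " + realmName);
    }

    /** Runs one variant with a missing realm name and returns the number of completions. */
    static int completions(boolean fixed) {
        CountingListener listener = new CountingListener();
        if (fixed) {
            prepareFixed(null, listener);
        } else {
            prepareBuggy(null, listener);
        }
        return listener.completions.get();
    }

    public static void main(String[] args) {
        System.out.println("buggy completions: " + completions(false));
        System.out.println("fixed completions: " + completions(true));
    }
}
```

When such a listener ultimately writes to a REST channel, the second completion is what turns into the double `sendResponse` call.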