Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory leak when double invoking RestChannel.sendResponse #89873

Conversation

original-brownbear
Copy link
Member

When using the resource handling channel we must make sure that if we (by what is IMO a bug) try to double invoke it after having already sent a response (or tried to do so) we at least release the memory in the channel's outbound buffer. Otherwise we will leak any memory from it that was used to create the now failing to send RestResponse.

When using the resource handling channel we must make sure that
if we (by what is IMO a bug) try to double invoke it after
having already sent a response (or tried to do so) we at least
release the memory in the channel's outbound buffer.
Otherwise we will leak any memory from it that was used to create
the now failing to send `RestResponse`.
@elasticsearchmachine elasticsearchmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Sep 7, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine
Copy link
Collaborator

Hi @original-brownbear, I've created a changelog YAML for you.

Copy link
Contributor

@Tim-Brooks Tim-Brooks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - was this happening somewhere?

@original-brownbear
Copy link
Member Author

original-brownbear commented Sep 7, 2022

Thanks Tim!

was this happening somewhere?

Jup there's some logging for this in Cloud logs ever since we moved to the Netty allocator for Rest responses and get leak detection (mostly from EQL where there's an obvious bug behind it that I'll open a fix for in a bit).

@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
7.17 Commit could not be cherrypicked due to conflicts
8.4

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 89873

original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Sep 7, 2022
…c#89873)

When using the resource handling channel we must make sure that
if we (by what is IMO a bug) try to double invoke it after
having already sent a response (or tried to do so) we at least
release the memory in the channel's outbound buffer.
Otherwise we will leak any memory from it that was used to create
the now failing to send `RestResponse`.
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Sep 7, 2022
…c#89873)

When using the resource handling channel we must make sure that
if we (by what is IMO a bug) try to double invoke it after
having already sent a response (or tried to do so) we at least
release the memory in the channel's outbound buffer.
Otherwise we will leak any memory from it that was used to create
the now failing to send `RestResponse`.
elasticsearchmachine pushed a commit that referenced this pull request Sep 7, 2022
#89881)

When using the resource handling channel we must make sure that
if we (by what is IMO a bug) try to double invoke it after
having already sent a response (or tried to do so) we at least
release the memory in the channel's outbound buffer.
Otherwise we will leak any memory from it that was used to create
the now failing to send `RestResponse`.
elasticsearchmachine pushed a commit that referenced this pull request Sep 7, 2022
#89885)

When using the resource handling channel we must make sure that
if we (by what is IMO a bug) try to double invoke it after
having already sent a response (or tried to do so) we at least
release the memory in the channel's outbound buffer.
Otherwise we will leak any memory from it that was used to create
the now failing to send `RestResponse`.
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Sep 8, 2022
* main: (175 commits)
  Fix integration test on Windows (elastic#89894)
  Avoiding the use of dynamic map keys in the cluster_formation results of the stable master health indicator (elastic#89842)
  Mute org.elasticsearch.tracing.apm.ApmIT.testCapturesTracesForHttpTraffic (elastic#89891)
  Fix typos in audit event types (elastic#89886)
  Synthetic _source: support histogram field (elastic#89833)
  [TSDB] Rename rollup public API to downsample  (elastic#89809)
  Format script values access (elastic#89780)
  [DOCS] Simplifies composite aggregation recommendation (elastic#89878)
  [DOCS] Update CCS compatibility matrix for 8.3 (elastic#88906)
  Fix memory leak when double invoking RestChannel.sendResponse (elastic#89873)
  [ML] Add processor autoscaling decider (elastic#89645)
  Update disk-usage.asciidoc (elastic#89709) (elastic#89874)
  Add allocation deciders in createComponents (elastic#89836)
  Mute flaky H3LatLonGeometryTest.testIndexPoints (elastic#89870)
  Fix typo in get-snapshot-status-api doc (elastic#89865)
  Picking master eligible node at random in the master stability health indicator (elastic#89841)
  Do not reuse the client after a disruption elastic#89815 (elastic#89866)
  [ML] Distribute trained model allocations across availability zones (elastic#89822)
  Increment clientCalledCount before onResponse (elastic#89858)
  AwaitsFix for elastic#89867
  ...
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Sep 8, 2022
* main: (175 commits)
  Fix integration test on Windows (elastic#89894)
  Avoiding the use of dynamic map keys in the cluster_formation results of the stable master health indicator (elastic#89842)
  Mute org.elasticsearch.tracing.apm.ApmIT.testCapturesTracesForHttpTraffic (elastic#89891)
  Fix typos in audit event types (elastic#89886)
  Synthetic _source: support histogram field (elastic#89833)
  [TSDB] Rename rollup public API to downsample  (elastic#89809)
  Format script values access (elastic#89780)
  [DOCS] Simplifies composite aggregation recommendation (elastic#89878)
  [DOCS] Update CCS compatibility matrix for 8.3 (elastic#88906)
  Fix memory leak when double invoking RestChannel.sendResponse (elastic#89873)
  [ML] Add processor autoscaling decider (elastic#89645)
  Update disk-usage.asciidoc (elastic#89709) (elastic#89874)
  Add allocation deciders in createComponents (elastic#89836)
  Mute flaky H3LatLonGeometryTest.testIndexPoints (elastic#89870)
  Fix typo in get-snapshot-status-api doc (elastic#89865)
  Picking master eligible node at random in the master stability health indicator (elastic#89841)
  Do not reuse the client after a disruption elastic#89815 (elastic#89866)
  [ML] Distribute trained model allocations across availability zones (elastic#89822)
  Increment clientCalledCount before onResponse (elastic#89858)
  AwaitsFix for elastic#89867
  ...

# Conflicts:
#	x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/downsample/RollupShardIndexer.java
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Sep 8, 2022
* main: (283 commits)
  Fix integration test on Windows (elastic#89894)
  Avoiding the use of dynamic map keys in the cluster_formation results of the stable master health indicator (elastic#89842)
  Mute org.elasticsearch.tracing.apm.ApmIT.testCapturesTracesForHttpTraffic (elastic#89891)
  Fix typos in audit event types (elastic#89886)
  Synthetic _source: support histogram field (elastic#89833)
  [TSDB] Rename rollup public API to downsample  (elastic#89809)
  Format script values access (elastic#89780)
  [DOCS] Simplifies composite aggregation recommendation (elastic#89878)
  [DOCS] Update CCS compatibility matrix for 8.3 (elastic#88906)
  Fix memory leak when double invoking RestChannel.sendResponse (elastic#89873)
  [ML] Add processor autoscaling decider (elastic#89645)
  Update disk-usage.asciidoc (elastic#89709) (elastic#89874)
  Add allocation deciders in createComponents (elastic#89836)
  Mute flaky H3LatLonGeometryTest.testIndexPoints (elastic#89870)
  Fix typo in get-snapshot-status-api doc (elastic#89865)
  Picking master eligible node at random in the master stability health indicator (elastic#89841)
  Do not reuse the client after a disruption elastic#89815 (elastic#89866)
  [ML] Distribute trained model allocations across availability zones (elastic#89822)
  Increment clientCalledCount before onResponse (elastic#89858)
  AwaitsFix for elastic#89867
  ...
@DaveCTurner
Copy link
Contributor

... IMO a bug) try to double invoke it

LGTM2, and yes this sounds like a bug to me too. My only question is whether we could assert that this doesn't happen (ofc fixing any cases where it does first).

@original-brownbear
Copy link
Member Author

My only question is whether we could assert that this doesn't happen

It's not entirely trivial unfortunately because of the current tests we have, but yes we should try to move towards that assertion.

@DaveCTurner
Copy link
Contributor

Ok, I opened #89902 to track that.

original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Sep 8, 2022
…ticationAction

This fixes an obvious bug where the listener was resolved twice if any of the first
two failure conditions in the changed method were met.
Prior to elastic#89873 this would lead to a memory leak.
original-brownbear added a commit that referenced this pull request Sep 9, 2022
…ticationAction (#89930)

This fixes an obvious bug where the listener was resolved twice if any of the first
two failure conditions in the changed method were met.
Prior to #89873 this would lead to a memory leak.
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Sep 9, 2022
…ticationAction (elastic#89930)

This fixes an obvious bug where the listener was resolved twice if any of the first
two failure conditions in the changed method were met.
Prior to elastic#89873 this would lead to a memory leak.
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Sep 9, 2022
…ticationAction (elastic#89930)

This fixes an obvious bug where the listener was resolved twice if any of the first
two failure conditions in the changed method were met.
Prior to elastic#89873 this would lead to a memory leak.
elasticsearchmachine pushed a commit that referenced this pull request Sep 9, 2022
…ticationAction (#89930) (#89954)

This fixes an obvious bug where the listener was resolved twice if any of the first
two failure conditions in the changed method were met.
Prior to #89873 this would lead to a memory leak.
elasticsearchmachine pushed a commit that referenced this pull request Sep 9, 2022
…eAuthenticationAction (#89930) (#89953)

* Fix double sending of response in TransportOpenIdConnectPrepareAuthenticationAction (#89930)

This fixes an obvious bug where the listener was resolved twice if any of the first
two failure conditions in the changed method were met.
Prior to #89873 this would lead to a memory leak.

* fix compile
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed Coordination/Network Http and internode communication implementations Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.17.7 v8.4.2 v8.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants