
test: skip test-diagnostics-channel-memory-leak.js #50327

Merged
merged 1 commit into nodejs:main on Nov 3, 2023

Conversation

joyeecheung
Member

@joyeecheung joyeecheung commented Oct 22, 2023

test: skip test-diagnostics-channel-memory-leak.js

There is currently no reliable way to detect this leak because:

  1. We cannot reliably get a reference to the channel from the
    API to detect finalization without creating another strong reference.
  2. This test does gc() and then checks memory usage - however the
    use of gc() disables code aging which can actually lead to increased
    memory usage overall, as it is not intended to be used to lower
    memory usage in the first place.
  3. The implementation of diagnostics channels relies on ephemeron GC,
    which is inefficient, so it's not reliable to use the typical "create
    a lot of objects and see if it crashes" trick to check leaks.

Skip the test for now until we find a way to test it reliably.

To avoid flakiness in the CI, it's better to remove an unreliable
test altogether.

I found this issue from an integration test run in https://chromium-review.googlesource.com/c/v8/v8/+/4962094

@nodejs-github-bot nodejs-github-bot added needs-ci PRs that need a full CI run. test Issues and PRs related to the tests. labels Oct 22, 2023
@joyeecheung joyeecheung added the request-ci Add this label to start a Jenkins CI on a PR. label Oct 22, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Oct 22, 2023
@joyeecheung
Member Author

joyeecheung commented Oct 28, 2023

Actually, I don't think we can use checkIfCollectable() to check this, given how the channels are themselves retained by a WeakMap holding WeakRefs. I am inclined to just remove this test, as there is no reliable way to detect this leak:

  1. Use of gc() in general should be avoided in leak detection tests due to the fact that it disables code aging for the compilation cache (which is why it was failing in the v8 integration test).
  2. Because it relies on ephemeron gc which is inefficient, it's not reliable to use the typical "create a lot of objects and see if it crashes" trick.

@joyeecheung joyeecheung changed the title test: use checkIfCollectable() in diganostics channel leak test test: remove test-diagnostics-channel-memory-leak.js Oct 28, 2023
@joyeecheung
Member Author

Updated to just remove this test. @jasnell can you take a look again?

also cc @Qard @theanarkh

@joyeecheung joyeecheung added the request-ci Add this label to start a Jenkins CI on a PR. label Oct 28, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Oct 28, 2023
@Qard
Member

Qard commented Oct 29, 2023

Would checking with FinalizationRegistry help? Could make a ton of instances, register each, and then check if any of them ever triggered the finalizer before the process exits. 🤔

@joyeecheung
Member Author

joyeecheung commented Oct 29, 2023

Would checking with FinalizationRegistry help? Could make a ton of instances, register each, and then check if any of them ever triggered the finalizer before the process exits. 🤔

That's what I tried with checkIfCollectable() (which does exactly this), and no, it doesn't work - it is still flaky. Locally the finalizer is triggered after a few instances, but in the CI that's a different story. Technically this is also not what FinalizationRegistry guarantees - it's still spec-compliant for the finalizer to never be called. I think the lesson learned from #49710 is that FinalizationRegistry, even combined with heap snapshots (another inappropriate way to trigger GC), is still unreliable in detecting leaks, and leak detection in JS land can only be done as a best effort. If false positives are inevitable, just don't test it.
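For reference, the checkIfCollectable() pattern being discussed boils down to something like this (a simplified sketch, not the actual helper in Node's test/common; the function name is reused here for illustration only):

```javascript
// Simplified sketch of a FinalizationRegistry-based collectability check.
// Finalizers are only *eventually* (and not necessarily ever) called, so a
// negative result is not proof of a leak -- which is why this stays flaky.
function checkIfCollectable(factory, count = 100) {
  return new Promise((resolve) => {
    let finalized = 0;
    const registry = new FinalizationRegistry(() => {
      finalized++;
      resolve(true); // at least one instance was collected
    });
    for (let i = 0; i < count; i++) {
      // No strong reference is kept, so the objects are eligible for GC...
      // ...but the spec never guarantees the finalizer actually runs.
      registry.register(factory(), i);
    }
    if (typeof globalThis.gc === 'function') {
      globalThis.gc(); // only available with --expose-gc
    }
    // Give finalizers a chance to run, then report best-effort result.
    setTimeout(() => resolve(finalized > 0), 100);
  });
}
```

Whether the promise resolves to true depends entirely on GC timing, which is exactly the non-determinism described above.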

@Qard
Member

Qard commented Oct 29, 2023

I mean, diagnostics_channel itself is using FinalizationRegistry for the cleanup this is supposed to be testing, so if FinalizationRegistry is unreliable isn't that a problem? 🤔

Or is this just an issue of the particular behaviour of --expose_gc? Would there be a way to trigger a deeper GC at the C++ level that we could expose for the tests?

@joyeecheung
Member Author

joyeecheung commented Oct 30, 2023

so if FinalizationRegistry is unreliable isn't that a problem? 🤔

Yes, which is why FinalizationRegistry should be avoided if possible (https://github.com/tc39/proposal-weakrefs#a-note-of-caution) - it seems the diagnostics channel implementation is doing something the proposal specifically advises against, though that's out of the scope of this PR.

Or is this just an issue of the particular behaviour of --expose_gc? Would there be a way to trigger a deeper GC at the C++ level that we could expose for the tests?

--expose_gc (or specifically the forced GC it allows) alters the GC mechanism slightly in that it disables code aging (so compiled code is never GC'ed, which led to the failure of the original test once there was more (unrelated) compiled code in the heap). V8 intentionally does not provide a public API to "trigger a deeper GC" (at least not one that's guaranteed to clear the specific objects you think should be cleared) because GC is supposed to be internal to the JavaScript engine implementation. I think any test that relies on "a deeper GC" to clear specific objects, especially just to deflake it, is not really worth the maintenance cost. If such a test doesn't flake, it's okay to keep it, but if it flakes, it's better not to keep an unreliable, flaky test.

@joyeecheung
Member Author

joyeecheung commented Oct 30, 2023

If there are no more concerns about removing the test by Wednesday UTC, I am going to land this change to unblock https://chromium-review.googlesource.com/c/v8/v8/+/4962094 - if it's possible to invent a leak detection test that does not flake, it can always be added later, but I think the current test is just flaky in nature, so it's better to remove it for now.


@Qard Qard left a comment


Approving to unblock this, but I'd prefer to avoid deleting the test if we can. Would adding the test to a test/**.status file as flaky work rather than deleting it? I don't know enough about how the flaky tagging works to know if that would get things passing while still keeping the test around to unmark in the future if we fix it.

As for diagnostics_channel using FinalizationRegistry, it's not ideal that it might not clean up, but it was basically a solution to the problem that dynamically generated channels (which are discouraged) would otherwise stick around forever. It's an attempt to prevent users from shooting themselves in the foot, though perhaps not 100% successful. If you've got any ideas on how to do it more reliably, I would love to hear them. I think it's probably acceptable as it is, though, as it's hopefully at least better than definitely leaking. As it is, it just may leak an empty WeakRef instance. 😅
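The retention scheme described here can be sketched like this (a simplified illustration of the idea, not Node's actual internal code; all names are made up):

```javascript
// Sketch of the idea: channels are held via WeakRef in a registry so they
// can be collected, and a FinalizationRegistry removes dead entries.
const channels = new Map();

const cleanup = new FinalizationRegistry((name) => {
  // May never run -- cleanup is best-effort, as discussed in the thread.
  const ref = channels.get(name);
  if (ref !== undefined && ref.deref() === undefined) {
    channels.delete(name);
  }
});

function channel(name) {
  // Reuse the existing channel if it is still alive.
  const existing = channels.get(name)?.deref();
  if (existing !== undefined) return existing;

  const ch = { name, subscribers: [] }; // stand-in for a real Channel object
  channels.set(name, new WeakRef(ch));
  cleanup.register(ch, name);
  return ch;
}
```

While a channel is strongly referenced, deref() keeps returning it; once the last strong reference drops, the Map entry can only be pruned if the finalizer eventually runs, so in the worst case an empty WeakRef lingers.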

@joyeecheung
Member Author

I don't know enough about how the flaky tagging works to know if that would get things passing while still keeping the test around to unmark in the future if we fix it.

I think in this case having a flaky test is probably not the best solution, because this would be intentionally making the CI more orange. Maybe we can mark it as SKIP directly if we really want to keep the test?
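For context, Node's test runner supports status files for this kind of marking; a hypothetical entry (the exact file and section are an assumption, not what was landed) would look like:

```
# test/parallel/parallel.status (hypothetical entry)
prefix parallel

[true] # This section applies to all platforms
test-diagnostics-channel-memory-leak: SKIP
```

Unlike PASS,FLAKY, a SKIP entry keeps the CI green while preserving the test file in the tree.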

@Qard
Member

Qard commented Nov 1, 2023

SKIP would be fine too, if that gets us to green without throwing out the context that we were testing for this at some point and should probably find a better way to test it in the future. 😅

Otherwise, I'm okay with removing the test entirely if there's not a good way to keep it around without it interfering with our CI. Flaky tests are definitely bad tests. 😬

@joyeecheung joyeecheung changed the title test: remove test-diagnostics-channel-memory-leak.js test: skip test-diagnostics-channel-memory-leak.js Nov 3, 2023
@joyeecheung joyeecheung added the request-ci Add this label to start a Jenkins CI on a PR. label Nov 3, 2023
@joyeecheung joyeecheung added the commit-queue Add this label to land a pull request using GitHub Actions. label Nov 3, 2023
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Nov 3, 2023
@nodejs-github-bot nodejs-github-bot merged commit 94156e3 into nodejs:main Nov 3, 2023
53 checks passed
@nodejs-github-bot
Collaborator

Landed in 94156e3

@joyeecheung
Member Author

joyeecheung commented Nov 4, 2023

An interesting idea occurred to me about how to test this somewhat more reliably, using a (currently somewhat internal) V8 API (v8::debug::QueryObjects), and the CI looks happy with it: https://ci.nodejs.org/job/node-test-commit/66307/ - I'll see if it's possible to upstream my V8 patches to expose this to embedders; some restrictions may need to be added to prevent embedders from using it too liberally.

@Qard
Member

Qard commented Nov 4, 2023

I'm all for more reliable leak checking. 🙂


anonrig pushed a commit to anonrig/node that referenced this pull request Nov 9, 2023
PR-URL: nodejs#50327
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Stephen Belanger <admin@stephenbelanger.com>
targos pushed a commit that referenced this pull request Nov 11, 2023
targos pushed a commit that referenced this pull request Nov 14, 2023
UlisesGascon pushed a commit that referenced this pull request Dec 11, 2023
@UlisesGascon UlisesGascon mentioned this pull request Dec 12, 2023
codebytere added a commit to electron/electron that referenced this pull request Jan 10, 2024
zcbenz pushed a commit to electron/electron that referenced this pull request Jan 12, 2024
zcbenz pushed a commit to electron/electron that referenced this pull request Jan 12, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 15, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 15, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 16, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 18, 2024
jkleinsc pushed a commit to electron/electron that referenced this pull request Jan 18, 2024
codebytere added a commit to electron/electron that referenced this pull request Jan 18, 2024
jkleinsc pushed a commit to electron/electron that referenced this pull request Jan 18, 2024
* chore: bump node in DEPS to v20.11.0

* module: bootstrap module loaders in shadow realm

nodejs/node#48655

* src: add commit hash shorthand in zlib version

nodejs/node#50158

* v8,tools: expose necessary V8 defines

nodejs/node#50820

* esm: do not call getSource when format is commonjs

nodejs/node#50465

* esm: fallback to readFileSync when source is nullish

nodejs/node#50825

* vm: allow dynamic import with a referrer realm

nodejs/node#50360

* test: skip test-diagnostics-channel-memory-leak.js

nodejs/node#50327

* esm: do not call getSource when format is commonjs

nodejs/node#50465

* lib: fix assert throwing different error messages in ESM and CJS

nodejs/node#50634

* src: fix compatility with upcoming V8 12.1 APIs

nodejs/node#50709

* deps: update base64 to 0.5.1

nodejs/node#50629

* src: avoid silent coercion to signed/unsigned int

nodejs/node#50663

* src: fix compatility with upcoming V8 12.1 APIs

nodejs/node#50709

* chore: fix patch indices

* chore: update patches

* test: disable TLS cipher test

This can't be enabled owing to BoringSSL incompatibilities.

nodejs/node#50186

* fix: check for Buffer and global definition in shadow realm

nodejs/node#51239

* test: disable parallel/test-shadow-realm-custom-loader

Incompatible with our asar logic, resulting in the following failure:

> Failed to CompileAndCall electron script: electron/js2c/asar_bundle

* chore: remove deleted parallel/test-crypto-modp1-error test

* test: make test-node-output-v8-warning generic

nodejs/node#50421

* chore: fixup ModuleWrap patch

* test: match wpt/streams/transferable/transform-stream-members.any.js to upstream

* fix: sandbox is not enabled on arm

* chore: disable v8 sandbox on ia32/arm

---------

Co-authored-by: electron-roller[bot] <84116207+electron-roller[bot]@users.noreply.github.com>
Co-authored-by: Shelley Vohr <shelley.vohr@gmail.com>
Co-authored-by: Cheng Zhao <zcbenz@gmail.com>