-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crash on node api add-on finalization #37236
Comments
@legendecas the assumption is that once the environment begins the process of being torn down, no more gc runs will happen. This test AFAICT forces gc runs after |
Is this related: #36868? |
@gabrielschulhof we just identified the problem in random CI failures on our system. The force GC in the repro is to ensure the reproduce is reliable to show how the problem happens in the case. The crashes do happens in a low rate, but I have to say it's not unlikely to happen: in our internal tests the ratio is roughly 1/6 to crash on exit. |
Yes, the crash call stacks seem very similar to the case. I strongly believe it's the same problem. |
@legendecas I ran the repro and it dies reliably even without |
diff --git a/src/js_native_api_v8.cc b/src/js_native_api_v8.cc
index e037c4297d..22932dc6b9 100644
--- a/src/js_native_api_v8.cc
+++ b/src/js_native_api_v8.cc
@@ -199,7 +199,8 @@ class RefBase : protected Finalizer, RefTracker {
void* finalize_hint)
: Finalizer(env, finalize_callback, finalize_data, finalize_hint),
_refcount(initial_refcount),
- _delete_self(delete_self) {
+ _delete_self(delete_self),
+ _is_gone(nullptr) {
Link(finalize_callback == nullptr
? &env->reflist
: &env->finalizing_reflist);
@@ -220,7 +221,10 @@ class RefBase : protected Finalizer, RefTracker {
finalize_hint);
}
- virtual ~RefBase() { Unlink(); }
+ virtual ~RefBase() {
+ if (_is_gone != nullptr) *_is_gone = true;
+ Unlink();
+ }
inline void* Data() {
return _finalize_data;
@@ -270,10 +274,14 @@ class RefBase : protected Finalizer, RefTracker {
protected:
inline void Finalize(bool is_env_teardown = false) override {
+ bool reference_is_gone = false;
+ _is_gone = &reference_is_gone;
if (_finalize_callback != nullptr) {
_env->CallFinalizer(_finalize_callback, _finalize_data, _finalize_hint);
}
+ if (reference_is_gone) return;
+
// this is safe because if a request to delete the reference
// is made in the finalize_callback it will defer deletion
// to this block and set _delete_self to true
@@ -287,6 +295,7 @@ class RefBase : protected Finalizer, RefTracker {
private:
uint32_t _refcount;
bool _delete_self;
+ bool* _is_gone;
};
class Reference : public RefBase { seems to avert the problem without breaking the tests. |
@legendecas in your repro you delete
So, it looks like the double free is not really caused by anything Node-API does wrong. |
NM. Looks like it crashes even if I remove the |
@legendecas I have been able to reduce the test case to legendecas/repro-napi-v8impl-refbase-double-free#1 while still retaining the crash. |
I think the key is the |
The finalizer normally never gets called while a reference is strong. However, during environment shutdown all finalizers must get called. In order to unify the deferring behavior with that of a regular finalization, we must force the reference to be weak when we call its finalizer during environment shutdown. Fixes: nodejs#37236 Co-authored-by: Chengzhong Wu <legendecas@gmail.com>
@gabrielschulhof Yeah, you're right. I found I've been mistaken on the repro. In the reproduction, there are strong references to the object, this is not the case I was intended to show in the first place. After reading #37303 and tried the PR on my internal case and it does crash regardlessly, I discovered that my original crash on double free of v8impl::reference is not caused by strong references but weak references. These references were set weak before the process going to exit - there is a uv ref on the corresponding handle so only after the set weak the process is going to exit. I'll double-check the re-production to make a reliable reproduction on weak references. |
The original code causing the crash is built on top of node-addon-api, and there is no |
The finalizer normally never gets called while a reference is strong. However, during environment shutdown all finalizers must get called. In order to unify the deferring behavior with that of a regular finalization, we must force the reference to be weak when we call its finalizer during environment shutdown. Fixes: nodejs#37236 Co-authored-by: Chengzhong Wu <legendecas@gmail.com>
https://github.com/legendecas/repro-napi-v8impl-refbase-double-free/blob/master/node_addon_api.cc I crafted a mimic of GC on weak references after addon env teardown. The original environment is running in nyc augmented, there are lots of test cases in the case. So I'm assuming there is a chance that gc may be triggered at the time of clean-up (dumping coverage data to disk). Anyway, the crafted reproduction is a quite valid case of node-addon-api/node-api, I suppose the problem is still valid in the case. (SideNote: the new case has been ran with the latest main branch with the fix of #37303, it does crash) |
@legendecas this is the crash I was able to reproduce:
This crashes at https://github.com/legendecas/repro-napi-v8impl-refbase-double-free/blob/master/node_addon_api.cc#L42 because one cannot call a JS function during environment teardown. It ultimately results in a fatal error, because one cannot throw exceptions in environment teardown either. We really need to distinguish between @mhdawson we may want to reconsider the semverity of |
@mhdawson in support of that, if we were to all of a sudden return |
@legendecas I forgot to mention that I got the above stack with 03806a0 (IOW with #37303 already landed). |
@gabrielschulhof I tried on a fresh build of main branch, it looks like there is a significant difference between v14.x with 03806a0 cherry-picked and the main branch results. On v14.x with 03806a0 cherry-picked, the test case does crash on v8impl::RefBase. However, I believe the initial thought is that there are chances that GC may run on the env teardown (not sure on main branch since it disallow javascript execution on env teardown) and get into race conditions as https://github.com/nodejs/node/blob/master/src/js_native_api_v8.cc#L232 suggests. |
@legendecas it sounds like the problem is fixed on the main branch but not on v14.x, right? I will try to repro with v14.x + 03806a0 when I get a chance. It sounds like the backport will be non-trivial. |
@gabrielschulhof on main branch calling into js on tear down will get an exception from v8 that we can not call into js at that time. So this reproduction does not work out - which relies on trigger GC by calling global.gc in js. And the exception thrown by v8 is a string, that will crash node-addon-api for nodejs/node-addon-api#912 (the crash above #37236 (comment)) So I tried to backport 03806a0 on v14.x locally and it did crash on v8impl::reference as expected. |
The finalizer normally never gets called while a reference is strong. However, during environment shutdown all finalizers must get called. In order to unify the deferring behavior with that of a regular finalization, we must force the reference to be weak when we call its finalizer during environment shutdown. Fixes: #37236 Co-authored-by: Chengzhong Wu <legendecas@gmail.com> PR-URL: #37303 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
The finalizer normally never gets called while a reference is strong. However, during environment shutdown all finalizers must get called. In order to unify the deferring behavior with that of a regular finalization, we must force the reference to be weak when we call its finalizer during environment shutdown. Fixes: nodejs#37236 Co-authored-by: Chengzhong Wu <legendecas@gmail.com> PR-URL: nodejs#37303 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
A gc may happen during environment teardown. Thus, during finalization initiated by environment teardown we must remove the V8 finalizer before calling the Node-API finalizer. Fixes: nodejs#37236 PR-URL: nodejs#37616
The finalizer normally never gets called while a reference is strong. However, during environment shutdown all finalizers must get called. In order to unify the deferring behavior with that of a regular finalization, we must force the reference to be weak when we call its finalizer during environment shutdown. Fixes: nodejs#37236 Co-authored-by: Chengzhong Wu <legendecas@gmail.com> PR-URL: nodejs#37303 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
A gc may happen during environment teardown. Thus, during finalization initiated by environment teardown we must remove the V8 finalizer before calling the Node-API finalizer. Fixes: nodejs#37236 PR-URL: nodejs#37616 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
The finalizer normally never gets called while a reference is strong. However, during environment shutdown all finalizers must get called. In order to unify the deferring behavior with that of a regular finalization, we must force the reference to be weak when we call its finalizer during environment shutdown. Fixes: nodejs#37236 Co-authored-by: Chengzhong Wu <legendecas@gmail.com> PR-URL: nodejs#37303 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
A gc may happen during environment teardown. Thus, during finalization initiated by environment teardown we must remove the V8 finalizer before calling the Node-API finalizer. Fixes: nodejs#37236 PR-URL: nodejs#37616 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
A gc may happen during environment teardown. Thus, during finalization initiated by environment teardown we must remove the V8 finalizer before calling the Node-API finalizer. Fixes: #37236 PR-URL: #37616 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
The finalizer normally never gets called while a reference is strong. However, during environment shutdown all finalizers must get called. In order to unify the deferring behavior with that of a regular finalization, we must force the reference to be weak when we call its finalizer during environment shutdown. Fixes: nodejs#37236 Co-authored-by: Chengzhong Wu <legendecas@gmail.com> PR-URL: nodejs#37303 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
A gc may happen during environment teardown. Thus, during finalization initiated by environment teardown we must remove the V8 finalizer before calling the Node-API finalizer. Fixes: nodejs#37236 PR-URL: nodejs#37616 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
The finalizer normally never gets called while a reference is strong. However, during environment shutdown all finalizers must get called. In order to unify the deferring behavior with that of a regular finalization, we must force the reference to be weak when we call its finalizer during environment shutdown. Fixes: nodejs#37236 Co-authored-by: Chengzhong Wu <legendecas@gmail.com> PR-URL: nodejs#37303 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
A gc may happen during environment teardown. Thus, during finalization initiated by environment teardown we must remove the V8 finalizer before calling the Node-API finalizer. Fixes: nodejs#37236 PR-URL: nodejs#37616 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
The finalizer normally never gets called while a reference is strong. However, during environment shutdown all finalizers must get called. In order to unify the deferring behavior with that of a regular finalization, we must force the reference to be weak when we call its finalizer during environment shutdown. Fixes: #37236 Co-authored-by: Chengzhong Wu <legendecas@gmail.com> PR-URL: #37303 Backport-PR-URL: #37802 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
A gc may happen during environment teardown. Thus, during finalization initiated by environment teardown we must remove the V8 finalizer before calling the Node-API finalizer. Fixes: #37236 PR-URL: #37616 Backport-PR-URL: #37802 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
Has the patch for this issue been merged into Node.js 12.x? I'm getting a |
The finalizer normally never gets called while a reference is strong. However, during environment shutdown all finalizers must get called. In order to unify the deferring behavior with that of a regular finalization, we must force the reference to be weak when we call its finalizer during environment shutdown. Fixes: nodejs#37236 Co-authored-by: Chengzhong Wu <legendecas@gmail.com> PR-URL: nodejs#37303 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
A gc may happen during environment teardown. Thus, during finalization initiated by environment teardown we must remove the V8 finalizer before calling the Node-API finalizer. Fixes: nodejs#37236 PR-URL: nodejs#37616 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
The finalizer normally never gets called while a reference is strong. However, during environment shutdown all finalizers must get called. In order to unify the deferring behavior with that of a regular finalization, we must force the reference to be weak when we call its finalizer during environment shutdown. Fixes: #37236 Co-authored-by: Chengzhong Wu <legendecas@gmail.com> PR-URL: #37303 Backport-PR-URL: #42512 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
A gc may happen during environment teardown. Thus, during finalization initiated by environment teardown we must remove the V8 finalizer before calling the Node-API finalizer. Fixes: #37236 PR-URL: #37616 Backport-PR-URL: #42512 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
What steps will reproduce the bug?
Repo to re-produce: https://github.com/legendecas/repro-napi-v8impl-refbase-double-free
How often does it reproduce? Is there a required condition?
Always.
What is the expected behavior?
No segment faults.
What do you see instead?
Segment faults on double free of
v8impl::<anonymous>::RefBase
. TheRefBase
s were deleted once one module's napi_env was going to destroy, and the installed weakv8impl::Persistent
s ofv8impl::<anonymous>Reference
was not destroyed and theseRefBase
will be deleted again on finalization callbacks.The text was updated successfully, but these errors were encountered: