Reuse ThreadsafeFunction in EventQueue #739
Conversation
Update: it is fixed now!
Force-pushed from 4caa935 to b9864a4, then from b9864a4 to 76853a9.
(Force pushed to fix lint errors)
I left a comment here mentioning a couple of my thoughts: #727 (comment)

Replied.
This is pretty much the implementation I was envisioning, yep. :-) I like the compromise of having `cx.queue()` return a shared queue but still allowing `EventQueue::new` to create a new one.
I'll leave organizational and documentation feedback to the Neon maintainers. (In particular, I wonder if trampoline.rs should be a submodule of event_queue, since it's not going to be reused for anything else.)
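The compromise described above can be sketched with a toy model. These are plain-Rust stand-ins, not Neon's actual types; `Trampoline`, the `shared` field, and the constructor bodies are all hypothetical, meant only to show the two construction paths:

```rust
use std::sync::Arc;

struct Trampoline; // placeholder for ThreadsafeTrampoline

struct EventQueue {
    trampoline: Arc<Trampoline>,
}

struct Context {
    shared: Arc<Trampoline>,
}

impl Context {
    // `cx.queue()` hands out a queue backed by the shared trampoline.
    fn queue(&self) -> EventQueue {
        EventQueue {
            trampoline: Arc::clone(&self.shared),
        }
    }
}

impl EventQueue {
    // `EventQueue::new` still creates a queue with its own trampoline.
    fn new(_cx: &mut Context) -> Self {
        EventQueue {
            trampoline: Arc::new(Trampoline),
        }
    }
}

fn main() {
    let mut cx = Context { shared: Arc::new(Trampoline) };
    let a = cx.queue();
    let b = cx.queue();
    // Both shared queues point at the same trampoline...
    assert!(Arc::ptr_eq(&a.trampoline, &b.trampoline));
    // ...while `new` allocates a fresh one.
    let fresh = EventQueue::new(&mut cx);
    assert!(!Arc::ptr_eq(&fresh.trampoline, &a.trampoline));
}
```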
src/trampoline.rs
Outdated
```rust
if self.ref_count == 0 {
    return;
}
```
This should probably panic, or however else Neon detects internal errors.
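A minimal sketch of what a panicking underflow check might look like. The struct and the `unref` signature here are simplified stand-ins for the PR's code, not the real implementation:

```rust
struct ThreadsafeTrampoline {
    ref_count: u32,
}

impl ThreadsafeTrampoline {
    // The early `return` in the diff silently swallows an underflow;
    // panicking instead surfaces the internal bookkeeping bug.
    fn unref(&mut self) {
        if self.ref_count == 0 {
            panic!("unref() called on a trampoline with zero references");
        }
        self.ref_count -= 1;
    }
}

fn main() {
    // Silence the expected panic message below.
    std::panic::set_hook(Box::new(|_| {}));

    let mut t = ThreadsafeTrampoline { ref_count: 1 };
    t.unref();
    assert_eq!(t.ref_count, 0);

    // A second unref now panics instead of silently returning.
    let result = std::panic::catch_unwind(move || t.unref());
    assert!(result.is_err());
}
```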
src/lifecycle.rs
Outdated
```rust
pub(crate) fn threadsafe_trampoline<'a, C: Context<'a>>(
    cx: &mut C,
) -> Arc<RwLock<ThreadsafeTrampoline>> {
    Arc::clone(&InstanceData::get(cx).threadsafe_trampoline)
```
Nitpick: it might be nice to lazily initialize this for programs that don't use EventQueues.
It sounds like in order to do that I'll have to store it in an `Arc<RwLock<Option<Arc<RwLock<ThreadsafeTrampoline>>>>>`, which doesn't look too nice to me. Is there any better way to do this?
You have an exclusive (`&mut`) reference to InstanceData, so you don't need extra locking. `Option<Arc<_>>` is sufficient.
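A sketch of that lazy-initialization idea: because the accessor takes `&mut self` (mirroring the exclusive access the `&mut Context` provides), a plain `Option` suffices with no extra lock. `InstanceData` here is a toy stand-in, not Neon's real type:

```rust
use std::sync::Arc;

struct ThreadsafeTrampoline; // placeholder

struct InstanceData {
    // None until the first EventQueue is created.
    threadsafe_trampoline: Option<Arc<ThreadsafeTrampoline>>,
}

impl InstanceData {
    // Exclusive access means we can lazily initialize without locking.
    fn threadsafe_trampoline(&mut self) -> Arc<ThreadsafeTrampoline> {
        Arc::clone(
            self.threadsafe_trampoline
                .get_or_insert_with(|| Arc::new(ThreadsafeTrampoline)),
        )
    }
}

fn main() {
    let mut data = InstanceData { threadsafe_trampoline: None };
    let a = data.threadsafe_trampoline();
    let b = data.threadsafe_trampoline();
    // Initialized once; subsequent calls return the same Arc.
    assert!(Arc::ptr_eq(&a, &b));
}
```

Programs that never touch an EventQueue simply leave the field as `None`.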
src/trampoline.rs
Outdated
```rust
pub(crate) struct ThreadsafeTrampoline {
    tsfn: ThreadsafeFunction<Callback>,
    ref_count: u32,
```
Nitpick: if this were AtomicU32, you could avoid the lock. (You'd be able to use Relaxed operations safely because you're only checking the refcount, not relying on any other data to have been synchronized.)
Sadly, the lock is needed because `.reference()`/`.unref()` calls need mutable access to the `ThreadsafeFunction`.
Since they require `&mut Context`, it implies the operations are serialized. We could get away with a `Cell` instead.
But, that might be an optimization better left for when it's demonstrated to be a performance bottleneck.
Yeah, I don't imagine this being a bottleneck, and it is way simpler to do it with a mutex. I'm open to changing it, though.
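For illustration, a minimal lock-free counter along the lines suggested. Relaxed ordering is enough here because nothing else is synchronized through the counter; this is a sketch, not the PR's code, and the actual N-API ref/unref calls would still be serialized by `&mut Context`:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// A refcount that needs no mutex: the count itself is the only shared
// state, so Relaxed operations are safe.
struct RefCount(AtomicU32);

impl RefCount {
    fn increment(&self) -> u32 {
        self.0.fetch_add(1, Ordering::Relaxed) + 1
    }

    fn decrement(&self) -> u32 {
        let prev = self.0.fetch_sub(1, Ordering::Relaxed);
        assert!(prev > 0, "unref() without matching reference()");
        prev - 1
    }
}

fn main() {
    let rc = RefCount(AtomicU32::new(0));
    assert_eq!(rc.increment(), 1);
    assert_eq!(rc.increment(), 2);
    assert_eq!(rc.decrement(), 1);
    assert_eq!(rc.decrement(), 0);
}
```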
src/trampoline.rs
Outdated
```rust
/// _Not idempotent_
pub(crate) fn reference(&mut self, env: Env) {
```
Nitpick: Maybe we should make this more obviously distinct from the idempotent nature of `ref` by calling it `increment_reference_count` or something.
src/trampoline.rs
Outdated
```rust
    }
}

// Monomorphized trampoline funciton for calling the user provided closure
```
Typo: function
src/event/event_queue.rs
Outdated
```rust
    tsfn,
    has_ref: true,
}
Self::with_trampoline(cx, trampoline)
```
`new` returns a referenced trampoline, but `with_trampoline` expects an unreferenced one. I think the best answer here is to have `with_trampoline` take an already-referenced trampoline, which is consistent with fresh ones and not hard to implement for the shared one. Having it take a not-already-referenced one and immediately referencing it would also work, but would mean extra calls into Node for the fresh case.
In that case it should probably be called `with_referenced_trampoline` or something so people don't forget.
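A toy model of the suggested contract: the constructor's name encodes the precondition that the trampoline is already referenced. The `referenced` flag and both constructor bodies here are hypothetical, only meant to show the shape of the API:

```rust
use std::sync::Arc;

struct Trampoline {
    referenced: bool, // stand-in for the N-API reference state
}

struct EventQueue {
    trampoline: Arc<Trampoline>,
}

impl EventQueue {
    // Fresh trampolines come back already referenced, so `new` can
    // pass straight through.
    fn new() -> Self {
        Self::with_referenced_trampoline(Arc::new(Trampoline {
            referenced: true,
        }))
    }

    // The name reminds callers of the shared path to reference first.
    fn with_referenced_trampoline(trampoline: Arc<Trampoline>) -> Self {
        debug_assert!(trampoline.referenced, "caller must reference first");
        EventQueue { trampoline }
    }
}

fn main() {
    let q = EventQueue::new();
    assert!(q.trampoline.referenced);
}
```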
@jrose-signal I believe I've addressed your feedback except for one suggestion. PTAL
👍
src/context/mod.rs
Outdated
```rust
let shared_trampoline = {
    let shared_trampoline = ThreadsafeTrampoline::new(self.env());
    shared_trampoline.unref(self.env().to_raw());
    shared_trampoline
};
```
This is still a little less efficient but it's harder to get wrong, so maybe that's worth it.
The latest commit on this PR actually fixes a bug, but it doesn't look like it could be reproduced without the proposed changes, so I'm not sure if we should split this commit into a separate PR or handle it here as well. Any suggestions are welcome!
That last bug looks interesting! It seems like it might be a bug in Node/N-API. I'll see if I can make a minimal non-Rust example and open an issue upstream.

Opened a separate PR to track it: #744
crates/neon-runtime/src/napi/tsfn.rs
Outdated
```rust
// @@ -167,6 +179,12 @@ impl<T: Send + 'static> ThreadsafeFunction<T> {

impl<T> Drop for ThreadsafeFunction<T> {
    fn drop(&mut self) {
        // tsfn was already finalized by `Environment::CleanupHandles()` in
        // Node.js
        if *self.is_finalized.lock().unwrap() {
```
I'm a bit worried about the cases where we might end up running `release` concurrently to the `finalize` callback. Maybe this has to be fixed in Node.js instead...
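For reference, a minimal model of the guard under discussion. The `released` flag stands in for the actual `napi_release_threadsafe_function` call, and this sketch deliberately does not model the cross-thread race itself, only the skip-if-finalized logic:

```rust
use std::sync::{Arc, Mutex};

struct ThreadsafeFunction {
    is_finalized: Arc<Mutex<bool>>,
    released: Arc<Mutex<bool>>,
}

impl Drop for ThreadsafeFunction {
    fn drop(&mut self) {
        // tsfn was already finalized by `Environment::CleanupHandles()`
        // in Node.js; releasing it again would touch freed state.
        if *self.is_finalized.lock().unwrap() {
            return;
        }
        // Stand-in for the napi_release_threadsafe_function call.
        *self.released.lock().unwrap() = true;
    }
}

fn main() {
    let finalized = Arc::new(Mutex::new(true));
    let released = Arc::new(Mutex::new(false));
    drop(ThreadsafeFunction {
        is_finalized: Arc::clone(&finalized),
        released: Arc::clone(&released),
    });
    // The guard prevented the release call because Node had finalized.
    assert!(!*released.lock().unwrap());
}
```

The worry above is that `finalize` could flip the flag while another thread is already inside `drop`, which a flag check alone cannot rule out.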
Force-pushed from c22f932 to 74a21cb.
I like the way this API is headed. One thing that I'm still not sure about is the asymmetry. Any ideas of how to make this more clear? I have a couple thoughts:
@kjvalencik I like the …
@indutny Don't feel the need to jump on an implementation. Right now I'm trying to get a discussion going. I don't have a clear picture of what the API should be. But, I also don't want to bikeshed too long because this optimization is very cool! 🚲 🏠

To confirm, nothing actionable for me right now? If yes, what's the plan for this PR?

Correct, nothing actionable. I'm going to think about the API a bit more and talk with @dherman while working on getting the other changes released. Thanks!
I like the idea of using … In #734 we experimented with renaming …

One more thought: I think just the one …
@indutny Thanks again so much for this PR. I spent a good amount of time thinking about this and also discussing it with @dherman. For both of us, what makes this all fit is … There are a few changes I would like to make to this PR:

I know this is asking a lot and you have already been incredibly generous with your time, so I am happy to take over this PR if you would like me to, but I can also help guide it through if you prefer. Cheers!

@kjvalencik I'll happily look into it. Will let you know how it goes!
Force-pushed from 74a21cb to 3b8ec20.
I performed some benchmarks with this and unfortunately it performs worse in most scenarios and only marginally better in the ideal situation. (I didn't test in Electron, but I suspect it's still a large improvement there because of the very high cost of synchronizing event loops at each tick.) Poking around a bit, I think the primary issue is acquiring a lock to perform … Unfortunately, I think the best fix is a more complicated solution that uses an atomic for reference counting instead of a …
Force-pushed from 0a662b0 to 4780c3b.
src/event/event_queue.rs
Outdated
```rust
}

impl ChannelState {
    fn reference<'a, C: Context<'a>>(self: &Self, cx: &mut C) {
```
Nitpick: `self: &Self` is idiomatically written `&self`.
Ack!
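For completeness, the two spellings are equivalent; a minimal illustration with hypothetical methods (the string return values exist only so the equivalence is checkable):

```rust
struct ChannelState;

impl ChannelState {
    // The explicit form from the diff...
    fn reference_explicit(self: &Self) -> &'static str {
        "referenced"
    }

    // ...and the idiomatic shorthand; the compiler treats them identically.
    fn reference(&self) -> &'static str {
        "referenced"
    }
}

fn main() {
    let state = ChannelState;
    assert_eq!(state.reference(), state.reference_explicit());
}
```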
Node.js optimizes subsequent ThreadsafeFunction invocations to happen
during the same event loop tick, but only if the same instance of
ThreadsafeFunction is used. The performance improvement is most
noticeable when used in Electron, because scheduling a new UV tick in
Electron is very costly.
With this change EventQueue will use an
existing instance of ThreadsafeTrampoline (wrapper around
ThreadsafeFunction) if compiled with the napi-6 feature, or it will fall back
to creating a new ThreadsafeFunction per EventQueue instance.
Fix: #727