How to know if all the receivers have been dropped? #61
Dropping receivers doesn't close the channel anymore, but one can still relatively easily implement the old interface where it does. See this for an example: https://github.com/crossbeam-rs/crossbeam-channel/blob/master/examples/mpsc.rs |
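For reference, `std::sync::mpsc` (which that example emulates) reports dropped receivers directly from `send` — a minimal runnable sketch of the old interface's behavior:

```rust
use std::sync::mpsc;

fn main() {
    let (tx, rx) = mpsc::channel::<i32>();
    tx.send(1).unwrap(); // succeeds while a receiver exists
    drop(rx); // all receivers are now gone

    // `send` fails and hands the message back inside the error.
    let err = tx.send(42).unwrap_err();
    assert_eq!(err.0, 42);
    println!("send failed: all receivers were dropped");
}
```

The error carries the unsent message, so the caller can retry, log, or discard it explicitly.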
That's not a trivial amount of work you linked to, and I have this use case too. I'm writing a program with one main thread that reads from a receiver and several worker threads that send messages to it; the workers should exit if the main thread terminates and drops its receiver at any time while they're blocked on std/mpsc send operations. I'd like to use crossbeam_channel instead because it's a better library... except for this one feature that I'd have to reimplement myself. (Aborting the program when the main thread exits wouldn't be enough, because my unit tests would block without this behavior.) Since we have more than one use case for this feature, it looks to me like it belongs in this library rather than at its call sites. |
@Flaise How about the following solution?

```rust
// Spawns a thread and returns a channel that signals its termination.
fn spawn(f: impl FnOnce()) -> Receiver<()> {
    let (s, r) = channel::bounded(0);
    thread::spawn(move || {
        f();
        drop(s);
    });
    r
}

// The channel for sending messages.
let (s, r) = channel::unbounded();

// Create the main thread.
let done = spawn(move || {
    loop {
        let msg = r.recv();
        // Do something...
    }
});

// Create worker threads.
for _ in 0..WORKER {
    let s = s.clone();
    let done = done.clone();
    spawn(move || {
        loop {
            let msg = produce_message();
            select! {
                send(s, msg) => {}
                recv(done) => break,
            }
        }
    });
}
```
|
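The termination-signal trick above can be exercised with the standard library too. Here is a runnable sketch using `std::sync::mpsc` (`spawn_with_done` is a hypothetical helper name; note that `mpsc` receivers can't be cloned, so this only covers the single-waiter case, not the `select!` over two channels):

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper mirroring `spawn` above: returns a channel that
// disconnects when the spawned closure finishes.
fn spawn_with_done(f: impl FnOnce() + Send + 'static) -> mpsc::Receiver<()> {
    let (done_tx, done_rx) = mpsc::channel::<()>();
    thread::spawn(move || {
        f();
        drop(done_tx); // dropping the sender disconnects `done_rx`
    });
    done_rx
}

fn main() {
    let done = spawn_with_done(|| {
        // The thread's work would go here.
    });
    // `recv` returns Err once the sender is dropped, i.e. the thread is done.
    assert!(done.recv().is_err());
    println!("thread finished");
}
```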
I think this is a very common situation that needs to be detected. And if all receivers are dropped, and sender keeps sending messages, the program may OOM, which is not obvious to tell and debug. |
Duplicate issue: #55

Pinging people who might be interested: @vorner @jdm @SimonSapin @BurntSushi @glaebhoerl @danburkert @matthieu-m @coder543

Here's a suggestion on how we might solve this issue. Several people have already expressed interest in being able to detect whether all receivers have been dropped during `send`. Currently, `send` gives no indication of that. If we change the behavior of `send` itself, that would be a breaking change for everyone. Instead, I propose we simply add a new method to `Sender`:

```rust
impl<T> Sender<T> {
    // Just like `send`, except it checks for dropped receivers.
    // Might block if the channel is full.
    pub fn checked_send(&self, msg: T) -> Result<(), NoReceiversError<T>>;
}

struct NoReceiversError<T>(pub T);
```

How does everyone feel about this? Would that solve the problem?

Alternatively, the method could be non-blocking:

```rust
impl<T> Sender<T> {
    // Just like `send`, but never blocks and checks for dropped receivers.
    pub fn checked_send(&self, msg: T) -> Result<(), CheckedSendError<T>>;
}

enum CheckedSendError<T> {
    Full(T),
    NoReceivers(T),
}
```

I'm personally leaning towards the first suggestion. Bikesheddable names:
|
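The first variant's semantics can be sketched on top of `std::sync::mpsc::SyncSender`, whose `send` already blocks while the channel is full and fails only on disconnection. `checked_send` and `NoReceiversError` here are just the proposed names, not a real crossbeam-channel API:

```rust
use std::sync::mpsc;

// Proposed error type: hands the unsent message back to the caller.
struct NoReceiversError<T>(pub T);

// Sketch of the proposed `checked_send`: block while the channel is full,
// fail only once all receivers have been dropped.
fn checked_send<T>(s: &mpsc::SyncSender<T>, msg: T) -> Result<(), NoReceiversError<T>> {
    s.send(msg).map_err(|mpsc::SendError(m)| NoReceiversError(m))
}

fn main() {
    let (s, r) = mpsc::sync_channel::<i32>(1);
    assert!(checked_send(&s, 1).is_ok());
    drop(r); // all receivers gone

    // Even though the channel is full, the send fails immediately
    // instead of blocking, because no receiver can ever drain it.
    match checked_send(&s, 2) {
        Err(NoReceiversError(msg)) => assert_eq!(msg, 2),
        Ok(()) => unreachable!("channel has no receivers"),
    }
}
```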
It seems to me that checking for dropped receivers is intrinsically complicated, and there are multiple issues to resolve beyond "check on `send`".

First of all, there's the problem of leaked receivers. Whether they are leaked intentionally or by accident, a leaked receiver keeps the channel open even though nothing will ever read from it.

Putting aside leaked receivers, there is still no acknowledgement of processing (by default). A sender can check that a receiver exists when it enqueues an item, but cannot guarantee that a receiver will eventually process that item: there is a data race between checking whether the receiver is still alive and the receiver's death, and it could very well die after sending was successful.

Which brings us to a third problem: even if a sender detects, upon sending, that no receiver exists any longer, all it can do is avoid enqueuing more items. What of all the items that were already queued? Should there be an API to get them back? Isn't that akin to creating a new receiver?

And of course, there is the issue mentioned above of how all this interacts with `select!`.

As usual in this case, I'd say we should listen to our MOMs (1). The concern here is reminiscent of guaranteed delivery. In order to implement guaranteed delivery, MOMs will usually provide a feature to address this issue of queued items no longer being deliverable: a dead letter queue.

In this case, I am wondering whether a dead letter callback would be suitable. The creator of the queue would be able to register a dead letter callback, a function taking the undeliverable item. The callback would be triggered on the sender thread whenever an item can no longer be delivered.

(1) Message-Oriented Middleware, such as MQ & co.

If we look at the 4 problems I gave above, a dead letter callback solves them all:
Note: the ability to do last-chance processing of queued items does not amount to guaranteed delivery; should the process crash in the middle, any in-memory item is lost. Guaranteed delivery can only be implemented with point-to-point acknowledgement between persistent stores up until the final destination (which can either store or process the item before acknowledging). Alternatives:
|
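The dead letter callback idea can be sketched as a thin wrapper over a sender. This is a minimal illustration with hypothetical names on top of `std::sync::mpsc`; like the proposal, it only catches items rejected at send time, not items already queued when the receiver dies:

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::mpsc;

// Hypothetical wrapper: hands undeliverable messages to a callback
// instead of silently dropping them.
struct DeadLetterSender<T> {
    inner: mpsc::Sender<T>,
    on_dead_letter: Box<dyn Fn(T)>,
}

impl<T> DeadLetterSender<T> {
    fn send(&self, msg: T) {
        if let Err(mpsc::SendError(msg)) = self.inner.send(msg) {
            // No receivers are left: trigger the callback on the sender thread.
            (self.on_dead_letter)(msg);
        }
    }
}

fn main() {
    let dead = Rc::new(Cell::new(0));
    let (tx, rx) = mpsc::channel::<i32>();
    let d = dead.clone();
    let s = DeadLetterSender {
        inner: tx,
        on_dead_letter: Box::new(move |_| d.set(d.get() + 1)),
    };

    s.send(1); // delivered normally
    drop(rx); // all receivers gone
    s.send(2); // dead-lettered
    assert_eq!(dead.get(), 1);
    println!("dead letters: {}", dead.get());
}
```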
My motivation for bringing the issue up was mostly accidental subtle bugs that happen in rare cases. In other words, protection against not handling a situation you don't know you should handle. In that light, I don't think adding another sending method ‒ one that is more complex and that people will probably not be interested in ‒ solves my worry: people who don't know about the potential problem won't look for the method and won't use it, and will still get the subtle bugs. That being said, the method does look useful for use cases where you already know about the possibility of lost receivers. |
Yeah, I think my position is mostly unchanged here. I do like @vorner's comment in that opportunistic deadlock detection might be possible, and that seems like a fine idea to me. |
Ok, here's another idea. Maybe we could go back to the old interface for the `send` and `recv` methods, where dropping receivers closes the channel:

```rust
impl<T> Sender<T> {
    fn send(&self, msg: T) -> Result<(), SendError<T>>;
    fn try_send(&self, msg: T) -> Result<(), TrySendError<T>>;
}

impl<T> Receiver<T> {
    fn recv(&self) -> Result<T, RecvError>;
    fn try_recv(&self) -> Result<T, TryRecvError>;
}
```

Now the question is what happens in `select!`:

```rust
let r: Receiver<T> = ...;
let rs: Vec<Receiver<T>> = ...;
let s: Sender<T> = ...;
let ss: Vec<Sender<T>> = ...;

select! {
    res = recv(r) => match res {
        Ok(msg) => println!("received {}", msg),
        Err(_) => println!("channel closed"),
    }
    (res, r) = recv(rs) => match res {
        Ok(msg) => println!("received {} from {:?}", msg, r),
        Err(_) => println!("channel {:?} is closed", r),
    }
    res = send(s, foo) => match res {
        Ok(()) => println!("sent message"),
        Err(msg) => println!("couldn't send {} because the channel is closed", msg),
    }
    (res, s) = send(ss, foo) => match res {
        Ok(()) => println!("sent message into {:?}", s),
        Err(msg) => println!("couldn't send {} into {:?} because it is closed", msg, s),
    }
}
```

But you don't have to destructure the results if you don't want to:

```rust
select! {
    msg = recv(r) => println!("recv result: {:?}", msg),
    res = send(s, "message") => res.unwrap(),
}
```

This is not too verbose and seems pretty intuitive. What do you think? |
Yes, I think this looks mostly good ‒ at least to me :-). There are probably some questions to answer, though. Let's say there are two channels I want to send into, one of them lost all of its receivers, the other one is ready to accept. Are they both considered ready? |
Yes, they are both ready, and a random one of those two operations will be completed. Think about it this way: a send into a closed channel is an operation that completes immediately, just with an error result.

If you'd like to skip operations that send into closed channels, you can achieve that as follows:

```rust
let mut s1 = Some(s1);
let mut s2 = Some(s2);

// Note that `Option<Sender<T>>` implements `IntoIterator<Item = Sender<T>>`.
loop {
    select! {
        (res, _) = send(s1, msg) => match res {
            Ok(()) => break,
            Err(_) => s1 = None,
        }
        (res, _) = send(s2, msg) => match res {
            Ok(()) => break,
            Err(_) => s2 = None,
        }
    }
}
```

This is a common idiom in Go. |
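The same disable-on-error idiom works outside `select!` as well. A runnable sketch with `std::sync::mpsc`, where a failed send marks that channel as closed by setting its slot to `None`:

```rust
use std::sync::mpsc;

fn main() {
    let (tx1, rx1) = mpsc::channel::<&str>();
    let (tx2, rx2) = mpsc::channel::<&str>();
    drop(rx2); // the second channel is already closed

    let mut senders = vec![Some(tx1), Some(tx2)];

    // Try each live sender; disable the ones whose receivers are gone.
    for slot in senders.iter_mut() {
        if let Some(s) = slot {
            if s.send("msg").is_err() {
                *slot = None; // stop using a closed channel
            }
        }
    }

    assert!(senders[0].is_some());
    assert!(senders[1].is_none());
    assert_eq!(rx1.recv().unwrap(), "msg");
}
```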
@BurntSushi What do you think, is this a solution that might make everyone happy? |
@stjepang I think the change is a bit hard for me to evaluate. Do existing uses of `send` and `recv` need to change? Popping up a level, I remain skeptical here w.r.t. bidirectional flow. I feel like our conclusion from when we last dove into this was that unidirectional flow has a lot of value, even if it means that folks need to restructure their programs. |
The breaking change is how `select!` reports closed channels:

```rust
select! {
    // Equivalent to: let res = Receiver::recv(r)
    res = recv(r) => {}
    // Equivalent to: let res = Sender::send(s, msg)
    res = send(s, msg) => {}
}
```

But code that ignores the results should keep working as before. Also, as a slight digression, I dislike the current behavior in which `send` never reports dropped receivers.
So here's what happened while we were porting Servo to `crossbeam-channel`. There's a test where a receiving thread panics (thus dropping the receiver) and we want to propagate the error to the sending thread and fail with a stacktrace. Unfortunately, this is not possible when `send` cannot fail. @jdm was rightfully unhappy with the current situation, so they decided to wrap the channels in a layer that detects disconnection.

This made me very confused for a while. Something was wrong about unidirectionality in `crossbeam-channel`.

In Go, unbounded channels are a highly discouraged antipattern. If Servo were written in Go and used bounded channels, a panicked receiver thread would result in the sender thread filling up the channel and then deadlocking. The Go runtime would detect the deadlock and print a nice stacktrace. Memory wouldn't be leaked indefinitely.

In Rust, we prefer unbounded channels for some reason. Our solution for panicked receiver threads is forbidding sending messages into channels without receivers. And we don't have a runtime that detects deadlocks either.

If you forbid unbounded channels and have a runtime that detects deadlocks, then unidirectionality works wonderfully. For that reason, I think it's the right choice for Go. But if you allow unbounded channels or can't detect deadlocks, then you absolutely do need a different mechanism for detecting memory leaks.

It's also important to note that Rust has destructors while Go doesn't (channels must be closed explicitly). Different languages, different idioms. |
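The Servo scenario — propagating a receiver thread's panic to the sender — is straightforward once `send` can fail. A sketch with `std::sync::mpsc`, where the panic unwinds, drops the receiver, and the sender notices:

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // Silence the default panic message; we only care about detecting it.
    std::panic::set_hook(Box::new(|_| {}));

    let (tx, rx) = mpsc::channel::<i32>();
    let handle = thread::spawn(move || {
        let _first = rx.recv().unwrap();
        panic!("receiver thread failed"); // unwinds and drops `rx`
    });

    tx.send(1).unwrap();
    // Keep sending; once the receiver is gone, `send` starts failing
    // instead of queueing messages (and leaking memory) forever.
    while tx.send(2).is_ok() {}

    // Surface the receiver's failure on the sending side.
    assert!(handle.join().is_err());
    println!("receiver panicked; sender noticed via SendError");
}
```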
Just as a bike-shedding point. That
|
This is an insightful observation! Thanks for pointing it out! I agree that the failure modes of unbounded channels aren't great here. However, I'd like to push back just a bit. In particular, can we nudge folks to use bounded channels instead? That is, instead of improving unbounded channels at the cost of more API complexity for all channels, we could instead declare that poor failure modes are a property of unbounded channels and in order to get better failure modes, one should use bounded channels instead. |
I personally lean towards the "don't use unbounded channels" camp, but the issue is far from clear-cut. There was a big debate on bounded vs unbounded channels on Servo's mailing list, and they ultimately decided to go with unbounded channels. However, there seems to be a consistent pattern:
Servo's constellation is more reminiscent of actors and, unsurprisingly, uses unbounded channels. It's interesting to note that even though actor-based systems could benefit from bounded channels (backpressure), they still choose not to use them (at least not by default). I'm not an expert in the field and can't explain exactly why, but there must be a deeper reason for it. |
I'm doing exactly that in Kompact.
Because proper actor systems (unlike Actix) have ad-hoc connection semantics, where references can be passed around arbitrarily like any data, and everyone can send to anyone they hold a reference to. Typically, actor systems are meant to be highly dynamic in structure, with actors constantly being created and dropped, forming new connection topologies based on their internal semantics. Since keeping track of references in a system like this is essentially impossible, preventing deadlocks in a backpressured system would be a major headache. That is why Akka now has its Akka Streams system, where you can do properly typed and backpressured executions, but you are essentially building dataflow graphs explicitly, so that you can see what is connected and thus (hopefully) prevent deadlocks. |
I am happy to see this being considered again. There are tangible benefits to providing such an API. AFAIK there's no feasible way of providing completely foolproof detection of absent receivers, at least not without runtime support, and that's ok. I am super late to the discussion, but the concerns I could parse seem mostly arguable/opinionated (IMHO). So, are there significant drawbacks to providing this? |
Please do!
Nothing significant. The biggest question is "how does this change interact with `select!`?" |
I agree. Even if users overlook that, it's not like it's worse than sending messages into the void and/or running out of memory. |
I haven't been tracking the exact API changes in question here, but doesn't this mean all `send` calls will now return a `Result` that has to be handled? |
That will definitely happen in some cases, especially if the user considers consumers going away before the producers to be a bug. We can bucket the use cases into 4 options:
I guess the proposal here is to remove 1 while enabling 2, 3 and 4. |
Again, to be clear, |
I'm probably missing something. What's changing for the bounded case? |
@arthurprs |
Ahh, I got confused for a minute. Doesn't that only show the bounded-queue side of the problem, where you need to use `select!` with a timeout to protect against gone receivers? Which in turn is a lot more verbose than a `send` that simply returns an error. |
While bounded channels don't suffer from OOM problems, they are still prone to deadlocks. Currently, a send into a full channel whose receivers have all been dropped simply blocks forever. So I would say this change benefits both unbounded and bounded channels. Yes, the `send` method will now return a `Result`. |
So if y'all agree, I'd like to do a 180-degree turn and revert the behavior back to what we had in version 0.1: dropping all receivers closes the channel. The interface would look roughly like this:

```rust
impl<T> Sender<T> {
    fn send(&self, msg: T) -> Result<(), SendError<T>>;
    fn try_send(&self, msg: T) -> Result<(), TrySendError<T>>;
    // ...
}

impl<T> Receiver<T> {
    fn recv(&self) -> Result<T, RecvError>;
    fn try_recv(&self) -> Result<T, TryRecvError>;
    // ...
}

select! {
    recv(r) -> res => {}       // res: Result<T, RecvError>
    send(s, msg) -> res => {}  // res: Result<(), SendError<T>>
    default => {}
}
```
|
I have begun work that will change the behavior so that dropping all receivers disconnects the channel. Since this is a big breaking change, the next release will be 0.3.0. PR: #106 |
After another big PR, I've published v0.3 that fixes this issue. Notable changes:
Thanks all for your insights! It might not seem like much, but the progress of this crate really wouldn't be possible without your feedback! <3 Finally, I'd just like to add that this release feels like a huge improvement to me. Version 0.1 had that silly |
Thank you for taking the feedback; it has been a pleasure to watch the back and forth 👍 Now to find time to test it in my program 😅 |
How to know if all the receivers have been dropped, now that the send function no longer returns anything?
crossbeam-channel/src/internal/channel.rs
Lines 299 to 305 in 5050c89