New signal for changed Mutable, non-blocking functions to obtain guards #77

Open · wants to merge 1 commit into master
Conversation

@A-Manning commented Oct 24, 2023

MutableSignalRef, MutableSignal, and MutableSignalCloned all attempt to obtain a read lock on the underlying RwLock when polled. This can block the thread that is polling the signal, if the RwLock is locked for writes.

This may not be desirable behavior. This PR introduces a new signal, ChangedSignal, which emits (). It does not obtain a read lock when polled, and therefore will not block when polled.

This PR also introduces non-blocking variants of lock_ref and lock_mut (try_lock_ref and try_lock_mut), similar to the try_read and try_write methods exposed by the standard library RwLock.
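
For illustration, a rough sketch of how the new methods would be used (assuming, per the description above, that they mirror the standard library's try_read/try_write and return a Result):

use futures_signals::signal::Mutable;

fn read_if_free(mutable: &Mutable<u32>) {
    // Proposed API (not yet in the crate): returns immediately with
    // Ok(read guard) if no writer holds the lock, Err otherwise.
    match mutable.try_lock_ref() {
        Ok(guard) => println!("current value: {}", *guard),
        Err(_) => println!("write in progress, not blocking"),
    }
}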

ChangedSignal can be used to implement more sophisticated signals that give the user the option of reading the underlying value if desired. For example:

use futures_signals::signal::{ReadOnlyMutable, Signal, SignalExt};

fn choice_to_read<T>(read_only: &ReadOnlyMutable<T>) ->
    impl Signal<Item = ReadOnlyMutable<T>> {
    read_only.signal_changed().map({
        let read_only = read_only.clone();
        move |()| read_only.clone()
    })
}

// Read immediately, or ignore the signal
fn fresh_or_ignore<T>(read_only: &ReadOnlyMutable<T>) ->
    impl Signal<Item = Option<String>> {
    choice_to_read(read_only).filter_map(|read_only| {
        read_only.try_lock_ref().map(|_read_lock| {
            "Obtained a read lock".to_string()
        }).ok()
    })
}

The signal in the choice_to_read example would be emitted whenever there is a change to the underlying value. The consumer of the signal is in control of when the value is read.
In fresh_or_ignore, a value is emitted whenever the underlying value is updated, if and only if it is available to read when polled.

@Pauan (Owner) commented Oct 24, 2023

What's the use case for those Signals? Let's suppose you do something like this...

choice_to_read(mutable).map(|mutable| {
    // ...
})

Multi-threading operations can happen at any time.

So in between the time that choice_to_read returns the ReadOnlyMutable<T> and the map callback runs, the ReadOnlyMutable<T> can become locked.

So choice_to_read cannot guarantee that the ReadOnlyMutable<T> is unlocked.

So the only safe non-blocking option is the try_lock_* methods.

The consumer of the signal is in control of when the value is read.

That is already the case, because futures-signals is a hybrid push-pull system, which means that if the consumer doesn't poll the Signal, the Signal will do nothing.

Which means that methods like throttle completely stop all processing of the Signal while it is being throttled.
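
For example, roughly (a sketch; the one-second futures_timer::Delay is an arbitrary choice for illustration):

use std::time::Duration;
use futures_signals::signal::{Mutable, SignalExt};
use futures_timer::Delay;

async fn watch(mutable: Mutable<u32>) {
    mutable.signal()
        // While the Delay is pending, the upstream signal is not polled
        // at all, so no read lock is taken on the Mutable in that time.
        .throttle(|| Delay::new(Duration::from_secs(1)))
        .for_each(|value| async move {
            println!("value: {}", value);
        })
        .await;
}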

In fresh_or_ignore, a signal is emitted whenever a value is updated IFF it is available to read when polled.

What is the use case for that? That will just cause stale updates, where the downstream Signal thinks that it has the most up-to-date value, but it doesn't.

Even worse, those stale updates will happen randomly at unpredictable times, which would make debugging a complete nightmare.

This can block the thread that is polling the signal, if the RwLock is locked for writes.

Note that if you need guaranteed non-locking behavior, there is channel which doesn't lock.
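
A rough sketch of using channel (error handling elided):

use futures_signals::signal::{channel, SignalExt};

async fn example() {
    // channel doesn't lock: the Receiver is itself a Signal that always
    // yields the most recent value sent.
    let (sender, receiver) = channel(0);

    // send only fails if the Receiver has been dropped.
    let _ = sender.send(5);

    receiver.for_each(|value| async move {
        println!("latest value: {}", value);
    }).await;
}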

It's quite rare to actually need that behavior; in almost every program it's completely fine to lock. Locking is completely normal and very common (and quite fast!).

A locking update takes ~5.6 nanoseconds, compared to ~1.6 nanoseconds for an atomic operation:

#56 (comment)

For reference, 5.6 nanoseconds is 0.000000056 seconds. If a lock is the bottleneck in your program, there's probably something very wrong.

The only time you might care about locking is if you have a hard realtime program which must make strict guarantees about latency. This is not common, and in that case you're most likely not using Signals or Futures in the first place.

And for Signals in particular, the locking is only for a very short duration. If you're locking for a long time... don't do that, that's not good practice in general.

So it would help if you explained why you're trying to avoid locks.

@A-Manning (Author) commented

Thank you for taking the time to review and respond to this PR.

Multi-threading operations can happen at any time.
So in between the time that the choice_to_read returns the ReadOnlyMutable and the map, the ReadOnlyMutable can be locked.

It is ok for the ReadOnlyMutable<T> to be locked before, during, and after such an interruption in my example. It might not be ok with the existing signals: since a read lock is obtained on polling, a lock would also be held throughout that interruption if the existing signals from Mutable were used. In my example, no read lock would be obtained or held by the signal poller throughout that interruption, and other threads would be able to write-lock if necessary. If a user wants a notification that a mutable value has changed, but does not want to obtain a read lock or potentially block on poll, this is a solution.

So choice_to_read cannot guarantee that the ReadOnlyMutable is unlocked.

That is correct. It could be combined with the try_lock_* methods if non-blocking guarantees are desired. There is no guarantee that the ReadOnlyMutable is unlocked when polling.

That is already the case, because futures-signals is a hybrid push-pull system, that means if the consumer doesn't poll the Signal, the Signal will do nothing.

How would one use existing or custom signal adapters to get a notification that a Mutable value has changed, without ever obtaining at least a read lock on the Mutable on poll? An adapter would have to poll e.g. a MutableSignalRef at least once in order to set a waker, so there would be at least one attempt to obtain a lock. Without polling the MutableSignalRef, there is also no way to distinguish the reason that the signal is being polled: even if you could know that a poll is due to a waker call, there is no way to tell whether the waker was called due to an update, or because all Mutables referencing the value have been dropped.

Which means that methods like throttle completely stop all processing of the Signal while it is being throttled.

throttle would still have to use the existing signals from Mutable, which all attempt to obtain at least a read-lock on poll. How would one use throttle to get notified of changes to a Mutable without attempting to obtain a read-lock on the Mutable?

What is the use case for that? That will just cause stale updates, where the downstream Signal thinks that it has the most up-to-date value, but it doesn't.
Even worse, those stale updates will happen randomly at unpredictable times, which would make debugging a complete nightmare.

This is correct. filter_map was the wrong choice for an example; it would have been better to use map, so that the consumer is always aware that a value has changed, as sketched below.
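
A corrected sketch using map, reusing choice_to_read from the PR description (fresh_or_none is a placeholder name; None then means "the value changed, but was write-locked at poll time"):

use futures_signals::signal::{ReadOnlyMutable, Signal, SignalExt};

fn fresh_or_none<T>(read_only: &ReadOnlyMutable<T>) ->
    impl Signal<Item = Option<String>> {
    choice_to_read(read_only).map(|read_only| {
        // Unlike filter_map, every change is surfaced; None still tells
        // the consumer that an update happened.
        read_only.try_lock_ref().map(|_read_lock| {
            "Obtained a read lock".to_string()
        }).ok()
    })
}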

Note that if you need guaranteed non-locking behavior, there is channel which doesn't lock.

A solution involving Channel would be quite unwieldy compared to the same solution using Mutable directly.

It's quite rare to actually need that behavior, in almost every program it's completely fine to lock. Locking is completely normal and very common (and quite fast!).
A locking update takes ~5.6 nanoseconds, compared to ~1.6 nanoseconds for an atomic operation:

...if the lock is not being held. If write locks are held for nontrivial periods of time, and there are many readers, all of those reader threads are going to block until the write lock is released.

The only time you might care about locking is if you have a hard realtime program which must make strict guarantees about latency. This is not common, and in that case you're most likely not using Signals or Futures in the first place.
And for Signals in particular, the locking is only for a very short duration. If you're locking for a long time... don't do that, that's not good practice in general.
So it would help if you explained why you're trying to avoid locks.

Locking and blocking are different issues. Mutable's signals generally obtain locks quickly and do not hold them for long, unless the user has chosen to do so. However, Mutable's signals can block when polled if the Mutable is write-locked. In the fairly ordinary case that there are many derived signals for a single mutable value, this becomes an issue when write locks are held for longer periods of time.

Using an async RwLock like Tokio's would at least allow other tasks to make progress while waiting for a Mutable to be unlocked; this is slightly better, since only the current task is blocked, as opposed to the current thread. But we don't have to potentially block anything on poll at all just to get a notification that a value has changed. The functions introduced in this PR give the user more control over when, or if, they wish to acquire a lock, and whether they want to potentially block when doing so. Even if this crate were to support async locking (perhaps I will make another PR for this), it would still be worth supporting non-blocking ways to acquire those locks, and giving the user a choice about blocking on poll. Tokio also supports try_read/try_write for this reason.
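
For reference, a rough sketch of Tokio's non-blocking variant:

use tokio::sync::RwLock;

fn peek(lock: &RwLock<u32>) {
    // try_read never suspends or blocks: it returns Err immediately
    // if the lock cannot be acquired right now.
    match lock.try_read() {
        Ok(guard) => println!("value: {}", *guard),
        Err(_) => println!("locked, skipping"),
    }
}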

@Pauan (Owner) commented Oct 24, 2023

In my example, no read-lock would be obtained or held by the signal poller throughout that interruption, and other threads would be able to write-lock if necessary.

That is true; however, it also doesn't give you the value; it only lets you know that the Mutable was updated at some point in the past. In order to obtain the value you would need to lock.

So what use case do you have where you only need to know that a Mutable was updated, and you don't care about its value at all?

It could be combined with the try_lock_* methods if non-blocking guarantees are desired.

And then what will you do if the try_lock fails? Nothing? I'm having a hard time envisioning a situation where that behavior is correct.

throttle would still have to use the existing signals from Mutable, which all attempt to obtain at least a read-lock on poll.

Yes, but my point is that during the throttle the Signal won't be polled at all, and so there won't be any read lock while it is being throttled.

How would one use throttle to get notified of changes to a Mutable without attempting to obtain a read-lock on the Mutable?

You can't; I was giving it as an example of how Signals are consumer-driven, and so there are many situations where a Signal won't lock, because the consumer didn't poll it.

If write locks are held for nontrivial periods of time, and there are many readers, all of those reader threads are going to block until the write lock is released.

You just shouldn't be doing that. In general, the entire Future system (including executors) assumes that locks will not be held for a long period of time. That's how the entire Future ecosystem works; it goes far beyond just futures-signals.

It's quite bad practice in general in Rust to be holding locks for a long period of time. There are better ways to architect your program.

In the fairly ordinary case that there are many derived signals for a single mutable value, this becomes an issue when write-locks are held for longer periods of time.

Could you give an example? I've never seen any reasonable code that would cause locking issues, especially with Signals.

Using an async RwLock like Tokio's would at least allow other tasks to make progress while waiting for a Mutable to be unlocked

Using an async RwLock would be a huge performance regression for something which isn't an issue in practice, and it doesn't even fully solve the problem.

Tokio also supports try_read/try_write, for this reason.

To be clear, I have no issue with adding in the try_lock* methods, but I always ask for use cases, because I am very aware of the XY problem, and most of the time there are better ways to accomplish the goal. And so I need to fully understand the use case so I can ensure that the API is the best that it can be.

@A-Manning (Author) commented

So what use case do you have where you only need to know that a Mutable was updated, and you don't care about its value at all?

There are various scenarios in which information that a change has taken place is sufficient to trigger some kind of action.

  • A Mutable might hold information regarding bids in an auction. Knowing that the Mutable has changed is enough to know that at least one person has outbid the previous leading bid (the new leading bidder could be the same as the previous one, if several updates occurred before polling). This may be enough information to trigger a speculative bid, even without knowing the current price. In the final stages of an auction, it is conceivable that there are a lot of new bids, which might keep a Mutable write-blocked. One might even use a Mutable in conjunction with an Arc<AtomicU64> etc. to make the bid count available to the signal consumer without ever blocking (see the sketch after this list).
  • A Mutable might hold a copy of a central limit order book. The updates to this structure may be frequent and irregular. Suppose the consumer of the signal is interested in some orders near the top of the order book, and is also subscribed to a price index. Knowing that the order book has changed, in conjunction with a recent index price, may be enough to indicate whether the orders of interest have been filled, or are no longer near the top of the book. The consumer may choose to cancel or update their orders, or create new ones.
  • A Mutable might hold the latest proposed block header in a cryptocurrency protocol. Knowing that a new block header exists, even without knowing the block contents or header data, may be enough to speculatively clear some mempool transactions (in particular, in conjunction with an Arc<AtomicU64> to know the block height) and begin building a new block.
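
A rough sketch of the atomic-counter pattern from the first bullet (the names are illustrative, and signal_changed is the signal proposed in this PR):

use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};
use futures_signals::signal::{Mutable, Signal, SignalExt};

// Writers increment bid_count before updating the Mutable; consumers can
// then learn the bid count on every change without ever taking the lock.
fn bid_count_signal(
    bids: &Mutable<Vec<u64>>,
    bid_count: Arc<AtomicU64>,
) -> impl Signal<Item = u64> {
    bids.read_only().signal_changed().map(move |()| {
        // An atomic load never blocks, even while the Vec is
        // write-locked by a bidder.
        bid_count.load(Ordering::Relaxed)
    })
}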

And then what will you do if the try_lock fails? Nothing? I'm having a hard time envisioning a situation where that behavior is correct.

Some actions might be available even with only the limited information that a change has occurred. Atomics might give additional information without locking.
One could also yield to the underlying executor; once the write in progress completes, the waker will be called, and the signal should be polled again. This may or may not be more efficient than blocking on acquiring a read-lock, depending on how long the write-lock is held for.
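
A task-level sketch of that idea, combining this PR's signal_changed and try_lock_ref with Tokio's yield_now (the retry loop is illustrative; an adapter could instead return Pending from its own poll):

use futures_signals::signal::{ReadOnlyMutable, SignalExt};

async fn consume(read_only: ReadOnlyMutable<u64>) {
    let reader = read_only.clone();
    read_only.signal_changed().for_each(move |()| {
        let reader = reader.clone();
        async move {
            // Try the lock without blocking; if a writer holds it, yield
            // to the executor and retry instead of blocking the thread.
            loop {
                if let Ok(guard) = reader.try_lock_ref() {
                    println!("value: {}", *guard);
                    break;
                }
                tokio::task::yield_now().await;
            }
        }
    }).await;
}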

You just shouldn't be doing that. In general the entire Future system (including executors) assume that locks will not be held for a long period of time. That's how the entire Future ecosystem works, it goes far beyond just futures-signals.
It's quite bad practice in general in Rust to be holding locks for a long period of time. There are better ways to architect your program.

A long period of time is relative; one would probably consider 50 µs very reasonable, but with a handful of writers and a lot of reader tasks (~1000), lock contention becomes a serious issue. See here for an example involving mutexes, although this also applies to RwLocks in the case that there are a few writers and a lot of readers.

Using an async RwLock would be a huge performance regression for something which isn't an issue in practice, and it doesn't even fully solve the problem.

Agreed, it should not be the default and should be a separate struct altogether, for the rare cases in which async guards are required. Async waiting for unlock (at least in the context of signals) can be done by using try_lock_* and yielding to the underlying executor, if desired.

@Pauan (Owner) commented Oct 25, 2023

There are various scenarios in which information that a change has taken place is sufficient to trigger some kind of action.

That's fine and all, but what's your use case?

I generally don't add in features based on hypotheticals, because it's always possible to invent hypothetical situations, and so if I add in features just because it might be useful in some obscure situation, that leads to bloat and an infinite increase of features.

One could also yield to the underlying executor; once the write in progress completes, the waker will be called, and the signal should be polled again.

If lock contention is genuinely an issue, it would be easy to just change the implementation of the Signal polling, fixing it directly. That would be far better than adding in new APIs to hack around the problem.

A long period of time is relative; One would probably consider 50 µs very reasonable, but with a handful of writers and a lot of reader tasks (~1000), lock contention becomes a serious issue.

Yes, but thankfully the Signal system is designed so that lock contention is minimized. This is the flow of operations:

  1. You acquire a write lock on the Mutable.
  2. You mutate the value of the Mutable.
  3. The Mutable loops over the dependent Signals and wakes them up.
  4. The write lock is released.
  5. The dependent Signals are polled.
  6. The dependent Signals acquire a read lock, read the value, and then release the lock.

Because the polling of the dependent Signals happens after the write lock is released, under typical situations there shouldn't be any lock contention at all.

Of course you can create contrived situations where contention does happen, but it should be quite rare in practice, especially because the Signal system is designed to have very fine-grained mutability.

futures-signals is designed to have lots and lots of Mutables, in general 1 Mutable per field. So a single struct might contain a dozen Mutables. And those Mutables can then contain sub-Mutables, which can contain sub-Mutables, etc. This spreads out the locks so that contention doesn't happen.
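
A rough sketch of that fine-grained style:

use futures_signals::signal::Mutable;

// One Mutable per field: writers lock only the field they change, and
// dependent Signals contend only on the field they actually read.
struct Player {
    name: Mutable<String>,
    score: Mutable<u32>,
    health: Mutable<u32>,
}

fn add_points(player: &Player, points: u32) {
    // Locks only `score`; signals on `name` or `health` are unaffected.
    player.score.replace_with(|score| *score + points);
}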

So instead of linking to a thread which talks about hypothetical lock contention (unrelated to futures-signals), could you please explain your situation where you are seeing issues with locks?
