Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New concurrency primitives #1006

Merged
merged 24 commits into from
Dec 4, 2017

Conversation

SystemFw
Copy link
Collaborator

@SystemFw SystemFw commented Nov 28, 2017

This is a first sketch of a Ref that doesn't use Actor. I'm only running a very very simple benchmark, which shows performance to be identical to the Actor version. I think it can be a good starting point, so I reckon having a PR will facilitate discussion.

This PR introduces a new concurrency scheme based on Ref + Promise .
It offers a more orthogonal api which separates concurrent state and synchronisation, a simpler implementation (no Actor), and greatly improved performance (on a ballpark benchmark).
Fixes #993

private val state = new AtomicReference[Ref.State[A]]
state.set(new Ref.State[A](null, LinkedMap.empty, 0))

//TODO in the old implementation Read and Nevermind submit the continuation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely an oversight / bug.

override def setSyncPure(a: A): F[Unit] =
F.async[Unit](cb => actor ! Msg.Set(a, () => cb(Right(())))) *> F.shift
F.async[Unit](cb => set(a, () => cb(Right(())))) *> F.shift
Copy link
Member

@mpilquist mpilquist Nov 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I just added this shift when doing the SyncRef PR. Without it a test was hanging -- I think because the subsequent actions flatmapped on to the returned F[Unit] were getting run on the actor thread (based on where the callback was being invoked). Perhaps this isn't necessary if we are consistent in only ever invoking callbacks from within an EC.

@mpilquist
Copy link
Member

Overall I really like the idea of getting rid of Actor.

I like the idea of getting rid of forking. Let's do all that stuff at the call site.

Some miscellaneous thoughts on refs:

  • Many use cases are satisfied by SyncRef
  • We need a ref-like type that starts unset for various algorithms
  • Arguably, we could benefit from a ref-like type that supports explicitly clearing its value, returning to unset state
  • Adding unset support to the actor-less Ref in this PR wouldn't be difficult

This leaves me wondering if we should rename this actor-less ref MVar and rename SyncRef to Ref.

@pchlupacek
Copy link
Contributor

just an initial comment. I really think we can do most of the complexity away by defining UnsetRef in terms of SyncRef. I have a feeling that 1:1 implementation with Actor variant won't work nicely here.

@pchlupacek
Copy link
Contributor

Ah @mpilquist excellent idea. with Ref/MVar I really like it. Lets do it!

@pchlupacek
Copy link
Contributor

Also I think we can remove F from Ref and MVar now. Do you agree?

@mpilquist
Copy link
Member

Also I think we can remove F from Ref and MVar now. Do you agree?

How so? Do you mean moving F from type param at trait level down to individual methods?

@SystemFw
Copy link
Collaborator Author

SystemFw commented Nov 28, 2017

One thing I believe this makes possible is moving the EC and the cats-effect constraint at call site.
In particular it can be weakened in several cases (to Async at least)

Ah, you said so already, github race condition

@pchlupacek
Copy link
Contributor

@mpilquist exactly. You can then use MVar/Ref as boundary between two concurrent programs in F, one will be in IO fro example the other one may be scala's Future. :-)

@mpilquist
Copy link
Member

Yep, agreed then assuming we can make it work :)

@pchlupacek
Copy link
Contributor

I think the only places where we need that async functionality is actually MVar. For Ref it is necessary only with asyncXXX methods, and these can be in fact defined with sync (is not just actually async.fork(syncXXX).

@SystemFw
Copy link
Collaborator Author

Sounds like this is worth pursuing, so I'll keep going at it :)

@mpilquist
Copy link
Member

Yeah agreed -- Ref can definitely be implemented without an F and ExecutionContext captured at trait level.

I'm not as sure about MVar since we have the async constructors and need to thread-shift callbacks (either at callback invocation site or via IO.shift on the returned actions). Interested to see though!

@SystemFw Looking forward to what you come up with. :)

@SystemFw
Copy link
Collaborator Author

Yeah I'm not even sure that an unsettable MVar carries its weight tbh. In haskell that makes it quite deadlock prone. But definitely we can improve Ref and move EC down for a start.

@pchlupacek
Copy link
Contributor

@SystemFw I would start with mVar to just have read, and perhaps not implement take at first. I agree that take is error prone and difficult to read.

@SystemFw
Copy link
Collaborator Author

MVar without take is Ref up to renaming (because without take the blocking behaviour on set that MVar has will deadlock).
I can implement a proper MVar if you prefer (and in fact when I originally looked at Ref it was for this reason), but then I realised, when looking at the concurrent data structures, that Ref works just fine and is simpler (in most cases)

@pchlupacek
Copy link
Contributor

@SystemFw up to you :-)

@SystemFw
Copy link
Collaborator Author

SystemFw commented Nov 28, 2017

So a couple of the things we might want to do, e.g. remove forked methods and leave those at call site, or move F , constraints and ECs to individual methods are blocked by the fact that we now have to conform to SyncOps RefOps.

On another note, one thing we might do is have Ref track its state (Set or Unset). When Unset the implementation will stay as it is now (or close), but when Set it could delegate to a SyncRef. That way it would be "slow" on the blocking get and first set/modify, and faster afterwards. I'd have to think what use cases will actually benefit from this though, and if it can actually be made to work.

Thoughts on either paragraph?

@mpilquist
Copy link
Member

Do you mean RefOps? Feel free to delete that.

I thought about swapping out for a SyncRef after first set but you'd still incur some synchronization on the dispatch (even if only an additional atomic reference). I'm not necessarily against it though.

@SystemFw
Copy link
Collaborator Author

Yeah you'd have to dispatch. I only want to add it (since it's going to complicate things) if we can show it's convenient from a performance perspective.

@pchlupacek
Copy link
Contributor

pchlupacek commented Nov 28, 2017

@SystemFw just a side note:-). I really liked @mpilquist idea to put any unset / async behaviour behind mVar, and leave Ref essentially just synchronous with forking. I wouldn't complicate Ref with any of the callbacks etc. Ref shall be very fast, hence it forms primitive for all async code.
On other hand mVar could be optimised only for initial Take/Put operation, as this is how we use it most of the time (i.e. signal of something being ready).

@SystemFw
Copy link
Collaborator Author

@pchlupacek I'm not sure whether you mean having three types (SyncRef, Ref, and MVar), or just renaming SyncRef to Ref and Ref to MVar.
I'm only using the old names (so I'm not calling Ref MVar for now) to avoid generating confusion in the discussion.

@pchlupacek
Copy link
Contributor

@SystemFw SyncRef => Ref, MVar as completely new type. Sorry to be not clear.

@SystemFw
Copy link
Collaborator Author

Yeah I'm not against that. I'll see how far I go along modifying the current Ref, then tackle that.
The thing I'm really not sure about is having MVar be able to be emptied or not.

@mpilquist
Copy link
Member

Yeah I'm not stuck on ability to unset an MVar (or take).

@SystemFw
Copy link
Collaborator Author

Just realised we are going in the direction of having IORef and MVar a la haskell

@mpilquist
Copy link
Member

Yep - not unintentional on my part. :)

@pchlupacek
Copy link
Contributor

yeah, thats what it is exactly :-)

@SystemFw SystemFw force-pushed the feature/ref-noActor branch from d321d01 to 5eefcc1 Compare November 30, 2017 20:20
@SystemFw SystemFw force-pushed the feature/ref-noActor branch from e0e9f2c to a2489d9 Compare November 30, 2017 23:34
@SystemFw SystemFw force-pushed the feature/ref-noActor branch 2 times, most recently from 0a5203c to 7f6a305 Compare November 30, 2017 23:48
}

ref.modify2 {
case s @ State.Set(_) => s -> F.unit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be F.raiseError instead? This way a second set is silently ignored. That way it would at least complain.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for raising error violating the semantics

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mpilquist said he prefers not failing on double set. I have added a trySetSync for cases in which you care (atm, it's just AsyncPull.race, which I will refactor a bit in a separate PR).
I don't have a strong opinion on this, I can see arguments for both.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, silently ignoring subsequent sets is the worst of both worlds. (The 2 worlds being: (1) working mutiple sets and (2) first set succeeds and subsequent sets fail.) If this behavior remains, I'd recommend renaming it to complete or even tryComplete.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be inclined to agree (but I do want set-once semantics). Let's see what Michael thinks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with set-once semantics. I don't have strong opinions on how subsequent sets are signaled. I didn't want them to fail b/c I thought it would be error prone in certain usage patterns but I suppose folks can always call .attempt and explicitly ignore the error if it's appropriate for their use case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will change that F.unit to F.raiseError

@SystemFw
Copy link
Collaborator Author

SystemFw commented Dec 1, 2017

Does anyone know why the old RefSpec is in shared but all the other concurrency tests are in jvm?
I will have the new RefSpec and PromiseSpec in jvm as well, and delete the old RefSpec, I think

EDIT: nope, the reason is explained in #1009 . So these two will go back to shared as they don't have property based tests in them.

@SystemFw
Copy link
Collaborator Author

SystemFw commented Dec 1, 2017

Re: why single set semantics for Promise.

Having multiple sets only makes sense if you can avoid overwriting a previous value that you know the reader has already consumed, which will imply blocking on a Set Promise, which in turn implies the it needs to be able to be Unset, which is basically MVar.

The key realisation however, is that this doesn't buy you much, really. To build the vast majority of concurrent structures and combinators, you still need multiple MVars, i.e. MVar is a building block at the same level of abstraction of Promise, except with more complicated semantics, and a greater chance of deadlock due to multiple points in which blocking can happen. Promise achieves the same in a simpler, easier-to-get-right way.
Indeed, one can build MVar itself using Ref + Promise, but then you could directly build the more complicated structure you'd use MVar for using only Ref + Promise, like the async package shows.

OldRef allowed multiple set only to implement modify, which in turn was only needed because OldRef doubled down as both a synchronisation primitive and a concurrent mutable reference. Now that these two concerns have been split, single set semantics for Promise make the most sense imho.

Now the remaining question is what to do if a user tries to set a Promise twice, the possibilities are:

  • the F returned by set is successful, but has no effect
  • same as above, but a Boolean is returned to signal a successful set or a noOp
  • the returned F fails

Note that scenarios where set has the potential of legitimately being called twice are pretty much "race-like" only, in which case a mechanism to signal the loser that it has lost is useful (so either a Bool or a failed F). The question is whether we want to make this the default behaviour or not, since it imposes a cost on the more common case.

I don't have a strong opinion on this latter point, but at the moment I'm edging towards having a single method for all cases, which fails when called on a Promise that's already Set

Copy link
Contributor

@pchlupacek pchlupacek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @mpilquist so that the second set could be handled in attempt. In fact I would propose that return type to be explicitly suggesting that as F[Either[Throwable, Unit]] instead just F[Unit] so the user is forced to resolve that case.

t0 <- self.cancellableGet.run
(a, cancelA) = t0
t1 <- b.cancellableGet.run
(b, cancelB) = t1
fa = a.run.map(Left(_): Either[A, B])
fb = b.run.map(Right(_): Either[A, B])
_ <- async.fork(fa.attempt.flatMap(ref.setAsyncPure))
_ <- async.fork(fb.attempt.flatMap(ref.setAsyncPure))
_ <- async.fork(fa.attempt.flatMap(x => promise.setSync(x)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we get rid of x?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I left it there because I already have code to make this function simpler, exploiting the fact that set will fail on second set (and that needs the x)

// and the fact that suspend is roughly 2x faster than delay flatMap
def get: F[A] = F.suspend {
val id = new Token
getOrWait(id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just getOrWait(new Token)?

Copy link
Collaborator Author

@SystemFw SystemFw Dec 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new Token is a side effect in this case, because Tokens are compared with reference equality, so I think that suspend it's needed

}

ref.modify2 {
case s @ State.Set(_) => s -> F.unit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for raising error violating the semantics

//
// NOTE: this differs in behaviour from the old Ref in that by the time readers are notified
// the new value is already guaranteed to be in place.
def setSync(a: A): F[Unit] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to trySet

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sprry set

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had renamed it to set, then changed my mind. set: F[Unit] does not have enough information to tell you if the operation is synchronous or asynchronous, especially because it's now up to the user to fork to get async behaviour. setSync is more informative in this respect, and doesn't force you to go read the scaladoc. Not too hung up on this though

// NOTE: this differs in behaviour from the old Ref in that by the time readers are notified
// the new value is already guaranteed to be in place.
def setSync(a: A): F[Unit] = {
def notifyReaders(r: State.Unset[A]): Unit =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this private so trySet and set may share this implementation

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking trySet can go altogether if we raiseError on second set (which I will do, I think), since one can use handleError or attempt


/** Like [[get]] but returns an `F[Unit]` that can be used to cancel the subscription. */
def cancellableGet: F[(F[A], F[Unit])] = F.delay {
val id = new Token
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really think it is any longer necessary to wrap this in delay, instead I would suggest just to return tuple (F[A], F[Unit]) there.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see reply above. new Token is a side effect, due to the use of reference equality for Token


override def get: F[A] = F.flatMap(F.delay(new MsgId)) { mid => F.map(getStamped(mid))(_._1) }
def access: F[(A, A => F[Boolean])] = F.delay {
def snapshot = ar.get
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think snapshot must be val for this to work correctly

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mm it works regardless since snapshot is evaluated when I put it into the tuple, but val is clearer in any case, so I'll change it to that. Good catch!

Copy link
Collaborator Author

@SystemFw SystemFw Dec 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah no, you're right!
get in access and snapshot is different. Great catch then!
(Not used to side effects anymore :P )

@@ -114,8 +111,8 @@ package object async {
* nowhere.
*/
def race[F[_]: Effect, A, B](fa: F[A], fb: F[B])(implicit ec: ExecutionContext): F[Either[A, B]] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the either signature, but couldn't be the => F[C] just replaced by flatMap ?

@@ -47,8 +47,8 @@ abstract class Signal[F[_], A] extends immutable.Signal[F, A] { self =>
def get: F[B] = self.get.map(f)
def set(b: B): F[Unit] = self.set(g(b))
def refresh: F[Unit] = self.refresh
def modify(bb: B => B): F[Change[B]] = modify2( b => (bb(b),()) ).map(_._1)
def modify2[B2](bb: B => (B,B2)):F[(Change[B], B2)] =
def modify(bb: B => B): F[Ref.Change[B]] = modify2( b => (bb(b),()) ).map(_._1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to prefix this with Ref? How about moving Change to top-level fs2? We did similar with Scope, and I thing Change i used much more often.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a slight preference (in general) for keeping the top-level small but I can definitely move Change back to where it was if you prefer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we end up adding MVar, we'll need a top-level Change I think. If not, I'd rather keep it contained in Ref companion to keep top level concepts smaller.

@@ -1,65 +1,65 @@
package fs2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is gonna be implemented?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I already have that code, I'll push shortly

@SystemFw
Copy link
Collaborator Author

SystemFw commented Dec 2, 2017

Replied to review comments, except

in fact I would propose that return type to be explicitly suggesting that as F[Either[Throwable, Unit]] instead just F[Unit]

I agree with raising error on second set, but not with an explicit Either I don't think. The vast majority of code already calls set once (the only exception is races), and I don't want to add a void everywhere. Double setting here feels more suited to a failed F, and you can attempt in those rare cases you need to double set imho

@pchlupacek
Copy link
Contributor

@SystemFw re the Either[Throwable,Unit] signature. I did that for scope too, and in fact was only way to discover few subtle bugs I was trying to catch. I really think we shall do this there, as finding the double sets will be really hard.

@SystemFw
Copy link
Collaborator Author

SystemFw commented Dec 2, 2017

as finding the double sets will be really hard.

I don't understand why this is though. With a double set your task will fail with an exception telling you "you have tried to set this to 1, but it was already set to 2", and you won't be forced to do .flatMap(F.fromEither) everywhere else. Are there any other cases apart from AsyncPull race where a double set is done?

In any case I'll do the rest of the work and we can then return on this, it's just an attempt away :)

@SystemFw SystemFw force-pushed the feature/ref-noActor branch from 0206f36 to 6d5f172 Compare December 2, 2017 18:04
@SystemFw
Copy link
Collaborator Author

SystemFw commented Dec 3, 2017

@mpilquist @pchlupacek I think this PR is now in a mergeable state, bar the final minor decisions over the exact naming and type signatures of some operations.

Please note that I've left a few things to separate PRs:

  • I haven't touched race. I agree with all the proposed changes, but it's a self-contained piece of work so I'd rather have it in another PR. I'm already working on it
  • Benchmarks. It would be nice to benchmark this more extensively. Running the initial benchmark by @pchlupacek shows that Promise is just 2x slower than Ref, and they are both an order of magnitude faster than the old Ref.
  • Pending tests. This change has uncovered three race conditions in existing tests. Two should already be fixed in Try to eliminate race condition in QueueSpec #1008 , Topic: synchronous publish needs fixing.
  • Moving F to the methods. A previous commit explored the possibility of moving F, Effect and the EC down to the individual methods. This is probably the more open ended task.

Let me know what you think

@SystemFw SystemFw changed the title [WIP: do not merge] Explore Ref without Actor New concurrency primitives Dec 3, 2017
@mpilquist mpilquist merged commit 0137171 into typelevel:series/0.10 Dec 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants