Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Ref #993

Closed
SystemFw opened this issue Nov 24, 2017 · 26 comments
Closed

Improve Ref #993

SystemFw opened this issue Nov 24, 2017 · 26 comments
Assignees
Milestone

Comments

@SystemFw
Copy link
Collaborator

SystemFw commented Nov 24, 2017

This can be fleshed out, but for now let's collate the ideas about things worth improving in Ref.

  • Is Actor necessary or can we use something more lightweight? What's the actual overhead of Actor ?

  • The behaviour of setSync(fa: F[A]) and setAsync(fa: F[A]) has been weird for a while. A failure in fa will make them succeed, and the subsequent get will fail. setSync could be easily fixed, and probably replaced by fa.flatMap(setSyncPure(a)) altogether. Can we do the same with setAsync? Is having setAsyncPure enough (so that the fa will run synchronously, and the setAsyncPure asynchronously)

  • Related to the above, if actions in set* can't fail, do we need Either[Throwable, A] as the state of the Ref? Shouldn't a reference to A be enough? Things like start could be recovered via an explicit Ref[F, Either[Throwable, A]]

  • As a counterpoint to the above two points, what about modifyF (RFE: Ref/Signal modifyF #888). To have the expected semantics (effect only run once, Change is faithful), the only way I could envision is sending a closure (or an IO) to the Actor, which is only run when it wins the contention race. Unclear at the moment if that invalidates the actions that set the Ref shouldn't be able to fail hypothesis, and therefore cement the need for having an Either[Throwable, A] as the state of Ref.

  • As a broader point, users now have runAsync and shift in their API, which they didn't before, which might influence how (or whether) we rethink Ref

If this sounds non-organic and confused, it's because it still is (in my head), please challenge/correct.

Relevant discussions:
#989
#986
#888
#974 (initial push, not much in terms of discussion of Ref)

@mpilquist @pchlupacek

@pchlupacek
Copy link
Contributor

@SystemFw Just a small clarification #989 did not introduce the Ref, was already there :-) Also note that every nondet code in fs2 is using Ref, i.e. merge, join, essentially any Async operation at every single step.

@SystemFw
Copy link
Collaborator Author

SystemFw commented Nov 24, 2017

True, thanks. I'll just remove that point.

@pchlupacek
Copy link
Contributor

@SystemFw just maybe for some thoughts re performance:
given this program https://gist.github.com/pchlupacek/d80a947ac4fdd6f72649a5ae57cced59
this are the results of sync/async comparison

Ops: 10000000
Execution of : SYNC: setSyncPure took 0.739 s
Execution of : SYNC: setSyncPure took 0.618 s
Execution of : ASYNC: setSyncPure took 13.774 s
Execution of : SYNC: setAsyncPure took 0.725 s
Execution of : ASYNC: setAsyncPure took 2.939 s
Execution of : SYNC: get took 1.661 s
Execution of : ASYNC: get took 18.168 s
Execution of : SYNC: modify took 1.612 s
Execution of : ASYNC: modify took 19.58 s

Process finished with exit code 0

which imho clearly shows the overhead of the actor is significant.
Would be also interesting compare actor with sync ref in high-contention scenario where many threads fight for the one to be complete.

@SystemFw
Copy link
Collaborator Author

SystemFw commented Nov 27, 2017

I'm making decent progress in removing Throwable from the internals of Ref.
Ref[Either[Throwable, A]] allows to have the same behaviour explicitly (much clearer imho).
The old behaviour of setAsync can be obtained using fork.

@pchlupacek
Copy link
Contributor

Sort of ad hoc idea, but not sure about the impacts. I don't like the fact that Ref may be bot set/unset initially. How about having SetRef and UnSetRef, where SetRef will have get and UnSetRef will not have get but only getIfSet? same I guess for modify.

@SystemFw
Copy link
Collaborator Author

SystemFw commented Nov 27, 2017

What would be the semantics of getIsSet?
Generally speaking, I like the behaviour of get as it is, as it allows Ref to be used as a proper concurrency primitive a la MVar, rather than just a thread-safe IORef. In particular, the concurrent data structures make great use of this.
Also, in an ideal world (and maybe in the real one too, we'll see :P), I'd like to have Ref be so fast that we don't need SyncRef anymore, i.e. I'd like less Ref types, not more.

Now, if you mean that getIfSet should behave like the current get, and you also want another datatype with a get that's guaranteed to return, I believe using SyncRef already gives your that, doesn't it?

Thoughts?

@mpilquist
Copy link
Member

Also, in an ideal world (and maybe in the real one too, we'll see :P), I'd like to have Ref be so fast that we don't need SyncRef anymore, i.e. I'd like less Ref types, not more.

Remember we also need SyncRef for type constructors with Sync instances and without Effect instances.

@SystemFw
Copy link
Collaborator Author

SystemFw commented Nov 27, 2017

Ah, didn't think about that, that's very good point

@pchlupacek
Copy link
Contributor

GetIfSet would behave completely like current get, but will be available only on Unset ref, whoch in turn will miss get. Similarly SetRef will have only get but will be missing getIfSet. Is more like tupesafe way to distinguish between set/uset ref. Also i believe that setref could be then in fact SyncRef as it is today.

@pchlupacek
Copy link
Contributor

I have been thinking about this a more, and I think we don't need AsyncRef and SyncRef. I think currently, all the functionality of the AsyncRef may be implemented with the AtomicReference except one, and thats the initial set/get of the UnSetReference.

So essentially I think we should end up with SetRef that is essentially current SyncRef. As a second we shall have the UnsetRef that behaves differently when performing get and modify for the first time, and thus I suggest these two operations have a different name, as suggested before. UnsetRef also won't have get and modify available.

This will also allow, that both references will require only Sync[F] to be available at creation time (even though that is perhaps not necessary at all) and have EC and Effect[F] to be implicitly available as a context to getIfSet and modifyIfSet (not stick to the naming, but I think they have to have different name, as they do have different behaviour).

Note that this also solves the situation, where you want to supply your own EC when performing async operations (ie. setAsync, getIfSet, ...) . Currently that is not possible, as the EC is inherited from the context where Ref was created. Note that this actually allows proper F.shift to be used as intended (and not automagically inserted from time the Ref was created.

Also I believe that this approach may lead to Ref that does not have F as type parameter, As I am not sure we really need that type param. So for environment where you don't want to do a lot of F ~> G translations, this may be another performance benefit (i.e. you can set ref with IO, and read it with Future). Obviously having the F less Ref, could mean we would be able to define a lot of asynchronous utilities F less too.

Any thoughts?

@SystemFw
Copy link
Collaborator Author

SystemFw commented Nov 28, 2017

Just for the record, yesterday night I started an attempt to build a Ref (including the semantic blocking on empty) using AtomicReference only. At the moment it's broken (and I have to add more tests to RefSpec anyway, if we're going to touch it more deeply), but you can get an idea, and perhaps offer suggestions as to why it's not working yet, here:
https://github.com/SystemFw/fs2/blob/feature/ref-noActor/core/shared/src/main/scala/fs2/async/Ref.scala

EDIT: Just found (one of) the bugs

@mpilquist
Copy link
Member

Sounds great to me. A single implementation would make it easier to include in cats-effect too.

@SystemFw
Copy link
Collaborator Author

General fs2 test suite passing with an Actor-less Ref.
Would be nice to have a more comprehensive suite for Ref only, and some benchmarks to see if this is faster as @pchlupacek expected, or not.

@pchlupacek
Copy link
Contributor

@SystemFw I think the AsyncRef could be fully implemented with SyncRef. Generally the idea would be that Async or UnsetRef would be implemented with SyncRef[Either[A, Vector[F[Unit]]] or perhaps thanks to LiftIO we could do SyncRef[Either[A, Vector[IO[Unit]]]. So only thing you really need to implement is the asynchronous callback when ref is not set, and on set making sure to start all IOs.

@pchlupacek
Copy link
Contributor

@SystemFw when you will be done we can run the simple test above. ...

@SystemFw
Copy link
Collaborator Author

SystemFw commented Nov 28, 2017

The version I have now seems to be working, but it maintains the general idea of the old Ref. Once I consolidate it a bit we can decide if it's Ok, or if we should be making an attempt to revise the implementation strategy more deeply (e.g. by having it built on top of SyncRef).
At the moment the actor messages are replaced by nonblocking implementations using AtomicReference directly, and the rest of the implementation is as it used to be, except that instead of sending a message to the actor, it calls the nonblocking "primitives" via async

@SystemFw
Copy link
Collaborator Author

That simple test is a good start yeah, but probably needs to be expanded to test contention as well.

@pchlupacek
Copy link
Contributor

I would say that simple test shown rationale, if we would be close to it ti should give us enough confidence that we didn't slow things down.

@pchlupacek
Copy link
Contributor

@SystemFw
Copy link
Collaborator Author

SystemFw commented Nov 28, 2017

We might not need that schedule indeed. At first, I tried to replicate the behaviour of the Actor closely (and submitting a message there would trigger a submission I think), but perhaps some of those can be eliminated. I'll ping you in an RFC PR at some point today, and we can dissect it :)

@pchlupacek
Copy link
Contributor

ok sure perfect. Good job :-)

@SystemFw
Copy link
Collaborator Author

Initial benchmarks show pretty much identical performance for Ref implemented with or without Actor (lol) compared to SyncRef, i.e. a lot slower. However the non-Actor implementation allows taking a closer look at what's going on from a code point of view. I'll keep investigating.

@pchlupacek
Copy link
Contributor

@SystemFw, interesting. I think this comes to complexity which unset state brings to the table. I think we have to look on this, really I believe that hence the unset is only done once in whole lifetime of ref, we shall really get away all that orchestration whenever possible.

@SystemFw
Copy link
Collaborator Author

Yeah, I opened a PR so we can start analysing what's going on

@durban
Copy link
Contributor

durban commented Nov 28, 2017

FYI, I've also opened a PR related to this: #1007.

@SystemFw
Copy link
Collaborator Author

The code in #1006 was mostly a proof of concept to show that it can be done. I'm aiming at a more radical redesign @durban

@SystemFw SystemFw self-assigned this Nov 28, 2017
@mpilquist mpilquist added this to the 0.10 milestone Jan 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants