-
Notifications
You must be signed in to change notification settings - Fork 152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merge] Improve conception #192
Comments
Thanks for opening this. While I get your concern for simpler code the merge operator is a pretty complex piece. At the same time the new implementation is up to 3 times faster since it avoids spawning all these tasks. Feel free to open a PR but I want to point out some of the requirements you need fulfill;
Happy to discuss more here or in the PR. |
Hi @FranzBusch thanks for those insights. From what I can tell, those requirements are fulfilled by the implementation I have in mind. I would appreciate if you could take a look at the PR I will publish tomorrow. thanks a lot. |
@FranzBusch here is the PR #193 The code is simpler for sure but I decided to reuse and compose AsyncMerge2Sequence in order to make the version with 3 params. I guess we also could do a specific Merge3StateMachine but not sure it worth it. @phausler what do you think ? |
I think merge and friends in the async-algorithms package are not per-se a goal to make them simple; instead we should focus on correctness and performance. Particularly the heap cost of AsyncChannel is probably going to be higher than the alternate approach. So in short I think that we should focus on first off correctness of behavior (which the Sendable requirement for merge is part of that), second we should then focus on task creation overhead (e.g. we should strive for eliminating per-element tasks). Then after those two are done we should focus on reducing heap overhead. And finally for perf we should strive to avoid lock contention. I think both of these approaches at a high level analysis meet the concept of correctness; which is great! And they (again at a 10000 meter view) reduce the per element task creation down to a per side based task creation which is also a distinct improvement. But as we look at potential heap impact and lock contention impact #193 has a slight advantage (at the cost of code complexity). I guess id rather us tackle the hard problems in the async-algos package so that users of this package don't need to. |
I agree with @phausler here. While I think the idea of backing algorithms with other primitives is a nice idea it often times comes at a performance downside. In the case of backing Furthermore, I think @phausler hits it on the head here. We should solve the hard problems here once, in the most performant way possible. It is okay to duplicate some logic if that comes with a significant performance upside. My plan is to expand the test coverage of the |
Hi to both of you. My intent was not to find at all cost a simpler solution. But if we can have something that is simple to understand, and maintain, that is correct and performant enough, my point is that it is worth giving a shot, even if it means reusing established components. you are right in the overhead induced by AsyncChannel and I’m working on a variation using AsyncStream that already seems to improve perf. @FranzBusch from my point of view an AsyncStream is somehow the equivalent of all the mechanics you have to manage your internal buffer. I’m just trying to not reinvent it (or perhaps I misunderstood something?) I will push an update tomorrow with the optimized code. If it still does meet our requirements then I will give up 😀. Have a good one. |
I would argue the code that I wrote is simple since it contains a pretty straight forward state machine and all transitions can be easily seen by just reading over the switch. The only really complicated part is the setup of the task group. However, I understand that reusing something existing looks simpler. While AsyncStream might be more performant even then you can't achieve the same level. Since every AsyncStream comes with a lock and a separate heap allocation for the buffer inside the stream. |
Hi there, I've updated the PR and its description. My work intends to reach several goals:
Some hints about the conception:
|
The PR has been abandoned in favour of the better implementation of @FranzBusch performance wise. Thanks everyone for the nice discussions. |
Hi @phausler, I see an attempt of avoiding creating too much tasks and share Iterators between them in the merge algorithm #185
Although it might work well, the resulting code is pretty complex.
I feel we could leverage async channels as a way of merging elements from several tasks that would consume the upstreams. It would simplify a lot the code. I have a working implementation on a branch that also handles rethrowing. I will open a PR tomorrow to suggest my implementation if you are willing to take a look.
thanks.
The text was updated successfully, but these errors were encountered: