
[prototype] boost fiber instead of asio for async #16699

Closed
wants to merge 18 commits

Conversation

fishbone
Contributor

@fishbone fishbone commented Jun 28, 2021

Why are these changes needed?

  • Add gRPC client-side support for fibers.
  • Use a fiber for GetObjectLocationOwner and update its usage in the code.

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@fishbone fishbone added the do-not-merge Do not merge this PR! label Jun 28, 2021
@ericl
Contributor

ericl commented Jun 29, 2021

What's blocking us from merging this? I guess:

  • Feature parity with event stats for ASIO
  • If we depend on the asio loop for thread safety, we need to port all other callbacks on the same loop to fibers as well

cc @pcmoritz @stephanie-wang for comments

@@ -40,6 +40,14 @@ namespace rpc {
INVOKE_RPC_CALL(SERVICE, METHOD, request, callback, rpc_client); \
}

#define FIBER_RPC_CLIENT_METHOD(SERVICE, METHOD, rpc_client, SPECS) \
Contributor

Will this have reduced performance compared to the RPCs run directly on gRPC threads?

How does thread pooling work for fibers? Do we have a pool of fiber threads?

Contributor Author

In the gRPC thread, it'll move the result out, so it won't reduce performance.
Comparison:

  • Previously, we posted a job to the io_thread pool, i.e. a job was created in the asio pool.
  • Here, we pass the result to a promise with a move. The extra cost is moving the structure, which is OK. For example, this is the pub/sub message's move assignment: if the arenas are the same, it just swaps, and we always use the default arena.
  inline PubMessage& operator=(PubMessage&& from) noexcept {
    if (GetArena() == from.GetArena()) {
      if (this != &from) InternalSwap(&from);
    } else {
      CopyFrom(from);
    }
    return *this;
  }

No, we don't have a pool of fiber threads in this prototype. We just reuse the thread that calls into CoreWorker, which can be optimized later. Ideally we should put everything that is not CPU-intensive onto one thread (the fiber thread) and put CPU-intensive work into a CPU thread pool.

A fiber thread is like an asio thread; the difference is that asio only manages tasks, where a task is a closure, while fiber also manages the context and scheduling of those tasks. With asio we wrap everything into lambda captures and lose the call stack; with fiber the state lives in the fiber's context, which makes tasks resumable.

In the middle of the migration we can share the same thread with asio; if we go this way, we can migrate granularly.
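
To make the mechanism above concrete, here is a minimal sketch of the wrapping pattern (the helper name CallSync and the Reply/AsyncCall parameters are hypothetical, not the actual macro in this PR): the gRPC callback moves the reply into a boost::fibers::promise, and the calling fiber blocks on the matching future without blocking the underlying thread.

#include <boost/fiber/future.hpp>
#include <utility>

// Sketch only: wrap a callback-style async RPC into a call that blocks the
// current fiber. The gRPC thread fulfills the promise by moving the reply in;
// future.get() suspends only this fiber while other fibers keep running.
template <typename Reply, typename AsyncCall>
Reply CallSync(AsyncCall async_call) {
  boost::fibers::promise<Reply> promise;
  boost::fibers::future<Reply> future = promise.get_future();
  // Capturing the local promise by reference is fine here because we block
  // in this scope until the callback has fulfilled it.
  async_call([&promise](Reply reply) { promise.set_value(std::move(reply)); });
  return future.get();
}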

@scv119
Contributor

scv119 commented Jun 29, 2021

My major concern is the team's expertise with fiber.

  • I was under the impression that fiber is a bit tricky to use and debug, and lacks compiler support [1].
  • I know Facebook tried using fiber and moved away from it [2].
  • Another thing to take into consideration is that coroutines have become part of the C++20 standard; they should be very similar to fibers but with language/compiler support.

To summarize my take on the trade-offs of adopting boost::fiber:

  • pros: we should see a big win in terms of code readability and a slight win in performance. My guess is that most of the time we won't have any issues with it.
  • cons: if a weird fiber bug ever happens, it might be very hard to debug (due to limited adoption, heterogeneous runtime environments, and compilers). Also, if the C++ standard eventually moves to coroutines, we might need to migrate to coroutines in the future.

@ericl
Contributor

ericl commented Jun 29, 2021 via email

@fishbone
Contributor Author

@scv119 I read the doc and I don't think its arguments hold up well. Go's goroutines are also stackful; that gives you benefits but can suffer from stack overflow. As for context switches, since we'd like to adopt a better threading model anyway, they should be avoidable.

Besides, for C++20, even current compilers don't support coroutines well; broad adoption will take a long time, maybe another five years, so I don't treat it as a solution here. Also, migrating later from a library-based coroutine to the compiler-supported one is not difficult. I remember you mentioned the code was hard to understand because of all the callbacks, right? And the code flow here is much cleaner compared with the previous one, right?

The performance gain comes from avoiding copying things into lambda captures compared with asio, but that's not the key benefit.

@fishbone
Contributor Author

@ericl promise/future/then can't give you something similar to this. Here, everything can be written in one function directly, but with promise/future/then we basically need to split the big function into parts, chaining functions together. Some patterns look awkward; for example, if you have a for loop and want to decide what to do next based on each result, it's not natural to write with promise/future/then (see the sketch after this comment).

If we were to build promise/future/then support around asio ourselves, I'd prefer something that is already built and has been maintained for a long time.
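
A hedged sketch of that point, using hypothetical Location/TryGetLocation names (not the Ray API): with a fiber, a retry loop that decides its next step from each result reads like ordinary synchronous code, whereas with promise/future/then it has to be rewritten as a chain of continuations that re-posts itself.

#include <boost/fiber/operations.hpp>
#include <chrono>
#include <optional>

struct Location { bool resolved = false; };
// Hypothetical call that blocks only the calling fiber (e.g. via a fiber future).
std::optional<Location> TryGetLocation();

Location ResolveLocation() {
  while (true) {
    if (auto loc = TryGetLocation(); loc && loc->resolved) {
      return *loc;  // the loop decides what to do next directly from the result
    }
    // Back off without blocking the underlying thread; other fibers keep running.
    boost::this_fiber::sleep_for(std::chrono::milliseconds(100));
  }
}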

@ericl ericl removed their assignment Jun 29, 2021
@fishbone
Contributor Author

@neverchanje

neverchanje commented Jul 9, 2021

I've just seen this PR by chance while searching GitHub for issues around boost.fiber. I don't have much knowledge of Ray, but I'm very interested in this discussion.

I recently built a Postgres database prototype using boost.fiber. The backend is multi-threaded with a thread-pool-like design, but with fibers inside, and the network module is based on Boost.Asio. So far I have not hit the issues stated in http://www.openstd.org/jtc1/sc22/wg21/docs/papers/2018/p1364r0.pdf. For instance, the paper says the fiber runtime cannot integrate with tcmalloc, but at least so far my server hasn't crashed because of tcmalloc.

Debuggability really is an issue. In my experience, application bugs are as easy to detect as in a normal program, but my real concern is that if the library isn't widely adopted and stable enough, I may not be able to dig out the library's own bugs. When I stepped through the Boost.Fiber call stack in gdb, the code was quite hard to follow.

I can share my fiber pool implementation if you are interested.

Furthermore, the paper also states that the fiber runtime requires the entire project to be refactored and every blocking API to be wrapped (via promise/future) into a fiber-aware API. This is true, but there is an approach where different runtimes can coexist:

For example, assume we have a project based on the actor model, where a task runs asynchronously and finally delivers its result to a callback. To refactor this task with boost.fiber, you can set up a set of isolated threads dedicated to fibers. The task is then sent to the "fiber-instrumented threads" and executed in a synchronous programming style, benefiting from fibers. When the task ends, it sends the result back to the "actor-instrumented threads" to finally execute the callback.

void RunTask(std::function<void()> task, std::function<void()> task_callback) {
  // Hop onto a fiber-instrumented thread, run the task in synchronous style,
  // then hand the callback back to the actor-instrumented threads.
  FiberAsync([task = std::move(task), task_callback = std::move(task_callback)]() {
    task();
    ActorAsync(task_callback);
  });
}
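
For readers wondering what a "fiber-instrumented thread" might look like, here is a rough, hedged sketch (not neverchanje's actual pool): a single dedicated thread fed through a boost::fibers::buffered_channel, where FiberAsync from the snippet above could simply post into it and each task is spawned as its own fiber.

#include <boost/fiber/all.hpp>
#include <functional>
#include <thread>
#include <utility>

// Sketch of one dedicated fiber thread; FiberAsync could call Post() on a
// long-lived instance of this class.
class FiberThread {
 public:
  FiberThread() : tasks_(64), worker_([this] { Run(); }) {}
  ~FiberThread() {
    tasks_.close();   // unblocks pop() so the worker can drain and exit
    worker_.join();
  }
  void Post(std::function<void()> task) { tasks_.push(std::move(task)); }

 private:
  void Run() {
    std::function<void()> task;
    while (tasks_.pop(task) == boost::fibers::channel_op_status::success) {
      boost::fibers::fiber(std::move(task)).detach();  // one fiber per task
      boost::this_fiber::yield();  // give the new fiber a chance to run
    }
  }
  boost::fibers::buffered_channel<std::function<void()>> tasks_;
  std::thread worker_;
};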

@scv119
Contributor

scv119 commented Jul 9, 2021

I connected with @neverchanje, and he shared with me a lot of positive experiences using fiber. Apparently @neverchanje is a few steps ahead of us in evaluating boost::fiber, and we will exchange more details of the fiber evaluation. I learned from this conversation that boost::fiber might not yet support Apple M1 chips, as boost::fiber uses assembly to implement its usercontext/user-level stacks. This could affect Ray's M1 support. cc @wuisawesome @simon-mo

@simon-mo
Contributor

simon-mo commented Jul 9, 2021

I learned from this conversation that boost::fiber might not yet support Apple M1 chips, as boost::fiber uses assembly to implement its usercontext/user-level stacks.

This is interesting. As a data point, we already use boost::fiber for async actor support: https://github.com/ray-project/ray/blob/master/src/ray/core_worker/fiber.h.

@simon-mo
Contributor

simon-mo commented Jul 9, 2021

boost::fiber might not yet support Apple M1 chips, as boost::fiber uses assembly to implement its usercontext/user-level stacks.

@scv119, actually boost::fiber should work on M1, because boost::fiber has assembly for ARM and M1 is ARM-based. We already use boost::fiber, and Ray already compiles on ARM chips (on Graviton and on M1).

@scv119
Contributor

scv119 commented Jul 9, 2021

That's great to know :)

@neverchanje

I thought it would require some special customization for Mac arm. Good to know it works.

@fishbone
Contributor Author

Thanks @neverchanje for sharing. The only thing that still concerns me is the maturity of the library itself, which we need to evaluate more.

As for debuggability, right now it's hard to list the user fiber stacks since fiber isn't integrated with gdb; only the active ones can be listed. But it won't make things worse: today we use asio, and we can't list its queued tasks either. Btw, we won't have multiple physical threads executing fiber tasks; for simplicity there is only one physical thread for all fiber tasks.

As for refactoring the entire project for adoption, that's not an issue here either, since our program is already written in an async way. We can just add a new future/promise-based API and gradually change the async functions into sync ones. The asynchrony originates on the gRPC side, and it's not difficult to update.

@fishbone
Contributor Author

I'll close this PR for now since we've decided to experiment with GCS services.

@fishbone fishbone closed this Jul 21, 2021
@fishbone fishbone deleted the fiber-cb branch November 23, 2021 05:54
@fishbone fishbone restored the fiber-cb branch November 23, 2021 05:54