
Remove unbounded queue on server-side stream reception #553

Merged

Conversation

timbertson
Contributor

This is #545 but on top of #552 (i.e. targeting main).

As in the original PR, the first commit is #552, the second commit contains the changes specific to this PR.

@ahjohannessen
Collaborator

I will review when back from holidays. The only branch that is maintained is main.

@timbertson
Contributor Author

No worries, thanks. I've closed the 0.x PRs to focus on main 👍

@timbertson
Contributor Author

ping @ahjohannessen (no worries if you just haven't gotten to it, just pinging in case you're like me and forget what you were doing when you come back from vacation ;) )

@ahjohannessen
Collaborator

@timbertson Did you look at this as inspiration for your PR?

Also, there is another PR that accounts for backpressure here. I'm not sure what the best approach forward is.

@timbertson
Contributor Author

timbertson commented Sep 16, 2022

@ahjohannessen yes, the first commit in this PR is largely a port of #39 updated to main.

I don't believe #503 accounts for backpressure. In the thread, the author describes that backpressure semantics are unchanged compared to main. That is, the implementation will still interoperate with other backpressure-respecting implementations via call.request, but it does not itself respect backpressure. The author of that PR also doesn't seem to consider it necessary to respect backpressure:

> grpc-java says free to ignore this and main branch does.

I disagree with this: if you disregard backpressure you can still write a correct app, but you'd have to implement your own backpressure out-of-band (or provision enough memory that your app won't die). Having a library make this choice for you is clearly suboptimal, and quite surprising given how important backpressure is in fs2.

There will be merge conflicts if we want both changes; I'm not sure which order you'd prefer. If you want to merge that PR first then I will take on the work of updating this PR, but I didn't want to do that prematurely. My (biased) preference would be to focus on correctness (i.e. this PR) before performance, as that other PR would probably just result in my app dying quicker 😅
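To make the distinction concrete, here is a minimal toy sketch (plain Scala with a `java.util.concurrent` queue; the names and types are illustrative, not fs2-grpc's actual API) of why a bounded buffer propagates backpressure while an unbounded one just grows until memory runs out:

```scala
import java.util.concurrent.ArrayBlockingQueue

object BackpressureSketch {
  // With a bounded queue, `offer` fails once capacity is reached, so the
  // receiver knows to stop requesting messages (i.e. to withhold the next
  // `call.request(1)`) until a slot is drained. An unbounded queue would
  // accept every message regardless of how slow the consumer is.
  def boundedOffer(capacity: Int, messages: Seq[String]): (Int, Int) = {
    val q = new ArrayBlockingQueue[String](capacity)
    val accepted = messages.count(q.offer(_))
    (accepted, messages.size - accepted) // (buffered, deferred upstream)
  }
}
```

For example, `BackpressureSketch.boundedOffer(2, Seq("a", "b", "c"))` buffers two messages and defers one; the deferred message is what an unbounded implementation would silently accumulate in memory instead.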

import cats.implicits._
import cats.effect.SyncIO
import cats.effect.kernel.Concurrent
import cats.effect.std.Dispatcher
import io.grpc.{ClientCall, Metadata, Status}

class Fs2StreamClientCallListener[F[_], Response] private (
    ingest: StreamIngest[F, Response],
    signalReadiness: SyncIO[Unit],
Collaborator

Any special reason for SyncIO? Have you considered SignallingRef from fs2?

Contributor Author

I'm not super fluent in the guts of cats/fs2, so happy to take recommendations.

SyncIO seems useful somewhere in the mix, because:

- the invocations from the gRPC listeners need to run synchronously, so we need either SyncIO or a Dispatcher
- the outgoing unary call types set signalReadiness = SyncIO.unit, because there's no need to respect readiness with only one outgoing message.
  - It'd be wasteful to call a dispatcher in that case just to run F.unit.
  - Though now that I think about it, a single outgoing message probably never triggers the onReady code path, so maybe it doesn't matter? 🤷

I refactored to use SignallingRef, which seems nicer thanks: a1668d7.

As part of that I removed the SyncIO from the StreamOutput class, but it's still used in various listeners due to the above reasoning.

If you prefer consistency and aren't worried about onReady for unary calls being a bit wasteful (since it's probably never called), I think I can just use F across the board.
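As a rough illustration of the trade-off being discussed (all names here are invented for the sketch; this is not fs2-grpc's API): a synchronous listener callback can run a no-op or a synchronous thunk directly on the calling thread, or it can pay for a dispatcher submission, which is overkill when the work is just `F.unit`:

```scala
object ReadinessSketch {
  sealed trait Signal { def run(): Unit }

  // ~ SyncIO.unit: a unary call has nothing to signal, so this costs nothing.
  case object NoOp extends Signal { def run(): Unit = () }

  // ~ SyncIO(...).unsafeRunSync(): executes directly on the calling (gRPC) thread.
  final case class Sync(thunk: () => Unit) extends Signal {
    def run(): Unit = thunk()
  }

  // ~ Dispatcher[F].unsafeRunSync(fa): hands the work to an effect runtime,
  // modelled here as a plain function that schedules the thunk.
  final case class Dispatched(dispatch: (() => Unit) => Unit, thunk: () => Unit) extends Signal {
    def run(): Unit = dispatch(thunk)
  }

  // What a synchronous onReady handler would invoke:
  def onReady(signal: Signal): Unit = signal.run()
}
```

The point of the `NoOp` case is exactly the unary argument above: routing a no-op through a dispatcher buys nothing but overhead.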

def startCall(call: ServerCall[Request, Response], headers: Metadata): ServerCall.Listener[Request] =
  startCallSync(call, opt)(call => req => call.stream(impl(req, headers), dispatcher)).unsafeRunSync()

def startCall(call: ServerCall[Request, Response], headers: Metadata): ServerCall.Listener[Request] = {
  val outputStream = dispatcher.unsafeRunSync(StreamOutput.server(call, dispatcher))
Collaborator

@ahjohannessen ahjohannessen Sep 16, 2022

Any way to reduce usage of dispatcher.unsafeRunSync? It is generally harmful for performance in grpc-java.

Contributor Author

@timbertson timbertson Sep 19, 2022

I don't think so. Here the GRPC startCall interface demands a synchronous return value, but we can't build the Ref[F,_] or SignallingRef[F,_] needed by StreamOutput without a dispatcher.

I could hoist it up, i.e. return an F[ServerCallHandler[_, _]], and then startCall could synchronously turn it into a full StreamOutput by supplying the parts needed from ClientCall. But tracing the call chain, everything is synchronous all the way up to unaryToStreamingCall, which I believe is invoked by the generated code, so we'd need to change that to handle an F[ServerCallHandler[Request, Response]].

I'm hoping that since this is only needed for streaming use cases, the overhead will matter less than it does for unary calls.

  c: ClientCall[Request, Response],
  dispatcher: Dispatcher[F]
)(implicit F: Async[F]): F[StreamOutput[F, Request]] = {
  Ref[F].of(Option.empty[Deferred[F, Unit]]).map { waiting =>
Collaborator

Have you considered Ref[SyncIO] instead to avoid dispatcher?

Contributor Author

@timbertson timbertson Sep 19, 2022

With the SignallingRef refactor I've removed the dispatcher from this code, but it has only moved. We need a dispatcher somewhere in the code path because the goal is to resume a suspended message send in F, which we can't do from SyncIO.
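The shape of the problem can be sketched without cats-effect at all (names are illustrative; a `Promise` stands in for `Deferred[F, Unit]` here): the synchronous onReady callback from grpc-java must complete a handle that a suspended send is waiting on, and completing it from a foreign thread is the part that needs a bridge back into the effect runtime:

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global

object ResumeSketch {
  // A send that is suspended until readiness fires; in the real code this
  // would be an F suspended on a Deferred rather than a Future on a Promise.
  def suspendedSend(ready: Promise[Unit]): Future[String] =
    ready.future.map(_ => "sent")

  // The synchronous grpc-java callback: it can only complete the handle,
  // and something (the dispatcher, in the real code) must resume the waiter.
  def onReady(ready: Promise[Unit]): Unit =
    ready.trySuccess(())
}
```

Until `onReady` fires, the send stays suspended; once it fires, the waiting computation resumes and the message goes out.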

@ahjohannessen
Collaborator

@timbertson It would be interesting to see if your changes affect performance. Might be a good idea to do benchmarks as @naoh87 did here.

@timbertson
Contributor Author

I don't think it should affect the unaryToUnary case, since there are no significant code changes in that path. If grpc-java calls onReady() for a unary call, that should be the only difference.

I did a few runs with GRPC_SERVER_CPUS=3 GRPC_BENCHMARK_DURATION=60s, there seems to be negligible difference:

v2.4.11:

| name | req/s | avg. latency | 90% in | 95% in | 99% in | avg. cpu | avg. memory |
| --- | --- | --- | --- | --- | --- | --- | --- |
| scala_fs2 | 12074 | 61.53 ms | 102.57 ms | 115.93 ms | 192.67 ms | 101.91% | 221.49 MiB |

This branch:

| name | req/s | avg. latency | 90% in | 95% in | 99% in | avg. cpu | avg. memory |
| --- | --- | --- | --- | --- | --- | --- | --- |
| scala_fs2 | 11976 | 62.48 ms | 103.56 ms | 119.62 ms | 199.16 ms | 105.19% | 226.46 MiB |

(this is using Docker for Mac, so I'm not sure how stable the results are)

@ahjohannessen
Collaborator

@timbertson Could you handle this? I am thinking that we could first cut a version with a suffix like -RC1 and let people test it out for a while, WDYT?

@timbertson
Contributor Author

Sounds good to me! 👍

@ahjohannessen
Collaborator

@timbertson Seems like CI is still failing

@timbertson
Contributor Author

Ah yep, sorry. It doesn't seem to run when I push, do you have to approve each run or something? I see "1 workflow awaiting approval"

@ahjohannessen
Collaborator

ahjohannessen commented Sep 20, 2022

> do you have to approve each run or something?

Seems like it. There is a setting to require approval for first-time contributors on the repo.

@timbertson
Contributor Author

No worries. I've also raised a PR in my own repo to get faster CI feedback.

I may need help figuring out what to do with these mima issues though. e.g. I added a constructor argument to Fs2StreamClientCallListener, so it reports:

[error]  * method this(fs2.grpc.client.StreamIngest,cats.effect.std.Dispatcher)Unit in class fs2.grpc.client.Fs2StreamClientCallListener does not have a correspondent in current version

Arguably this class should be package-private (I don't think user code should ever call it), so I'm not sure if we care. Is it even possible to add a second backwards-compatible constructor in Scala?
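For what it's worth, Scala does support this via an auxiliary constructor: the old signature can be kept and delegate to the new primary constructor with a default for the added argument. A minimal sketch (class and parameter names are made up; the real class is Fs2StreamClientCallListener and the real added argument is signalReadiness):

```scala
class ListenerSketch[Response](
  val ingest: List[Response],      // stand-in for the StreamIngest argument
  val signalReadiness: () => Unit  // the newly added argument
) {
  // Auxiliary constructor preserving the old single-argument signature,
  // so existing bytecode calling this(ingest) still links.
  def this(ingest: List[Response]) = this(ingest, () => ())
}
```

Whether this satisfies mima depends on the auxiliary constructor's parameter types exactly matching the old descriptor; for a class that should have been package-private anyway, a mima exclusion is the simpler route.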

@ahjohannessen
Collaborator

ahjohannessen commented Sep 21, 2022

> Arguably this class should be package-private

Make it so and mark it in mima. We can do a major version and bump Scala to 3.2 in the same go.

@timbertson
Contributor Author

I've just marked the two altered classes for now. It might be good to do a sweep over all non-public classes so this doesn't happen again in the future, but I'd prefer to do that in a separate PR if needed; this diff is already plenty big.

@ahjohannessen ahjohannessen merged commit 82f792b into typelevel:main Sep 21, 2022
@timbertson
Contributor Author

Thanks @ahjohannessen 🙂

@ahjohannessen
Collaborator

Great that flow control was finally dealt with :) good job
