Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Returning or throwing from a method always triggers an abort #1117

Closed
lourd opened this issue Jun 26, 2024 · 14 comments
Closed

Returning or throwing from a method always triggers an abort #1117

lourd opened this issue Jun 26, 2024 · 14 comments
Labels
bug Something isn't working

Comments

@lourd
Copy link

lourd commented Jun 26, 2024

Describe the bug

Normally returning or throwing from an RPC method triggers the abort event on the given context's AbortSignal. That's very unexpected. I'm not sure if that's intentional or a bug. Hopefully it's not intended, I don't think it should abort every time no matter what. As it stands now you have to remember to remove every abort listener you may have added to avoid accidental downstream aborting, which is really tedious.

To Reproduce

Environment (please complete the following information):

  • @connectrpc/connect-web version: n/a
  • @connectrpc/connect-node version: 1.4.0
  • Frontend framework and version: n/a
  • Node.js version: 20.12.1
  • Browser and version: n/a
// test_service.proto

syntax = "proto3";

package repro.v1;

message FooRequest {
  string name = 1;
}

message FooResponse {}

service TestService {
  rpc Foo(FooRequest) returns (FooResponse);
}
// main.js
import {
  createPromiseClient,
  createRouterTransport,
} from "@connectrpc/connect";

import { TestService } from "./gen/repro/v1/test_service_connect.js";
import { FooRequest, FooResponse } from "./gen/repro/v1/test_service_pb.js";

const transport = createRouterTransport((router) => {
  router.service(TestService, {
    async foo(req, ctx) {
      ctx.signal.addEventListener("abort", () => {
        console.log("ABORTED");
      });
      if (req.name === "throw") {
        throw new Error("thrown!");
      }
      return new FooResponse();
    },
  });
});

const client = createPromiseClient(TestService, transport);
await client.foo(new FooRequest({ name: "test" }));
try {
  await client.foo(new FooRequest({ name: "throw" }));
} catch {}

This will end up printing ABORTED twice, once for the first request when it returns normally and once for the second when it throws

@lourd lourd added the bug Something isn't working label Jun 26, 2024
@lourd
Copy link
Author

lourd commented Jun 26, 2024

Seeing the call to abort here

} finally {
context.abort();
}
added by @timostamm in #632

@srikrsna-buf
Copy link
Member

Hey! This was intentional, but you can use the reason property on signal to differentiate whether it was a normal abort or not.

@lourd
Copy link
Author

lourd commented Jun 26, 2024

Are you open to changing that? That's really confusing and seems like a big pitfall. The case that caused this error for me isn't even aware that it's in a Connect service, it was a couple layers deeper than that and all the sudden is being called after having migrated to Connect from a REST framework. The reason property is also any typed, so comparing that against anything is more of a wish than a guarantee of correctness.

This behavior also isn't documented anywhere. Thinking about how that would be communicated kind of shows how the inherent issue - "The abort signal will be aborted regardless of what happens in your code. It's up to you to check the reason if it was actually aborted or not". Really begs the question of why the library code calling abort() in the first place couldn't do the same thing and not trigger the abort event.

Reading the MDN docs on reason, I really don't think this is proper use. If you don't give a reason when calling abort(), the reason is automatically set to a DOMException. So we would have to check that the reason isn't a DOM exception? This is the behavior in the browser and Node.js

Screenshot 2024-06-26 at 9 17 48 AM

Screenshot 2024-06-26 at 9 58 58 AM

Can you please change this behavior? It appears that the motivation was simply to clear the setTimeout in the deadlineSignal.

cleanup: () => clearTimeout(timeoutId),

Can we change it back to doing that separately? Perhaps making a cleanup method on context that maps to calling deadline.cleanup, and only call that every time instead of abort?

@srikrsna-buf
Copy link
Member

The internal cleanup is not the reason for doing this. The signal can be passed to other services and is guaranteed to abort/complete when the rpc is complete (with or without an error). This makes it convenient to pass this signal around to other services. This is particularly useful in fanout scenarios.

Changing this now will certainly be a breaking change, as it changes the behavior. So we can revisit this in v2 (being worked on now). For now, I'd suggest relying on the using keyword added in TS 5.2 (tc39 Stage 3):

const transport = createRouterTransport((router) => {
  router.service(TestService, {
    async foo(req, ctx) {
      using _ = onAbort(signal, () => {
        console.log("ABORTED");
      });    
      if (req.name === "throw") {
        throw new Error("thrown!");
      }
      return new FooResponse();
    },
  });
});

function onAbort(signal: AbortSignal, onAbort: NonNullable<AbortSignal["onabort"]>): Disposable {
  signal.addEventListener("abort", onAbort)
  return {
    [Symbol.dispose]: () =>
      signal.removeEventListener("abort", onAbort)
  }
}

@lourd
Copy link
Author

lourd commented Jun 27, 2024

Thanks for the thorough response @srikrsna-buf, I appreciate it. That's a good point about the fanout case, having a completion signal that can be used to cancel in process work when one of the workers has finished responding.

Sounds like there's two different cases this one signal is supporting then - aborting and completing. I think it'd be a good idea to separate those into two separate signals. Like I've said, having the abort signal be responsible for both abortion and completion is a pitfall for any code that only expects to be called when things are actually aborted.

The fan-out case can also be enabled by the user making their own abort controller/completion signal and calling that in their own finally, and linking that to the given abort signal. I imagine that's what most people would do these days since there's no documentation about the given abort signal also being called on completion every time. I think that'd be less complicated than example code you've given about using new TS/JS features to remember to remove an abort event listener.

@timostamm
Copy link
Member

Louis, you're right that HandlerContext.signal takes on two different concerns with cancellation and completion.

Now that AbortSignal.any() has been added to the standard, it has become a bit more intuitive to link signals, and we're going to revisit the behavior for v2.

@lourd
Copy link
Author

lourd commented Sep 9, 2024

Sounds good, thanks for letting me know @timostamm. What's the plan for v2? Are changes going into main right now on a v1 track or v2?

@timostamm
Copy link
Member

V2 is mainly for compatibility with protobuf-es v2, but we can make other sensible breaking changes. It lives in the branch v2, and is available as a pre-release.

Once v2 ships, main will be renamed to v1, and v2 will be become main.

@timostamm
Copy link
Member

To split up the concerns, we could add a separate signal to the handler for completion, and/or a separate combined signal (basically renaming the current HandlerContext.signal).

For example, with a separate done signal:

router.service(ElizaService, {
  async *introduce(req: IntroduceRequest, ctx: HandlerContext) {
    const cancelledOrHandlerDone = AbortSignal.any([ctx.signal, ctx.done]);
    longRunning(cancelledOrHandlerDone);
    yield { sentence: `Hi ${req.name}, I'm eliza` };
  },
});

It would also be possible to add a promise, for example settled:

router.service(ElizaService, {
  async *introduce(req: IntroduceRequest, ctx: HandlerContext) {
    const controller = new AbortController();
    void ctx.settled.then(() => controller.abort(), (reason) => controller.abort(reason));
    longRunning(controller.signal);
    yield { sentence: `Hi ${req.name}, I'm eliza` };
  },
});

But IMO it's most readable to let users use try...finally with their own AbortController:

router.service(ElizaService, {
  async *introduce(req: IntroduceRequest, ctx: HandlerContext) {
    const done = new AbortController();
    try {
      const cancelledOrHandlerDone = AbortSignal.any([ctx.signal, done.signal]);
      longRunning(cancelledOrHandlerDone);
      yield { sentence: `Hi ${req.name}, I'm eliza` };
    } finally {
      done.abort();
    }
  },
});

@srikrsna-buf, what do you think?

@srikrsna-buf
Copy link
Member

I like the third option, changing the signal to only abort on an error and letting the user deal with completion looks readable to me too.

@timostamm
Copy link
Member

Re-visiting this for #1253:

My concern is that without a mechanism for detecting doneness in all cases, resource cleanup will be very hard for developers, and they'd have to make their own try/catch wrappers...

A valid concern. @lourd, you originally wrote:

As it stands now you have to remember to remove every abort listener you may have added to avoid accidental downstream aborting, which is really tedious.

Can you clarify in which situation you wouldn't want to abort a downstream operation if the RPC is complete?

@lourd
Copy link
Author

lourd commented Oct 17, 2024

My use case is that the RPC starts up a heavy weight long-running process on the machine. While the process is starting, if the request is aborted, the process should be killed. But once the process has started and the response is sent, it should not be aborted when the RPC completes.

FWIW it was easy enough to implement that behavior, unsubscribing my abort listener before returning, once I understood that ctx.signal will always be aborted.

If you make the call to only have one signal I would request that the name and documentation reflects that behavior clearly, and there is still some way to differentiate whether a request was aborted or simply completed. Right now I'm able to do that in my logger interceptor by checking ctx.signal.aborted in the catch/finally blocks of calling next(request), before returning. I'm assuming that would still be possible in v2?

@lourd
Copy link
Author

lourd commented Oct 17, 2024

I don't understand the assertion in #1253

AFAICT, this implies that at present the server has no universal built-in way to detect when an RPC completes, regardless of reason. Thus resource cleanup cannot occur.

It seems to me like using a finally block like you posted in your earlier possible approaches is a universal built-in way to detect when the RPC completes?

But... I suppose the case that it doesn't is interceptors responding early? grpc/grpc-node#2771 (comment)

can also happen if a server interceptor ends a call on its own

@timostamm
Copy link
Member

Thanks for the input. I wish there was a variant of AbortSignal that isn't tied to cancellation. We're reverting to include completion in the signal in #1282.

You can still exclude completion:

async say(req: SayRequest, ctx) {
  const handle = () => {
    const err = ConnectError.from(ctx.signal.reason);
    // DeadLineExceeded if the deadline passed.
    // Canceled if the client closed the stream (H2).
    err.code; 
  };
  ctx.signal.addEventListener("abort", handle);
  try {
    // ...
    return {
      sentence: "hi",
    };
  } finally {
    ctx.signal.removeEventListener("abort", handle);
  }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants