Returning or throwing from a method always triggers an abort #1117

lourd · 2024-06-26T06:05:05Z

Describe the bug

Normally returning or throwing from an RPC method triggers the abort event on the given context's AbortSignal. That's very unexpected. I'm not sure if that's intentional or a bug. Hopefully it's not intended, I don't think it should abort every time no matter what. As it stands now you have to remember to remove every abort listener you may have added to avoid accidental downstream aborting, which is really tedious.

To Reproduce

Environment (please complete the following information):

@connectrpc/connect-web version: n/a
@connectrpc/connect-node version: 1.4.0
Frontend framework and version: n/a
Node.js version: 20.12.1
Browser and version: n/a

// test_service.proto

syntax = "proto3";

package repro.v1;

message FooRequest {
  string name = 1;
}

message FooResponse {}

service TestService {
  rpc Foo(FooRequest) returns (FooResponse);
}

// main.js
import {
  createPromiseClient,
  createRouterTransport,
} from "@connectrpc/connect";

import { TestService } from "./gen/repro/v1/test_service_connect.js";
import { FooRequest, FooResponse } from "./gen/repro/v1/test_service_pb.js";

const transport = createRouterTransport((router) => {
  router.service(TestService, {
    async foo(req, ctx) {
      ctx.signal.addEventListener("abort", () => {
        console.log("ABORTED");
      });
      if (req.name === "throw") {
        throw new Error("thrown!");
      }
      return new FooResponse();
    },
  });
});

const client = createPromiseClient(TestService, transport);
await client.foo(new FooRequest({ name: "test" }));
try {
  await client.foo(new FooRequest({ name: "throw" }));
} catch {}

This will end up printing ABORTED twice, once for the first request when it returns normally and once for the second when it throws

The text was updated successfully, but these errors were encountered:

lourd · 2024-06-26T07:23:30Z

Seeing the call to abort here

connect-es/packages/connect/src/protocol-connect/handler-factory.ts

Lines 271 to 273 in 1d71dc9

    
           } finally { 
        
             context.abort(); 
        
           }

added by @timostamm in #632

srikrsna-buf · 2024-06-26T11:49:58Z

Hey! This was intentional, but you can use the reason property on signal to differentiate whether it was a normal abort or not.

lourd · 2024-06-26T13:27:26Z

Are you open to changing that? That's really confusing and seems like a big pitfall. The case that caused this error for me isn't even aware that it's in a Connect service, it was a couple layers deeper than that and all the sudden is being called after having migrated to Connect from a REST framework. The reason property is also any typed, so comparing that against anything is more of a wish than a guarantee of correctness.

This behavior also isn't documented anywhere. Thinking about how that would be communicated kind of shows how the inherent issue - "The abort signal will be aborted regardless of what happens in your code. It's up to you to check the reason if it was actually aborted or not". Really begs the question of why the library code calling abort() in the first place couldn't do the same thing and not trigger the abort event.

Reading the MDN docs on reason, I really don't think this is proper use. If you don't give a reason when calling abort(), the reason is automatically set to a DOMException. So we would have to check that the reason isn't a DOM exception? This is the behavior in the browser and Node.js

Can you please change this behavior? It appears that the motivation was simply to clear the setTimeout in the deadlineSignal.

connect-es/packages/connect/src/protocol/signals.ts

Line 85 in 1d71dc9

cleanup: () => clearTimeout(timeoutId),

Can we change it back to doing that separately? Perhaps making a cleanup method on context that maps to calling deadline.cleanup, and only call that every time instead of abort?

srikrsna-buf · 2024-06-27T16:27:17Z

The internal cleanup is not the reason for doing this. The signal can be passed to other services and is guaranteed to abort/complete when the rpc is complete (with or without an error). This makes it convenient to pass this signal around to other services. This is particularly useful in fanout scenarios.

Changing this now will certainly be a breaking change, as it changes the behavior. So we can revisit this in v2 (being worked on now). For now, I'd suggest relying on the using keyword added in TS 5.2 (tc39 Stage 3):

const transport = createRouterTransport((router) => {
  router.service(TestService, {
    async foo(req, ctx) {
      using _ = onAbort(signal, () => {
        console.log("ABORTED");
      });    
      if (req.name === "throw") {
        throw new Error("thrown!");
      }
      return new FooResponse();
    },
  });
});

function onAbort(signal: AbortSignal, onAbort: NonNullable<AbortSignal["onabort"]>): Disposable {
  signal.addEventListener("abort", onAbort)
  return {
    [Symbol.dispose]: () =>
      signal.removeEventListener("abort", onAbort)
  }
}

lourd · 2024-06-27T17:04:01Z

Thanks for the thorough response @srikrsna-buf, I appreciate it. That's a good point about the fanout case, having a completion signal that can be used to cancel in process work when one of the workers has finished responding.

Sounds like there's two different cases this one signal is supporting then - aborting and completing. I think it'd be a good idea to separate those into two separate signals. Like I've said, having the abort signal be responsible for both abortion and completion is a pitfall for any code that only expects to be called when things are actually aborted.

The fan-out case can also be enabled by the user making their own abort controller/completion signal and calling that in their own finally, and linking that to the given abort signal. I imagine that's what most people would do these days since there's no documentation about the given abort signal also being called on completion every time. I think that'd be less complicated than example code you've given about using new TS/JS features to remember to remove an abort event listener.

timostamm · 2024-09-09T12:43:04Z

Louis, you're right that HandlerContext.signal takes on two different concerns with cancellation and completion.

Now that AbortSignal.any() has been added to the standard, it has become a bit more intuitive to link signals, and we're going to revisit the behavior for v2.

lourd · 2024-09-09T13:54:27Z

Sounds good, thanks for letting me know @timostamm. What's the plan for v2? Are changes going into main right now on a v1 track or v2?

timostamm · 2024-09-09T15:58:16Z

V2 is mainly for compatibility with protobuf-es v2, but we can make other sensible breaking changes. It lives in the branch v2, and is available as a pre-release.

Once v2 ships, main will be renamed to v1, and v2 will be become main.

timostamm · 2024-09-16T09:12:10Z

To split up the concerns, we could add a separate signal to the handler for completion, and/or a separate combined signal (basically renaming the current HandlerContext.signal).

For example, with a separate done signal:

router.service(ElizaService, {
  async *introduce(req: IntroduceRequest, ctx: HandlerContext) {
    const cancelledOrHandlerDone = AbortSignal.any([ctx.signal, ctx.done]);
    longRunning(cancelledOrHandlerDone);
    yield { sentence: `Hi ${req.name}, I'm eliza` };
  },
});

It would also be possible to add a promise, for example settled:

router.service(ElizaService, {
  async *introduce(req: IntroduceRequest, ctx: HandlerContext) {
    const controller = new AbortController();
    void ctx.settled.then(() => controller.abort(), (reason) => controller.abort(reason));
    longRunning(controller.signal);
    yield { sentence: `Hi ${req.name}, I'm eliza` };
  },
});

But IMO it's most readable to let users use try...finally with their own AbortController:

router.service(ElizaService, {
  async *introduce(req: IntroduceRequest, ctx: HandlerContext) {
    const done = new AbortController();
    try {
      const cancelledOrHandlerDone = AbortSignal.any([ctx.signal, done.signal]);
      longRunning(cancelledOrHandlerDone);
      yield { sentence: `Hi ${req.name}, I'm eliza` };
    } finally {
      done.abort();
    }
  },
});

@srikrsna-buf, what do you think?

srikrsna-buf · 2024-09-16T09:57:59Z

I like the third option, changing the signal to only abort on an error and letting the user deal with completion looks readable to me too.

timostamm · 2024-10-17T12:36:57Z

Re-visiting this for #1253:

My concern is that without a mechanism for detecting doneness in all cases, resource cleanup will be very hard for developers, and they'd have to make their own try/catch wrappers...

A valid concern. @lourd, you originally wrote:

As it stands now you have to remember to remove every abort listener you may have added to avoid accidental downstream aborting, which is really tedious.

Can you clarify in which situation you wouldn't want to abort a downstream operation if the RPC is complete?

lourd · 2024-10-17T15:35:40Z

My use case is that the RPC starts up a heavy weight long-running process on the machine. While the process is starting, if the request is aborted, the process should be killed. But once the process has started and the response is sent, it should not be aborted when the RPC completes.

FWIW it was easy enough to implement that behavior, unsubscribing my abort listener before returning, once I understood that ctx.signal will always be aborted.

If you make the call to only have one signal I would request that the name and documentation reflects that behavior clearly, and there is still some way to differentiate whether a request was aborted or simply completed. Right now I'm able to do that in my logger interceptor by checking ctx.signal.aborted in the catch/finally blocks of calling next(request), before returning. I'm assuming that would still be possible in v2?

lourd · 2024-10-17T15:47:44Z

I don't understand the assertion in #1253

AFAICT, this implies that at present the server has no universal built-in way to detect when an RPC completes, regardless of reason. Thus resource cleanup cannot occur.

It seems to me like using a finally block like you posted in your earlier possible approaches is a universal built-in way to detect when the RPC completes?

But... I suppose the case that it doesn't is interceptors responding early? grpc/grpc-node#2771 (comment)

can also happen if a server interceptor ends a call on its own

timostamm · 2024-10-18T11:44:40Z

Thanks for the input. I wish there was a variant of AbortSignal that isn't tied to cancellation. We're reverting to include completion in the signal in #1282.

You can still exclude completion:

async say(req: SayRequest, ctx) {
  const handle = () => {
    const err = ConnectError.from(ctx.signal.reason);
    // DeadLineExceeded if the deadline passed.
    // Canceled if the client closed the stream (H2).
    err.code; 
  };
  ctx.signal.addEventListener("abort", handle);
  try {
    // ...
    return {
      sentence: "hi",
    };
  } finally {
    ctx.signal.removeEventListener("abort", handle);
  }
}

lourd added the bug Something isn't working label Jun 26, 2024

lourd mentioned this issue Jun 26, 2024

Call cleanup in the finally of each method response instead of abort #1118

Closed

srikrsna-buf mentioned this issue Sep 16, 2024

V2: Don't trigger handler signal on success #1234

Merged

davidfiala mentioned this issue Sep 30, 2024

V2: Ensure that a signal exists for a completed RPC (regardless of error or success) #1253

Closed

timostamm mentioned this issue Oct 18, 2024

V2: Ensure that a signal exists for a completed RPC #1282

Merged

timostamm closed this as completed Oct 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Returning or throwing from a method always triggers an abort #1117

Returning or throwing from a method always triggers an abort #1117

lourd commented Jun 26, 2024

lourd commented Jun 26, 2024

srikrsna-buf commented Jun 26, 2024

lourd commented Jun 26, 2024 •

edited

Loading

srikrsna-buf commented Jun 27, 2024

lourd commented Jun 27, 2024

timostamm commented Sep 9, 2024

lourd commented Sep 9, 2024

timostamm commented Sep 9, 2024

timostamm commented Sep 16, 2024

srikrsna-buf commented Sep 16, 2024

timostamm commented Oct 17, 2024

lourd commented Oct 17, 2024

lourd commented Oct 17, 2024

timostamm commented Oct 18, 2024

Returning or throwing from a method always triggers an abort #1117

Returning or throwing from a method always triggers an abort #1117

Comments

lourd commented Jun 26, 2024

lourd commented Jun 26, 2024

srikrsna-buf commented Jun 26, 2024

lourd commented Jun 26, 2024 • edited Loading

srikrsna-buf commented Jun 27, 2024

lourd commented Jun 27, 2024

timostamm commented Sep 9, 2024

lourd commented Sep 9, 2024

timostamm commented Sep 9, 2024

timostamm commented Sep 16, 2024

srikrsna-buf commented Sep 16, 2024

timostamm commented Oct 17, 2024

lourd commented Oct 17, 2024

lourd commented Oct 17, 2024

timostamm commented Oct 18, 2024

lourd commented Jun 26, 2024 •

edited

Loading