Skip to content

grpc: trailing metadata can't be passed when aborting an AIO server RPC #2065

@cookiefission

Description

@cookiefission

Describe your environment

Python: 3.12.0
opentelemetry-distro[otlp]==0.42b0
opentelemetry-instrumentation-grpc==0.42b0
grpcio==1.59.2

Steps to reproduce
Full replication can be found here.

server.py:

import asyncio
import logging

import grpc
import helloworld_pb2
import helloworld_pb2_grpc


class Greeter(helloworld_pb2_grpc.GreeterServicer):
    async def SayHello(
        self,
        request: helloworld_pb2.HelloRequest,
        context: grpc.aio.ServicerContext,
    ) -> helloworld_pb2.HelloReply:
        metadata = (
            ("this", "should"),
            ("work", "nicely")
        )
        await context.abort(code=grpc.StatusCode.ABORTED, details="This is the actual error message", trailing_metadata=metadata)


async def serve() -> None:
    server = grpc.aio.server()
    helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
    listen_addr = "[::]:50051"
    server.add_insecure_port(listen_addr)
    logging.info("Starting server on %s", listen_addr)
    await server.start()
    await server.wait_for_termination()


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    asyncio.run(serve())

This server is just adopted from the grpc examples. The helloworld protobuf artefacts can be found there.

For a client, the async_greeter_client.py from grpc examples can be used.

This only happens when trailing_metadata is passed as an argument to context.abort.

What is the expected behavior?

Running the server without opentelemetry, the (expected) error from running the client is:

grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
        status = StatusCode.ABORTED
        details = "This is the actual error message"
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"This is the actual error message", grpc_status:10, created_time:"2023-11-16T15:09:03.552303+00:00"}"
>

When running the server under opentelemetry-instrument, I expect the same output as when it is run without OTEL.

What is the actual behavior?

Instead, an exception is still raised but the underlying code, details, and trailing_metadata are missing. This makes any nuanced error handling on the client side impossible.

grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
        status = StatusCode.UNKNOWN
        details = "Unexpected <class 'TypeError'>: _OpenTelemetryServicerContext.abort() got an unexpected keyword argument 'trailing_metadata'"
        debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2023-11-16T15:30:02.017999+00:00", grpc_status:2, grpc_message:"Unexpected <class \'TypeError\'>: _OpenTelemetryServicerContext.abort() got an unexpected keyword argument \'trailing_metadata\'"}"
>

Additional context

The abort method for grpc.ServicerContext and grpc.aio.ServicerContext is different. It's unclear why. The AIO instrumentation re-uses the _OpenTelemetryServicerContext from the non-async server instrumentation which is how this bug snuck in, despite there being a test in place for the abort.

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions