Support async/await syntax #62

Closed
reyang opened this issue Jul 25, 2019 · 46 comments

Comments

@reyang
Member

reyang commented Jul 25, 2019

Python has supported the async/await syntax since version 3.5 (PEP 492), and async and await became reserved keywords in version 3.7.
We want to explore how to support this in OpenTelemetry Python.
Things to be explored are:

  1. Do we provide this in the API package or API+SDK?
  2. How do we test? If test cases are written using the async/await syntax, they won't work on Python versions that predate it, so we might need to collect test cases based on the runtime version.
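
One way to collect test cases based on the runtime version is a conftest.py that leaves async-only test modules out of collection on interpreters that cannot parse them. This is a minimal sketch that assumes pytest is the test runner; the file name test_async_api.py and the version threshold are hypothetical placeholders, not something decided in this issue.

```python
# conftest.py
import sys

# Assumption: the oldest Python version on which the async test modules parse.
MIN_ASYNC_VERSION = (3, 5)

# pytest reads `collect_ignore` from conftest.py and skips the listed paths
# during collection, so those files are never imported on older interpreters.
collect_ignore = []
if sys.version_info < MIN_ASYNC_VERSION:
    collect_ignore.append("test_async_api.py")  # hypothetical file name
```

Keeping the async tests in separate modules matters here, because a SyntaxError is raised at import time and cannot be skipped with a marker inside the file.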
@c24t
Member

c24t commented Jul 25, 2019

Do we provide this in the API package or API+SDK?

If we can solve the <3.6 problem I think it'd be great to include both sync and async versions of some methods.

What do we lose by using asyncio.coroutine instead of async/await?

@reyang
Member Author

reyang commented Jul 26, 2019

I haven't had time to explore this, so I don't know the answer at the moment. I'm just listing these questions so I won't forget about them :)

From memory, __aenter__ and __aexit__ have been supported since Python 3.5, and it wouldn't do any harm (unless we want to enforce 100% code coverage) to have them for Python 3.4.
In OpenCensus we do have a minor issue when we lint examples: the lint engine runs under Python 3.6, so if we use async/await syntax we get a SyntaxError. Here in OpenTelemetry I don't think we'll have that problem.
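
To illustrate the __aenter__/__aexit__ point, here is a minimal sketch of a class that supports both `with` and `async with` by delegating the async protocol to the sync one. DualContext is a hypothetical helper written for this example; it is not part of the OpenTelemetry API.

```python
import asyncio


class DualContext:
    """Hypothetical helper usable with both `with` and `async with`."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.events.append("exit")
        return False  # do not swallow exceptions

    async def __aenter__(self):
        # Nothing to await here, so delegate to the sync implementation.
        return self.__enter__()

    async def __aexit__(self, exc_type, exc, tb):
        return self.__exit__(exc_type, exc, tb)


async def main():
    ctx = DualContext()
    async with ctx:
        ctx.events.append("body")
    return ctx.events


events = asyncio.run(main())
```

The async methods only add value if entering or exiting actually needs to await something; otherwise the sync protocol already works inside async functions.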

@codeboten
Contributor

Closing this for now, re-open if it becomes an issue in the future.

@codeboten
Contributor

Currently we only support Python 3.6+, so the concerns around breaking tests should be resolved. @thedrow re-opening, would love some help implementing this if you have some bandwidth.

@trollkarlen

Any news on this topic?

@owais
Contributor

owais commented Oct 14, 2021

Dumb question: What does it mean to support async/await exactly? Are we talking about instrumentations working with async/await functions (e.g., tornado request handlers), context propagation working with async/await (if it doesn't already), or do we expect async/await versions of some API methods, and if so, why?

@ocelotl
Contributor

ocelotl commented Oct 14, 2021

Dumb question: What does it mean to support async/await exactly? Are we talking about instrumentations working with async/await functions (e.g., tornado request handlers), context propagation working with async/await (if it doesn't already), or do we expect async/await versions of some API methods, and if so, why?

I have the exact same question as well. I find it a bit weird that we are to add anything to our implementation without it being in the spec first. Is it there already?

I can understand that the idea is for an implementation to do some preliminary research of a certain feature, but if that is the case it would be very helpful to have also a preliminary specification (it can be drafted in this issue, I don't mean it to be in the spec repo) so that we have an agreement and understanding on what is being attempted here.

@owais
Contributor

owais commented Oct 17, 2021

OK. That clarifies it a bit more. So this does not apply to the API specifically. What we want are async versions of BatchSpanProcessor and probably core exporters. Is that right?

@owais
Contributor

owais commented Oct 18, 2021

It would be great if someone interested in this could take it up and produce a design doc or proposal with technical details, describing the possible directions we can take and weighing their pros and cons, so we can move forward. Ideally we could support both sync and async use cases with the same methods, but I'm not sure how easily that can be done, or whether it's possible at all.

@Alan-R

Alan-R commented Oct 25, 2021

Another aspect of this: Each asyncio task needs to have its own root context span. Currently when you create a span, it assumes that if there's a current span active, that this should be the span to have as the parent span. In asyncio, this is nearly always wrong.

@aabmass
Member

aabmass commented Jan 26, 2022

OK. That clarifies it a bit more. So this does not apply to the API specifically. What we want are async versions of BatchSpanProcessor and probably core exporters. Is that right?

I've been thinking about this a bit for metrics as well. I think we would want to be able to create Observable instruments with async callbacks too, which requires some API change. This could probably be done by only updating the Meter API's type annotations to accept async or regular forms, and then updating the SDK to handle these.

Currently when you create a span, it assumes that if there's a current span active, that this should be the span to have as the parent span. In asyncio, this is nearly always wrong.

@Alan-R can you explain this a bit more? I thought our context implementation which uses contextvars should work correctly for asyncio in most cases.
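
To make the contextvars behaviour concrete, here is a small standard-library sketch. The `current_span` variable is a stand-in for the slot a contextvars-based context implementation would use; it is an assumption for illustration and not the actual OpenTelemetry internals.

```python
import asyncio
import contextvars

# Stand-in for the "current span" slot of a contextvars-based context.
current_span = contextvars.ContextVar("current_span", default=None)


async def child(name):
    parent = current_span.get()  # value inherited when the task was created
    current_span.set(name)       # only changes this task's copy of the context
    await asyncio.sleep(0)
    return parent, current_span.get()


async def main():
    current_span.set("request")
    # gather() wraps each coroutine in a task, and each task runs in a copy
    # of the context that was current when the task was created.
    results = await asyncio.gather(child("a"), child("b"))
    return results, current_span.get()


results, after = asyncio.run(main())
```

Each task sees "request" as its parent and its own value afterwards, and the creating coroutine still sees "request". Whether inheriting the creator's current span is the desired parent for a new task is the design question raised earlier in the thread.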

@Alan-R

Alan-R commented Apr 10, 2022 via email

@gen-xu
Contributor

gen-xu commented May 22, 2022

Any news about this?
It would also be nice to have an async tracer so we can do something like this:

async with tracer.start_span_async("span"):
   pass

@TBBle
Contributor

TBBle commented Jun 8, 2022

@gen-xu What do you have in mind for start_span_async to actually do? I don't think start_span does anything that would need to have an async version (although I guess a SpanProcessor could change that, but an AsyncSpanProcessor would be needed to address that, with much wider repercussions).

@gen-xu
Contributor

gen-xu commented Jun 9, 2022

@TBBle yes, if we have AsyncSpanProcessor then we might need start_span_async, but I see a need for start_span_async in some other places too:

opentelemetry.sdk.trace.Span has a _lock attribute, but in the asyncio world you don't really need a threading.Lock. An asyncio.Lock is also much more lightweight and faster than a threading.Lock.

The current implementation of tracer.start_as_current_span is a somewhat expensive blocking call, and frequently starting spans introduces overhead that hurts the performance of the asyncio event loop. The __enter__ and __exit__ of the Span on their own already take ~250us on my machine.

@gshpychka

The __enter__ and __exit__ of the Span on their own already take ~250us on my machine.

Thanks for this number. Can you provide a bit more info about the machine and setup? The benchmark tests suggest it should be 50us at most (20k iterations/sec). Or maybe I misunderstood how those benchmarks are supposed to be interpreted.

@TBBle
Contributor

TBBle commented Jun 10, 2022

The things protected by Span._lock (AFAICT) would not make sense to protect with an asyncio.Lock because there's nothing that would await, so there's no contention possible. So what you're actually looking for there is a "SingleThreadedSpan" (or a single_threaded parameter which could just make self._lock be a no-op context manager), which would not need async with.
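
The no-op lock idea can be sketched with contextlib.nullcontext from the standard library. SketchSpan and its single_threaded parameter are hypothetical names used only for illustration; this is not the SDK's Span class.

```python
import contextlib
import threading


class SketchSpan:
    """Hypothetical span-like class; not the SDK's Span."""

    def __init__(self, single_threaded=False):
        # nullcontext() is a reusable no-op context manager, so
        # `with self._lock:` keeps working without acquiring anything.
        if single_threaded:
            self._lock = contextlib.nullcontext()
        else:
            self._lock = threading.Lock()
        self._attributes = {}

    def set_attribute(self, key, value):
        with self._lock:
            self._attributes[key] = value


span = SketchSpan(single_threaded=True)
span.set_attribute("a", 1)
span.set_attribute("b", 2)
```

The calling code is identical in both modes, which is why no `async with` variant is needed for this case.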

@aabmass
Member

aabmass commented Jun 10, 2022

The current implementation of tracer.start_as_current_span is a somewhat expensive blocking call, and frequently starting spans introduces overhead that hurts the performance of the asyncio event loop. The __enter__ and __exit__ of the Span on their own already take ~250us on my machine.

If you're using BatchSpanProcessor and one of the built-in samplers, this should be CPU-bound work plus enqueuing the message in a queue.Queue. Unless you have a lot of threads concurrently creating spans, I don't think a non-blocking await would help here, as the work is CPU bound.

On my machine:

$ python3 -m timeit -s "from threading import Lock; l = Lock()" "with l: pass"
2000000 loops, best of 5: 163 nsec per loop

Acquiring the locks shouldn't be too expensive if there isn't lock contention. Since you mention asyncio, I'm assuming you don't have many threads.

I'm trying to think of what async work might be done if we introduced a start_span_async(). One use case I can think of is remote sampling.

@gen-xu
Contributor

gen-xu commented Jun 12, 2022

@gshpychka Thanks for the link, I didn't know there was a benchmark page available. The numbers I gave were not accurate: they were based on the viztracer library, and I think what I measured was mostly viztracer's own overhead. I re-measured with plain time() and got about 50 to 60 us. Sorry for the false alarm.

@TBBle
Contributor

TBBle commented Jun 18, 2022

I'm not sure if it should be a separate ticket, but working through how to deliver observable instruments using an async coroutine might be interesting.

I haven't really done this thought experiment myself, so I don't know if it's even feasible. I only raise it because a colleague was looking into adding an observable gauge to our asyncio-based code and had to use asyncio.run_coroutine_threadsafe to block the exporter thread so it could call back into our event loop. So perhaps just supporting that as a third callback type (where we already have functions and generators) is the best we can do?

Looking at the anyio equivalent, it's possible that there isn't a good solution, as the callback needs too much event loop info or pre-work, so doing what my colleague has currently done in user code may be our best option.

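    import asyncio

    from opentelemetry.metrics import Observation

    def message_count_callback():
        # Runs on the exporter thread; `loop` is the application's event loop.
        coro = message_source.consumer_count()
        future = asyncio.run_coroutine_threadsafe(coro, loop)
        message_count = future.result()
        yield Observation(message_count)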
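
For readers who want to try the pattern without OpenTelemetry installed, here is a self-contained sketch of the same technique. The `consumer_count` coroutine and the worker thread are stand-ins (assumptions) for the application's async query and the exporter thread.

```python
import asyncio
import threading


async def consumer_count():
    # Stand-in for an async query against the application's own resources.
    await asyncio.sleep(0)
    return 3


def message_count_callback(loop):
    # Called from a thread that is not running the event loop: schedule the
    # coroutine on `loop` and block this thread until the result is ready.
    future = asyncio.run_coroutine_threadsafe(consumer_count(), loop)
    return future.result(timeout=5)


async def main():
    loop = asyncio.get_running_loop()
    result = {}
    worker = threading.Thread(
        target=lambda: result.update(count=message_count_callback(loop))
    )
    worker.start()
    # Wait for the thread without blocking the event loop, so the loop
    # stays free to run the scheduled coroutine.
    await loop.run_in_executor(None, worker.join)
    return result["count"]


count = asyncio.run(main())
```

Calling `future.result()` from the event loop's own thread would deadlock, so the callback must only run on a separate thread.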

@aabmass
Member

aabmass commented Jun 21, 2022

and had to use asyncio.run_coroutine_threadsafe to block the exporter thread to be able to call back into our event loop, so perhaps just supporting that as a third callback type (where we already have functions and generator) is the best we can do?

This is basically what I had in mind. We can accept an optional event loop in the MeterProvider (if not provided create a background thread with a loop running for all of OpenTelemetry to use) and run any coroutine callbacks in that loop. The benefit of doing it in the SDK vs user code is that we can run the async callbacks for all instruments in parallel.
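
A rough sketch of that idea, using only the standard library: one event loop runs on a background thread, and all coroutine callbacks are gathered on it concurrently. BackgroundLoop, collect, and the two sample callbacks are hypothetical names for illustration; nothing here is an existing SDK API.

```python
import asyncio
import threading


class BackgroundLoop:
    """Hypothetical helper: one event loop running on a daemon thread."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self._thread.start()

    def collect(self, callbacks, timeout=5):
        async def run_all():
            # gather() runs the coroutine callbacks concurrently on the loop.
            return await asyncio.gather(*(cb() for cb in callbacks))

        future = asyncio.run_coroutine_threadsafe(run_all(), self.loop)
        return future.result(timeout)

    def close(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()


async def queue_depth():
    await asyncio.sleep(0.01)
    return 7


async def open_connections():
    await asyncio.sleep(0.01)
    return 2


background = BackgroundLoop()
values = background.collect([queue_depth, open_connections])
background.close()
```

A user-supplied loop could replace the one created in `__init__`, in which case the helper should not start or stop it.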

Looking at the anyio equivalent, it's possible that there isn't a good solution, as the callback needs too much event loop info or pre-work, so doing what my colleague has currently done in user code may be our best option.

I'm not super familiar with anyio. How important is it to everyone that we use anyio instead of just using asyncio here? We are usually pretty conservative about taking new dependencies.

@TBBle
Contributor

TBBle commented Jun 22, 2022

At least for anyio, I thought it was already being used in the SDK for something, but I just looked and it's not; I must have been thinking of a different project, sorry. I see we're directly using asyncio in the tests and docs, but the SDK code itself has no async at all.

I will note that it's been a pretty-common sight in other libraries I've used that have to deal with async but aren't asyncio-specific, but I haven't really looked into it, or had to deal with it directly myself. I have no idea how widespread non-asyncio event loops, e.g. trio, are in practice, so I can't really comment on the cost/value tradeoff of using it.

@Corfucinas

Is it possible to add to the documentation that @tracer.start_as_current_span('my_span') works only for sync functions? Since it doesn't emit an error, it's a bit confusing when a span doesn't show up.
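
The silent failure can be reproduced with the standard library alone. This sketch assumes the decorator behaves like a plain contextlib.contextmanager; `fake_span` is a stand-in for tracer.start_as_current_span and only records when the "span" is open.

```python
import asyncio
import contextlib

events = []


@contextlib.contextmanager
def fake_span(name):
    # Stand-in for tracer.start_as_current_span.
    events.append("start " + name)
    try:
        yield
    finally:
        events.append("end " + name)


@fake_span("my_span")  # wraps only the call that creates the coroutine
async def handler():
    events.append("body")


asyncio.run(handler())
```

Calling `handler()` only creates the coroutine object, so the context manager has already exited by the time the body runs under `asyncio.run`. That is why the span ends before any of the awaited work happens and no error is raised.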

@gen-xu
Contributor

gen-xu commented Oct 5, 2022

Is it possible to add to the documentation that @tracer.start_as_current_span('my_span') works only for sync functions? Since it doesn't emit an error, it's a bit confusing when a span doesn't show up.

+1

Also I am using this snippet myself to work around that, might be helpful to others

from contextlib import asynccontextmanager

@asynccontextmanager
async def start_as_current_span_async(tracer, *args, **kwargs):
    with tracer.start_as_current_span(*args, **kwargs) as span:
        yield span

and so it can be used as

@start_as_current_span_async(tracer, "my_span")
async def some_async_func():
  ...

@Corfucinas

Is it possible to add to the documentation that @tracer.start_as_current_span('my_span') works only for sync functions? Since it doesn't emit an error, it's a bit confusing when a span doesn't show up.

+1

Also I am using this snippet myself to work around that, might be helpful to others

from contextlib import asynccontextmanager

@asynccontextmanager
async def start_as_current_span_async(tracer, *args, **kwargs):
    with tracer.start_as_current_span(*args, **kwargs) as span:
        yield span

and so it can be used as

@start_as_current_span_async(tracer, "my_span")
async def some_async_func():
  ...

By looking at it this should work as-is for a wrapper for any function (sync/async), right?

@gen-xu
Contributor

gen-xu commented Oct 5, 2022

Is it possible to add to the documentation that @tracer.start_as_current_span('my_span') works only for sync functions? Since it doesn't emit an error, it's a bit confusing when a span doesn't show up.

+1
Also I am using this snippet myself to work around that, might be helpful to others

from contextlib import asynccontextmanager

@asynccontextmanager
async def start_as_current_span_async(tracer, *args, **kwargs):
    with tracer.start_as_current_span(*args, **kwargs) as span:
        yield span

and so it can be used as

@start_as_current_span_async(tracer, "my_span")
async def some_async_func():
  ...

By looking at it this should work as-is for a wrapper for any function (sync/async), right?

I don't think the wrapper (if you mean start_as_current_span_async) will work for a sync function, as it wraps the function as a coroutine in the end, but yes, it should work for any async function.

@TBBle
Contributor

TBBle commented Oct 10, 2022

This seems like valuable low-hanging fruit to add as tracer.start_as_current_span_async or similar. That way the documentation wouldn't be "Don't use this for async", but "Use this alternative decorator with async instead".

It might even be feasible for tracer.start_as_current_span and tracer.start_as_current_span_async to recognise when they're used to decorate the wrong kind of function, but I recall async-coroutine detection to be somewhat woolly and easy to fool.
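
A minimal sketch of such detection, using inspect.iscoroutinefunction to pick a sync or async wrapper. The `traced` decorator and `fake_span` are hypothetical names for illustration; `span_factory` is assumed to be any callable that returns a sync context manager.

```python
import asyncio
import contextlib
import functools
import inspect


def traced(span_factory):
    """Hypothetical decorator; `span_factory()` returns a sync context manager."""

    def decorator(func):
        if inspect.iscoroutinefunction(func):
            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                # The span stays open while the coroutine body is awaited.
                with span_factory():
                    return await func(*args, **kwargs)

            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args, **kwargs):
            with span_factory():
                return func(*args, **kwargs)

        return sync_wrapper

    return decorator


log = []


@contextlib.contextmanager
def fake_span():
    log.append("start")
    try:
        yield
    finally:
        log.append("end")


@traced(fake_span)
async def fetch():
    log.append("async body")
    return 1


@traced(fake_span)
def compute():
    log.append("sync body")
    return 2


results = (asyncio.run(fetch()), compute())
```

The check only recognises functions defined with `async def`. A regular function that returns an awaitable, or a callable object with an async `__call__`, takes the sync path, which is the kind of edge case that makes the detection easy to fool.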

@Alan-R

Alan-R commented Oct 11, 2022 via email

@cicada-chen

Is it possible to add to the documentation that @tracer.start_as_current_span('my_span') works only for sync functions? Since it doesn't emit an error, it's a bit confusing when a span doesn't show up.

+1

Also I am using this snippet myself to work around that, might be helpful to others

from contextlib import asynccontextmanager

@asynccontextmanager
async def start_as_current_span_async(tracer, *args, **kwargs):
    with tracer.start_as_current_span(*args, **kwargs) as span:
        yield span

and so it can be used as

@start_as_current_span_async(tracer, "my_span")
async def some_async_func():
  ...

I used this snippet and got an error:
TypeError: '_AsyncGeneratorContextManager' object is not callable
Any ideas? Thanks.
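
This TypeError is consistent with how contextlib behaves before Python 3.10: objects returned by an @asynccontextmanager function only became usable as decorators in 3.10. The sketch below checks which behaviour the running interpreter has; `fake_span` is a stand-in and does no tracing.

```python
import asyncio
import contextlib
import sys


@contextlib.asynccontextmanager
async def fake_span(name):
    # Stand-in for a span context manager; no tracing happens here.
    yield


def try_decorate():
    """Return (decorated_function, error_message)."""
    try:
        @fake_span("my_span")
        async def work():
            return "ok"
    except TypeError as exc:
        return None, str(exc)
    return work, None


work, error = try_decorate()
# Python 3.10+: `work` is a coroutine function and `error` is None.
# Older versions: `work` is None and `error` holds the TypeError message.
result = asyncio.run(work()) if work is not None else None
```

On interpreters older than 3.10, an explicit wrapper function that opens the span inside an `async def` avoids relying on the decorator support.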

@decaf-addict

Is it possible to add to the documentation that @tracer.start_as_current_span('my_span') works only for sync functions? Since it doesn't emit an error, it's a bit confusing when a span doesn't show up.

+1
Also I am using this snippet myself to work around that, might be helpful to others

from contextlib import asynccontextmanager

@asynccontextmanager
async def start_as_current_span_async(tracer, *args, **kwargs):
    with tracer.start_as_current_span(*args, **kwargs) as span:
        yield span

and so it can be used as

@start_as_current_span_async(tracer, "my_span")
async def some_async_func():
  ...

I use this snippet and some error occurred: TypeError: '_AsyncGeneratorContextManager' object is not callable Any ideas? thanks

Not sure if this is the best way, but I was able to get around the error with this,

from contextlib import _AsyncGeneratorContextManager
from functools import wraps

def customasynccontextmanager(func):
    @wraps(func)
    def helper(*args, **kwds):
        return _CustomAsyncGeneratorContextManager(func, args, kwds)

    return helper


class _CustomAsyncGeneratorContextManager(_AsyncGeneratorContextManager):
    def __call__(self, func):
        @wraps(func)
        async def inner(*args, **kwds):
            async with self.__class__(self.func, self.args, self.kwds):
                return await func(*args, **kwds)

        return inner


@customasynccontextmanager
async def start_as_current_span_async(*args, **kwargs):
    with tracer.start_as_current_span(*args, **kwargs) as span:
        yield span

courtesy of python/cpython@86b833b

@Corfucinas

Corfucinas commented Jan 9, 2023

Is it possible to add to the documentation that @tracer.start_as_current_span('my_span') works only for sync functions? Since it doesn't emit an error, it's a bit confusing when a span doesn't show up.

+1
Also I am using this snippet myself to work around that, might be helpful to others

from contextlib import asynccontextmanager

@asynccontextmanager
async def start_as_current_span_async(tracer, *args, **kwargs):
    with tracer.start_as_current_span(*args, **kwargs) as span:
        yield span

and so it can be used as

@start_as_current_span_async(tracer, "my_span")
async def some_async_func():
  ...

I use this snippet and some error occurred: TypeError: '_AsyncGeneratorContextManager' object is not callable Any ideas? thanks

Not sure if this is the best way, but I was able to get around the error with this,

from contextlib import _AsyncGeneratorContextManager
from functools import wraps

def customasynccontextmanager(func):
    @wraps(func)
    def helper(*args, **kwds):
        return _CustomAsyncGeneratorContextManager(func, args, kwds)

    return helper


class _CustomAsyncGeneratorContextManager(_AsyncGeneratorContextManager):
    def __call__(self, func):
        @wraps(func)
        async def inner(*args, **kwds):
            async with self.__class__(self.func, self.args, self.kwds):
                return await func(*args, **kwds)

        return inner


@customasynccontextmanager
async def start_as_current_span_async(*args, **kwargs):
    with tracer.start_as_current_span(*args, **kwargs) as span:
        yield span

courtesy of python/cpython@86b833b

It would be better for the library to handle this internally. That would avoid the extra imports or copy-pasting this shortcut. But in the meantime, I guess it works 🤝

@aabmass
Member

aabmass commented Apr 19, 2023

I think there are quite a few feature requests in this thread and some discussion that just confuses users. I would like to close this issue since it is a bit confusing and open separate issues for those, if that sounds OK to everyone.

  • Make tracer.start_as_current_span() decorator work with async functions (#62 comment)
    • Alternatives:
      • Document that it doesn't work. Adding a note doesn't hurt anyone, but I think the type annotations are pretty clear here and a type checker should catch this issue as well.
      • Add a separate tracer.start_as_current_span_async() (#62 comment). IMO this is confusing because it would only be a decorator, unlike start_as_current_span().
  • Trace support for async with tracer.start_span_async("span"): ... (#62 comment)
    • I think we determined in comments that almost no one needs this right now, as you should be using BatchSpanProcessor to avoid making your instrumentation block (details here).
    • This would enable async implementations of SpanProcessors (not so useful IMO, as SpanProcessor work should be fire-and-forget) and Samplers (may be useful for remote sampling).
  • Support async callbacks for asynchronous instruments 😃 (#62 comment)
  • Support for async exporters 😃

I'll throw another one in

  • Use an event loop to manage exporting intervals in BatchSpanProcessor and PeriodicExportingMetricReader
    • We could avoid creating a new background thread in each instance of those classes and simplify the synchronization logic. Would work nicely with async exporters and async callbacks which could all run in the same event loop.

@aabmass
Member

aabmass commented Apr 20, 2023

Please open new issues if I missed anything. I'm closing this as obsolete in favor of the child issues:

aabmass closed this as not planned on Apr 20, 2023
@jamestrousdale

Is it possible to add to the documentation that @tracer.start_as_current_span('my_span') works only for sync functions? Since it doesn't emit an error, it's a bit confusing when a span doesn't show up.

+1
Also I am using this snippet myself to work around that, might be helpful to others

from contextlib import asynccontextmanager

@asynccontextmanager
async def start_as_current_span_async(tracer, *args, **kwargs):
    with tracer.start_as_current_span(*args, **kwargs) as span:
        yield span

and so it can be used as

@start_as_current_span_async(tracer, "my_span")
async def some_async_func():
  ...

I use this snippet and some error occurred: TypeError: '_AsyncGeneratorContextManager' object is not callable Any ideas? thanks

I wonder if the snippet might either have problems on earlier Python versions, or because the Span is being yielded. I don't think it's necessary to do so.

On Python 3.11, this works for me perfectly (with type hints too - I spent a couple of hours trying and failing to get the longer solution using _AsyncGeneratorContextManager to work with mypy...)

from collections.abc import AsyncGenerator
from contextlib import asynccontextmanager
from typing import Any

from opentelemetry.trace import Tracer


@asynccontextmanager
async def start_as_current_span_async(
    *args: Any,
    tracer: Tracer,
    **kwargs: Any,
) -> AsyncGenerator[None, None]:
    """Start a new span and set it as the current span.

    Args:
        *args: Arguments to pass to the tracer.start_as_current_span method
        tracer: Tracer to use to start the span
        **kwargs: Keyword arguments to pass to the tracer.start_as_current_span method

    Yields:
        None
    """
    with tracer.start_as_current_span(*args, **kwargs):
        yield

I was then able to use this directly like

from opentelemetry.trace import get_tracer

tracer = get_tracer(__name__)


@start_as_current_span_async(tracer=tracer, name='my_func')
async def my_func() -> ...:
    ...

Both mypy and the actual runtime were happy with this solution.

@ocelotl
Contributor

ocelotl commented May 23, 2023

@jamestrousdale this seems like valuable information, can you please add it to the issue above where it fits better? This is a closed issue and I don't want this comment of yours to be overlooked because of that.

@nesb1

nesb1 commented Sep 11, 2023

@jamestrousdale nice solution. But it can be more flexible if you take the function as an argument:

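from __future__ import annotations  # lets `X | None` annotations work before Python 3.10

from functools import wraps
from typing import Any, Awaitable, Callable

from opentelemetry.context import Context
from opentelemetry.trace import SpanKind, Tracer, _Links
from opentelemetry.util import types


def span_async_function(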
    tracer: Tracer,
    name: str,
    context: Context | None = None,
    kind: SpanKind = SpanKind.INTERNAL,
    attributes: types.Attributes = None,
    links: _Links = None,
    start_time: int | None = None,
    record_exception: bool = True,
    set_status_on_exception: bool = True,
    end_on_exit: bool = True,
):
    
    def decorator(function: Callable[..., Awaitable[Any]]):
        @wraps(function)
        async def wrapper(*args, **kwargs):
            with tracer.start_as_current_span(
                name=name,
                context=context,
                kind=kind,
                attributes=attributes,
                links=links,
                start_time=start_time,
                record_exception=record_exception,
                set_status_on_exception=set_status_on_exception,
                end_on_exit=end_on_exit,
            ):
                return await function(*args, **kwargs)

        return wrapper

    return decorator

With this code it is also possible to decorate a function without using the decorator syntax:

async def my_function():
   ...
func_with_trace = span_async_function(...)(my_function)

@cyberbudy

I've encountered a slowdown in my async application after adding opentelemetry (I previously used opencensus).

Is there any way to use opentelemetry without locking in an asyncio environment?

@rolyv

rolyv commented Jan 17, 2024

I also encountered a slowdown on my async application by including opentelemetry. I have an application with a lot of async tasks running. Response latency improved by 25% just by removing all opentelemetry. I came up with the hypothesis that maybe it was some lock contention when starting a span as the current span. I decided to refactor my application to not use the start_as_current_span method and instead I would try manually propagating the context down the call stack as an additional parameter in all my methods and use the start_span(name, Context) method to create "detached" spans instead. I was able to add telemetry back to the application with no measurable impact on the response times. I know this is just anecdotal evidence, but I hope it's enough for someone to consider doing a proper benchmark on the impact of the global context lock with asyncio. Just my 2 cents
