
Remove cache in iscoroutinefunction to avoid holding on to refs #5985

Merged
1 commit merged into dask:main on Mar 24, 2022

Conversation

@fjetter (Member) commented Mar 23, 2022

This cache is currently suspected to be one of the reasons why our CI is behaving so erratically, e.g. why GC sometimes takes so long to run. We couldn't prove this, but we know for a fact that it causes problems by accumulating DequeHandlers; see #5973.

If we indeed want to cache this, I believe we should find an approach that does not hold on to references. However, any mechanism here that guarantees we do not hold on to references is likely much slower than simply using the inspect call.
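
As an illustration only (my own sketch, not code from this PR), a cache that avoids strong references could use a weakref.WeakKeyDictionary keyed on the callable, falling back to tornado's gen.is_coroutine_function for decorated generator coroutines. Note that bound methods are re-created on every attribute access, so their entries expire immediately, which is part of why such a cache buys little over the plain inspect call.

import inspect
import weakref

from tornado import gen

# Weakly keyed cache: entries never keep the inspected callable alive.
_cache = weakref.WeakKeyDictionary()

def iscoroutinefunction(f):
    try:
        return _cache[f]
    except (KeyError, TypeError):  # TypeError: f is not weak-referenceable
        result = inspect.iscoroutinefunction(f) or gen.is_coroutine_function(f)
        try:
            _cache[f] = result
        except TypeError:
            pass
        return result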

The cache was introduced in #4481 and basically reverts #4469.

We don't have any benchmarks for this that I could run to verify the impact. Generally, I'm very surprised to see this taking so long; based on a microbenchmark, the uncached call is very fast:

Python 3.8.12 | packaged by conda-forge | (default, Jan 30 2022, 23:13:24)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.0.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from tornado import gen

In [2]: from distributed.utils import iscoroutinefunction

In [3]: import asyncio
   ...:
   ...: class A:
   ...:     async def method(self):
   ...:         await asyncio.sleep(0)
   ...:         return
   ...:
   ...:     @gen.coroutine
   ...:     def gen_coro(self):
   ...:         yield gen.moment

In [4]: %timeit iscoroutinefunction(A.method)
434 ns ± 1.74 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [5]: %timeit iscoroutinefunction(A.gen_coro)
513 ns ± 2.64 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [6]: import inspect

In [7]: %timeit inspect.iscoroutinefunction(A.method)
395 ns ± 1.76 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
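
Separately from the timing question, the reference-holding problem itself is easy to demonstrate. The following is my own sketch (not part of the PR) showing that an unbounded functools.lru_cache keyed on the callable keeps it alive until the cache is cleared:

import functools
import gc
import inspect
import weakref

@functools.lru_cache(maxsize=None)
def cached_iscoroutinefunction(f):  # hypothetical cached variant
    return inspect.iscoroutinefunction(f)

async def coro():
    pass

ref = weakref.ref(coro)
cached_iscoroutinefunction(coro)

del coro
gc.collect()
print(ref() is not None)  # True: the cache still holds a strong reference

cached_iscoroutinefunction.cache_clear()
gc.collect()
print(ref() is None)      # True: clearing the cache releases the function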

@mrocklin (Member) commented:

Fine by me. If we need it in the future we can cache by function name or something else. Happy to wait until it's proven necessary though and proceed optimistically.
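
For completeness, a hedged sketch of the "cache by function name" idea (my own illustration, not code from the repository): key the cache on the qualified name so only strings are retained, at the cost of treating same-named callables as identical.

import inspect

from tornado import gen

_cache = {}  # str -> bool; holds no references to the callables themselves

def iscoroutinefunction(f):
    key = f"{getattr(f, '__module__', '')}.{getattr(f, '__qualname__', repr(f))}"
    try:
        return _cache[key]
    except KeyError:
        result = inspect.iscoroutinefunction(f) or gen.is_coroutine_function(f)
        _cache[key] = result
        return result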

@graingert (Member) commented:

I was hoping this would also avoid: https://github.com/dask/distributed/runs/5661844951?check_suite_focus=true#step:12:1348

--- Logging error ---
Traceback (most recent call last):
  File "/usr/share/miniconda3/envs/dask-distributed/lib/python3.9/logging/__init__.py", line 1086, in emit
    stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
  File "/home/runner/work/distributed/distributed/distributed/utils_perf.py", line 199, in _gc_callback
    logger.warning(
Message: 'full garbage collections took %d%% CPU time recently (threshold: %d%%)'
Arguments: (36.035444889464394, 10.0)
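
The traceback above is the generic failure mode of a logging handler whose stream has already been closed (presumably during test teardown; that part is an assumption). A minimal reproduction sketch, not taken from the CI run:

import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
logger = logging.getLogger("demo")
logger.addHandler(handler)
logger.propagate = False

stream.close()
# Emitting now prints "--- Logging error ---" followed by
# "ValueError: I/O operation on closed file." to stderr.
logger.warning(
    "full garbage collections took %d%% CPU time recently (threshold: %d%%)",
    36.035444889464394,
    10.0,
)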

@github-actions (Contributor) commented:

Unit Test Results

12 files (+12), 12 suites (+12), 6h 0m 54s ⏱️ (+6h 0m 54s)
2 669 tests (+2 669): 2 587 ✔️ passed (+2 587), 81 💤 skipped (+81), 0 failed (±0), 1 🔥 errored (+1)
15 926 runs (+15 926): 15 066 ✔️ passed (+15 066), 858 💤 skipped (+858), 1 failed (+1), 1 🔥 errored (+1)

For more details on these errors, see this check.

Results for commit 3ef2f5c. Comparison against base commit b872d45.

@fjetter merged commit 9d3064a into dask:main on Mar 24, 2022
@seanlaw commented May 3, 2022

I was hoping this would also avoid: https://github.com/dask/distributed/runs/5661844951?check_suite_focus=true#step:12:1348


@graingert I just came across this as well. Do you know if this is something to be concerned about, or is there another issue that tracks this?

Development

Successfully merging this pull request may close these issues.

the "distributed.worker" logger gains ~thousand DequeHandler instances during a pytest run