Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory growth in 0.12.1 #239

Closed
stj opened this issue Mar 19, 2019 · 3 comments
Closed

Memory growth in 0.12.1 #239

stj opened this issue Mar 19, 2019 · 3 comments

Comments

@stj
Copy link

stj commented Mar 19, 2019

  • uvloop version: 0.12.1
  • Python version: 3.7.1
  • Platform: Ubuntu Linux 18.10
  • Can you reproduce the bug with PYTHONASYNCIODEBUG in env?: yes

Long story short

Our services show a small increase in memory usage over time and at some point get OOM killed.
The service is deployed with gunicorn and uvloop using worker [GunicornUVLoopWebWorker|https://github.com/aio-libs/aiohttp/blob/master/aiohttp/worker.py#L211]

An investigation into the memory usage with tracemalloc shows memory increase in aiohttp/worker.py:124. Using objgraph to show the most common types every 60 seconds we see an instance increase of one for Context, Future, TimerHandle and UVTimer around every second.

We run another test with GunicornWebWorker using asyncio loop and could not reproduce an increased memory usage.

Reading the code this feels like an issue with uvloop call_later or create_future (big guess).

Expected behaviour

Memory usage stays equal regardless of loop used

Actual behaviour

uvloop shows a steady increase in memory usage over asyncio loop.

Steps to reproduce

Create file uvleak.py with

import asyncio
import gc
import pathlib
import os
import time
import tracemalloc

import aiohttp.web
import objgraph

DUMPDIR = pathlib.Path("snapshots") / str(os.getpid())


async def context(app):
    """
    Aiohttp app context function to dump memory snapshots
    every 5 minutes if the :envvar:`PYTHONTRACEMALLOC` is set.
    """
    do_trace = os.getenv("PYTHONTRACEMALLOC")
    if do_trace:
        DUMPDIR.mkdir(parents=True, exist_ok=True)
        task = asyncio.ensure_future(take_snapshot())
    yield
    if do_trace:
        task.cancel()
        tracemalloc.stop()


async def take_snapshot():
    while True:
        await asyncio.sleep(60)
        gc.collect()
        snapshot = tracemalloc.take_snapshot()
        filepath = DUMPDIR / str(time.time())
        snapshot.dump(filepath)
        del snapshot

        filepath = filepath.with_suffix(".objgraph")
        with filepath.open("w") as f:
            objgraph.show_growth(limit=50, file=f)


async def create_app():
    app = aiohttp.web.Application()
    app.cleanup_ctx.append(context)
    return app
pip install aiohttp uvloop objgraph gunicorn
export PYTHONTRACEMALLOC=10
gunicorn uvleak:create_app --max-requests 1000 --bind 0.0.0.0:8080 --workers 1 --worker-class aiohttp.worker.GunicornUVLoopWebWorker

Every 60 seconds objgraph will dump a file ending in .objgraph to snapshots/{pid}/ that shows the memory growth. In my testing the contents are

tuple           4450       +60
method           183       +60
Context          134       +60
TimerHandle      121       +60
UVTimer          121       +60
Future           121       +60

Additional tracemalloc snapshots can be found in the directory too.

Your environment

Ubuntu Linux 18.10
python 3.7.1
aiohttp 3.5.4
uvloop 0.12.1
gunicorn 19.9.0

@1st1
Copy link
Member

1st1 commented Mar 20, 2019

Thank you for a thorough bug report. I'll try to find time to investigate this soon.

1st1 added a commit that referenced this issue Mar 20, 2019
The crux of the problem is that TimerHandle did not clean up a strong
reference from the event loop to `self`.  This typically isn't a problem
unless there's another strong reference to the loop from the callback or
from its arguments (such as a Future).

A few new unit tests should ensure this kind of bugs won't happen again.

Fixes: #239.
@1st1
Copy link
Member

1st1 commented Mar 20, 2019

Thanks to your excellent analysis it was fairly trivial to find the actual bug. I've created a PR with a fix. Once it's green I'll work on releasing 0.12.2 asap. Thank you.

1st1 added a commit that referenced this issue Mar 20, 2019
The crux of the problem is that TimerHandle did not clean up a strong
reference from the event loop to `self`.  This typically isn't a problem
unless there's another strong reference to the loop from the callback or
from its arguments (such as a Future).

A few new unit tests should ensure this kind of bugs won't happen again.

Fixes: #239.
@1st1 1st1 closed this as completed in #240 Mar 20, 2019
1st1 added a commit that referenced this issue Mar 20, 2019
The crux of the problem is that TimerHandle did not clean up a strong
reference from the event loop to `self`.  This typically isn't a problem
unless there's another strong reference to the loop from the callback or
from its arguments (such as a Future).

A few new unit tests should ensure this kind of bugs won't happen again.

Fixes: #239.
@1st1
Copy link
Member

1st1 commented Mar 20, 2019

Fixed in uvloop 0.12.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants