Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graceful shutdown sequence doesn't terminate the existing TCP connections gracefully #3638

Closed
dwc147896325 opened this issue Mar 6, 2019 · 15 comments
Labels
bug need pull request regression Something that used to work stopped working "as before" after upgrade reproducer: missing This PR or issue lacks code, which reproduce the problem described or clearly understandable STR reproducer: present This PR or issue contains code, which reproduce the problem described or clearly understandable STR server

Comments

@dwc147896325
Copy link

dwc147896325 commented Mar 6, 2019

Long story short

server code as below:

import asyncio
from aiohttp import web

 
async def _on_shutdown(app):
    await asyncio.sleep(6)


async def _handler(request):
    await asyncio.sleep(5)
    return web.Response(body=b'yes')


def run():
    app = web.Application()
    app.router.add_get('/{path:.*}', _handler)
    app.on_shutdown.append(_on_shutdown)
    web.run_app(app)


if __name__ == '__main__':
    run()

Expected behaviour

graceful shutdown should terminate the request processing before stopping the server process, canceling the existing requests processing coroutines.

Steps to reproduce

  1. start the server process
  2. start the client process, e.g. curl http://localhost:8080/get
  3. send SIGTERM or SIGINT to the server process
  4. client reports aiohttp.client_exceptions.ServerDisconnectedError before the server shutdown

Your environment

Python 3.7.1, Centos 7.2.1511, aiohttp 3.5.4

@aio-libs-bot
Copy link

GitMate.io thinks the contributor most likely able to help you is @asvetlov.

Possibly related issues are #1369 (Graceful shutdown can (I think) truncate connections that it shouldn't), #1486 (non-working example of the documentation graceful-shutdown - KeyError: websockets), #183 (Streaming uploads doesn't seem to work), #3075 (trust_env doesn't works), and #2298 (graceful shutdown or reload with gunicorn without shutdown connections).

@webknjaz
Copy link
Member

webknjaz commented Mar 6, 2019

https://aiojobs.readthedocs.io

@dwc147896325
Copy link
Author

https://aiojobs.readthedocs.io

does not work 2

@webknjaz
Copy link
Member

webknjaz commented Mar 6, 2019

You did something wrong

@amitbl
Copy link

amitbl commented Mar 23, 2019

@dwc147896325 You did nothing wrong.
That's a bug in aiohttp since this commit: dd30b2a#diff-8e4ccf3aaf43100064462a4df10952e5R214

For anybody having this issue, revert your aiohttp version to 3.4.4 which doesn't suffer from this bug or apply this fix I've put together for the latest version: amitbl@8d42315

@amitbl amitbl mentioned this issue Mar 23, 2019
5 tasks
@johnathaningle
Copy link

I was having the same issue. The graceful shutdown can be fixed by changing the following code in web.py
image

@asvetlov
Copy link
Member

Do you use Windows?

@johnathaningle
Copy link

johnathaningle commented May 28, 2019 via email

@pyrolynx
Copy link

pyrolynx commented Mar 10, 2020

@asvetlov
I have a suggestion to run on_shutdown before asyncio.server was closed.
May be fix BaseSite.stop method at aiohttp/web_runner.py:64
image

Another case, this fix solve problem of Graceful shutdown with healthchecks (Consul, Docker, k8s).

@KonradSwierczynski
Copy link

Yea, it would be nice to run on_shutdown before closing the server.

Also, the logic seems to be a little bit jumbled. As I see, when runner.cleanup is called, it calls site.site, which then calls runner.shutdown (and priority of on_shutdown over on_cleanup is now just an implementation detail).

Such a fix would enable graceful shutdown and, for example, rolling updates on k8s.

What do you think? @asvetlov

@KonradSwierczynski
Copy link

@dwc147896325

I tried to implement a graceful shutdown using documentation, but it does not work, just like it is described in the issue.
I created a workaround implementing signal handling myself. There is no programmatically way to close the server, only sending SIGINT can do it.

import asyncio
from aiohttp import web

import os
import signal
from threading import Thread
import time


def _on_shutdown(pid):
    time.sleep(6)

    #  Sending SIGINT to close server
    os.kill(pid, signal.SIGINT)


async def _graceful_shutdown_ctx(app):
    def graceful_shutdown_sigterm_handler():
        nonlocal thread
        thread = Thread(target=_on_shutdown, args=(os.getpid(),))
        thread.start()

    thread = None

    loop = asyncio.get_event_loop()
    loop.add_signal_handler(
        signal.SIGTERM, graceful_shutdown_sigterm_handler,
    )

    yield

    loop.remove_signal_handler(signal.SIGTERM)

    if thread is not None:
        thread.join()


async def _handler(request):
    await asyncio.sleep(5)
    return web.Response(text='yes')


def run():
    app = web.Application()
    app.router.add_get('/', _handler)
    app.cleanup_ctx.append(_graceful_shutdown_ctx)
    web.run_app(app, handle_signals=False)


if __name__ == '__main__':
    run()

@webknjaz
Copy link
Member

webknjaz commented Apr 24, 2020

@KonradSwierczynski it works on the server side if you set handle_signals=True and your loop has add_signal_handler: https://github.com/aio-libs/aiohttp/blob/7951bca/aiohttp/web_runner.py#L232-L237. (But yes, there may be an issue with the sequence of killing client connections, which is probably not what you're experiencing)

@webknjaz webknjaz changed the title Graceful shutdown does not work Graceful shutdown sequence doesn't terminate the existing TCP connections gracefully Apr 24, 2020
@webknjaz webknjaz added need pull request regression Something that used to work stopped working "as before" after upgrade reproducer: missing This PR or issue lacks code, which reproduce the problem described or clearly understandable STR reproducer: present This PR or issue contains code, which reproduce the problem described or clearly understandable STR server labels Apr 24, 2020
@farin
Copy link

farin commented Apr 29, 2020

Another workaround for it.

GracefulExit can be used to trigger original signal_handler functinality.
Unfortunatelly when raise, all running coroutines are cancelled. So solution is
running shutdown task and then trigger signal for second time and raise GracefulExit here

To be correct it's also good idea not accept new connections during shutdown hook is running (and probably closing existing websocket connections in common use case)

import os
import asyncio
from aiohttp.web_runner import GracefulExit

async def shutdown_hook(app, sig):
    print("shutdown_hook, implement your graceful close here")
    # just sleep as placeholder
    await asyncio.sleep(1)
    print(".")
    await asyncio.sleep(1)
    print("...")
    # eventually retrigger signal to deliver GracefulExit to server
    os.kill(os.getpid(), sig)


async def app_ctx(app):
   # start up init here

   shutdown = False

   def signal_handler(sig):
        nonlocal shutdown
        if shutdown:
            raise GracefulExit()
        else:
            shutdown = True  # flag can be also used to stop accepting new websocket connections
            asyncio.create_task(shutdown_hook(app, sig))

   asyncio.get_event_loop().add_signal_handler(signal.SIGINT, signal_handler, signal.SIGINT)
   asyncio.get_event_loop().add_signal_handler(signal.SIGTERM, signal_handler, signal.SIGTERM)

    yield

    # cleanup here ..

app.cleanup_ctx.append(app_ctx)

@bkad
Copy link

bkad commented Nov 17, 2022

I couldn't reproduce this in v3.8.3.

@Dreamsorcerer
Copy link
Member

This was resolved a while ago and improved further in 3.9, with the shutdown steps documented:
https://docs.aiohttp.org/en/stable/web_advanced.html#graceful-shutdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug need pull request regression Something that used to work stopped working "as before" after upgrade reproducer: missing This PR or issue lacks code, which reproduce the problem described or clearly understandable STR reproducer: present This PR or issue contains code, which reproduce the problem described or clearly understandable STR server
Projects
None yet
Development

No branches or pull requests