I have noticed in the logs that there is a recurring GreenletExit exception. Although this does not cause an error for clients, it negatively impacts performance.

Steps to reproduce:

Server configuration:

Gunicorn settings:
```python
max_requests = 2        # set low to demonstrate the problem quickly
timeout = 60
worker_class = 'gevent'
workers = 1             # to highlight the problem more clearly
graceful_timeout = 30   # default
keepalive = 75
```
Note: the key point is that keepalive (75 s) is greater than graceful_timeout (30 s), so an idle keepalive connection can never time out within the graceful window.
Client:
Although the server has keepalive set to 75, the client does not reuse connections; it opens a new one for each request. The client sends requests in a loop, in parallel (a minimal sketch follows).
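For concreteness, here is a minimal client sketch; the host/port and concurrency level are assumptions for illustration, not part of the original report:

```python
# Reproduction client sketch: every request opens a brand-new connection,
# and finished sockets are kept open (never reused), so the server parks
# each one in its keepalive wait. A long run will exhaust file
# descriptors; this is only a demonstration.
import socket
from concurrent.futures import ThreadPoolExecutor

HOST, PORT = "127.0.0.1", 8000   # hypothetical gunicorn address
abandoned = []                   # hold references so connections stay open

def one_request(_):
    s = socket.create_connection((HOST, PORT))
    s.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
    status_line = s.recv(65536).split(b"\r\n", 1)[0]
    abandoned.append(s)          # deliberately neither closed nor reused
    return status_line

with ThreadPoolExecutor(max_workers=4) as pool:
    while True:
        print(list(pool.map(one_request, range(4))))
```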
Observations:
When both the client and server are running, the server periodically restarts the worker (max_requests is reached quickly with the settings above). During the restart it stops accepting new connections and waits for the graceful period to elapse, so the client stalls for 30 seconds before processing continues. After the graceful period, the following appears in the logs:
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/gunicorn/workers/base_async.py", line 48, in handle
    req = next(parser)
          ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gunicorn/http/parser.py", line 42, in __next__
    self.mesg = self.mesg_class(self.cfg, self.unreader, self.source_addr, self.req_count)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gunicorn/http/message.py", line 257, in __init__
    super().__init__(cfg, unreader, peer_addr)
  File "/usr/local/lib/python3.11/site-packages/gunicorn/http/message.py", line 60, in __init__
    unused = self.parse(self.unreader)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gunicorn/http/message.py", line 269, in parse
    self.get_data(unreader, buf, stop=True)
  File "/usr/local/lib/python3.11/site-packages/gunicorn/http/message.py", line 260, in get_data
    data = unreader.read()
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gunicorn/http/unreader.py", line 37, in read
    d = self.chunk()
        ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gunicorn/http/unreader.py", line 64, in chunk
    return self.sock.recv(self.mxchunk)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gevent/_socketcommon.py", line 666, in recv
    self._wait(self._read_event)
  File "src/gevent/_hub_primitives.py", line 317, in gevent._gevent_c_hub_primitives.wait_on_socket
  File "src/gevent/_hub_primitives.py", line 322, in gevent._gevent_c_hub_primitives.wait_on_socket
  File "src/gevent/_hub_primitives.py", line 304, in gevent._gevent_c_hub_primitives._primitive_wait
  File "src/gevent/_hub_primitives.py", line 46, in gevent._gevent_c_hub_primitives.WaitOperationsGreenlet.wait
  File "src/gevent/_hub_primitives.py", line 46, in gevent._gevent_c_hub_primitives.WaitOperationsGreenlet.wait
  File "src/gevent/_hub_primitives.py", line 55, in gevent._gevent_c_hub_primitives.WaitOperationsGreenlet.wait
  File "src/gevent/_waiter.py", line 154, in gevent._gevent_c_waiter.Waiter.get
  File "src/gevent/_greenlet_primitives.py", line 61, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_greenlet_primitives.py", line 61, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_greenlet_primitives.py", line 65, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_gevent_c_greenlet_primitives.pxd", line 35, in gevent._gevent_c_greenlet_primitives._greenlet_switch
greenlet.GreenletExit
[2024-05-28 13:54:12 +0000] [17990] [INFO] Worker exiting (pid: 17990)
```
After this, a new worker is started and everything works again until the next restart.
Is this expected behavior? In my opinion, no. It makes no sense to keep waiting in sock.recv when the process is about to restart and there is nothing left to read from the socket. Once the restart begins the socket should be shut down, yet the worker still waits for read events. This happens because greenlets spawned before the restart are sitting inside the keepalive loop, waiting. The server is stopped (see this line in ggevent.py), but the pending socket waiters are never woken; a standalone sketch of this blocked wait follows.
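To make the failure mode concrete, here is a self-contained gevent sketch (not gunicorn source): a greenlet blocked in recv() on an idle connection wakes up only when it is killed, raising the same GreenletExit seen in the logs.

```python
# Standalone illustration: a greenlet parked in recv() exits only when
# killed, mirroring what happens to keepalive greenlets once the
# graceful period expires.
import gevent
from gevent import socket

def keepalive_wait(conn):
    try:
        conn.recv(8192)  # blocks: the idle client never sends another request
    except gevent.GreenletExit:
        print("GreenletExit: raised only once the greenlet is killed")
        raise

# Build a loopback connection so recv() genuinely blocks.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

g = gevent.spawn(keepalive_wait, server_side)
gevent.sleep(1)   # stand-in for waiting out graceful_timeout
g.kill()          # arbiter-style teardown kills the parked greenlet
g.join()
```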
I believe there have been attempts to fix this before, such as commit 7896057 (reverted in 8c5613b). I also proposed a fix that seemed to work: shutting down read events on the socket when worker.alive == False. However, after testing, it did not work as expected in all cases.
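For reference, a rough sketch of that earlier idea; the names below are hypothetical stand-ins, not gunicorn's actual structures or API:

```python
import socket

def wake_keepalive_waiters(worker, keepalive_socks):
    # Hypothetical helper: once the worker is no longer alive, shut down
    # the read side of every connection parked in the keepalive loop so
    # the pending recv() returns b"" immediately instead of blocking
    # until graceful_timeout expires.
    if not worker.alive:
        for sock in keepalive_socks:
            try:
                sock.shutdown(socket.SHUT_RD)
            except OSError:
                pass  # already closed or reset by peer
```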
After further consideration, I am proposing a new change that seems to be low-risk but can significantly reduce disruptions in request handling.
MR with my proposal: #3236