[18.0][FIX] fastapi: Forwardport 16.0 pullrequest 486 - Avoid zombie threads #499


Merged

Conversation

lembregtse
Contributor

@lembregtse lembregtse commented Feb 25, 2025

This is a forward-port of #486.

I am aware that that PR is not yet finalized/approved, but we needed the fix for version 18.0.

We have modified the caching system to be aligned with Odoo's new way of doing caches and refreshes.

We will keep this PR updated with the downstream PR.

@OCA-git-bot
Contributor

Hi @lmignon,
some modules you are maintaining are being modified, check this out!

@lembregtse lembregtse force-pushed the 18.0-fastapi-fwp-16.0-event-loop-lifecycle branch from 440b332 to fea0cef on February 25, 2025 06:30
Contributor

@lmignon lmignon left a comment


Thank you @lembregtse for the forward port. Nevertheless, can you preserve the authorship of the initial changes in 16.0? To preserve the author of the initial change, simply cherry-pick the commit of the PR from branch 16.0 and then make your changes in a new commit. This will also make it easy to see what adaptations were made between the two versions.
Out of curiosity, what problems did you encounter with the previous implementation and in what context?

Each time a fastapi app is created, a new event loop thread is created by the ASGIMiddleware. Unfortunately, every time the cache is cleared, a new app is created with a new event loop thread. This leads to an increase in the number of threads created to manage the asyncio event loop, even though many of them are no longer in use. To avoid this problem, the thread in charge of the event loop is now created only once per thread/process and the result is stored in the thread's local storage. If a new instance of an app needs to be created following a cache reset, this ensures that the same event loop is reused.
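The thread-local reuse described above can be sketched roughly as follows. This is an illustrative sketch only, not the module's actual code; `get_event_loop` is a hypothetical name.

```python
import asyncio
import threading

# Illustrative sketch (hypothetical names): the event loop thread is
# created once per Odoo worker thread and cached in thread-local
# storage, so an app rebuilt after a cache reset reuses it instead of
# spawning a fresh thread.
_local = threading.local()

def get_event_loop() -> asyncio.AbstractEventLoop:
    loop = getattr(_local, "loop", None)
    if loop is None:
        loop = asyncio.new_event_loop()
        runner = threading.Thread(target=loop.run_forever, daemon=True)
        runner.start()
        _local.loop = loop
        _local.runner = runner
    return loop
```

Repeated calls from the same thread then return the same loop rather than creating one loop thread per app instance.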

refs OCA#484
This commit adds event loop lifecycle management to the FastAPI dispatcher.

Before this commit, an event loop and the thread to run it were created
each time a FastAPI app was created. The drawback of this approach is that
when the app was destroyed (for example, when the app cache was cleared),
the event loop and the thread were not properly stopped, which could lead
to memory leaks and zombie threads. This commit fixes the issue by creating
a pool of event loops and threads that is shared among all FastAPI apps.
On each call to a FastAPI app, an event loop is requested from the pool and
is returned to the pool when the app is destroyed. When an event loop is
requested, the pool tries to reuse an existing event loop; if no event
loop is available, a new one is created.
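The pooling idea this commit message describes can be sketched minimally as follows. Names and structure are illustrative assumptions, not the module's implementation.

```python
import asyncio
import queue
import threading

class EventLoopPool:
    """Illustrative sketch (hypothetical class): reuse idle event loops
    across app instances instead of leaking one loop thread per app."""

    def __init__(self) -> None:
        self._idle: queue.Queue = queue.Queue()
        self._all: list = []  # kept so shutdown can stop every loop

    def get(self) -> asyncio.AbstractEventLoop:
        # Reuse an idle loop when one is available; otherwise start a
        # new loop in its own daemon thread.
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            loop = asyncio.new_event_loop()
            threading.Thread(target=loop.run_forever, daemon=True).start()
            self._all.append(loop)
            return loop

    def release(self, loop: asyncio.AbstractEventLoop) -> None:
        self._idle.put(loop)

    def shutdown(self) -> None:
        # On server shutdown, stop each loop so its thread can exit
        # (closing the loops afterwards is elided for brevity).
        for loop in self._all:
            loop.call_soon_threadsafe(loop.stop)
```

A released loop is handed back out on the next request, so the thread count stays bounded by peak concurrency rather than growing with every cache reset.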

The cache of the FastAPI app is also refactored to use its own mechanism.
It is now based on a dictionary of queues by root path by database,
where each queue is a pool of FastAPI apps. This allows better management
of cache invalidation: it is now possible to invalidate the cache of
FastAPI apps for one root path without affecting the cache of other
root paths.
On server shutdown, ensure that the created event loops are closed properly.
defaultdict in Python is not thread safe. Since this data structure
is used to store the cache of FastAPI apps, we must ensure that
access to this cache is thread safe. This is done by using a lock
to protect access to the cache.
This commit improves the lifecycle of the fastapi app cache.
It first ensures that the cache is effectively invalidated when changes
are made to the app configuration, even if these changes occur in another
server instance.
It also removes the locking mechanism that was put in place to ensure
thread-safe access to a value in the cache and to avoid potential
concurrency issues when a default value is set in the cache at access time.
This lock could lead to unnecessary contention and reduce the performance
benefits of queue.Queue's fine-grained internal synchronization, for a
questionable gain. The only expected gain was to avoid the useless creation
of a queue.Queue instance that would never be used, since at the time of
putting the value into the cache we are sure that a value is already
present in the dictionary.
@lembregtse lembregtse force-pushed the 18.0-fastapi-fwp-16.0-event-loop-lifecycle branch from fea0cef to f031284 on February 25, 2025 07:48
@lembregtse
Contributor Author

@lmignon Of course, my bad, done. We are running Odoo containerized with gunicorn and ran into an issue with a customer where we use FastAPI for ETL data imports: on certain objects/calls where the cache was refreshed, zombie threads were being created.

Considering we have quite strict thread-spawning limits in those containers, after about 60 to 70 calls the threads were exhausted and the FastAPI app would no longer accept new connections until the worker was recycled. While debugging the issue we came across the PR for version 16.0.

@lembregtse lembregtse force-pushed the 18.0-fastapi-fwp-16.0-event-loop-lifecycle branch from f031284 to 56a6d4a on February 25, 2025 07:52
@lmignon
Contributor

lmignon commented Feb 25, 2025

@lmignon Of course, my bad, done. We are running Odoo containerized with gunicorn and ran into an issue with a customer where we use FastAPI for ETL data imports: on certain objects/calls where the cache was refreshed, zombie threads were being created.

Thank you for the explanation and your changes. Indeed this PR should solve your issue. We weren't affected by this one at acsone, because our instances always run in multi-worker mode and not in multi-thread mode. Solving this problem was an opportunity to improve the management of the event loop pool and the cache, which, even if they weren't problematic in our case, were still not optimal. Still out of curiosity, what are the motivations for preferring gunicorn to serve Odoo instead of Odoo's multi-worker runner? Don't you lose the mechanisms for managing the memory per worker and the maximum process execution time?
Kind regards.

Contributor

@lmignon lmignon left a comment


LGTM (Code review only)

if root_path:
self._queue_by_db_by_root_path[db_name][root_path] = queue.Queue()
elif db_name in self._queue_by_db_by_root_path:
del self._queue_by_db_by_root_path[db_name]


@lmignon Why do you think it is sufficient to delete the content of _queue_by_db_by_root_path (ASGIMiddleware) this way? Are the threads created before supposed to be cleaned up by the Python GC? I've done some tests with Odoo cache invalidation while logging the current thread count, and the number is incremented by 1 after every "Caches invalidated" signalling. The threads remain in the process.


[Two screenshots of thread-count logs omitted; the difference between the two logs is one Odoo cache invalidation.]


@veryberry veryberry Mar 17, 2025


@lmignon I think I've found the issue. It seems you should have redefined __init__ in ASGIMiddleware
(https://github.com/abersheeran/a2wsgi/blob/master/a2wsgi/asgi.py#L130),
because a new loop is created every time.


Contributor


@veryberry The middleware used is defined here and extends https://github.com/abersheeran/a2wsgi/blob/master/a2wsgi/asgi.py#L130. The loop used comes from a pool, and if one is available in the pool it will be reused.

Copy link

@veryberry veryberry Mar 17, 2025


@lmignon the problem is that you have extended its __call__ method,
but __init__ is triggered first; there the loop is always None, and a new loop is created every time ASGIMiddleware(app) is called.
I've proved it locally by checking the threads after every Odoo cache invalidation.


@veryberry veryberry Mar 17, 2025


Thanks. @lmignon I just wanted you to be aware of this issue and to know your opinion on it. In my case I've moved the context manager usage from __call__ to __init__. It helped.

Contributor


In my case I've moved the context manager usage from __call__ to __init__. It helped.

????

Without seeing the code, I find it hard to imagine that it would work correctly with the change you describe. It's important that the event loop thread can only be used for one call at a time. That's why the __call__ method is the one that's overloaded: once the context manager exits, the loop is returned to the pool and becomes available again for another Odoo process/thread.


@lmignon would you be so kind as to explain why it's important that the event loop thread can only be used for one call at a time?

Contributor


Think of the event loop thread like a single train track. If two trains (calls) try to use the same track at the same time, they’ll collide, and everything will break. To keep things running smoothly, we let one train (call) complete its journey before allowing another one on the track.
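The checkout/checkin pattern being discussed can be sketched with toy stand-ins (none of these names are the module's real API). Acquiring the loop in __call__ rather than __init__ means it is held for the duration of one request only, and __init__ never consumes a loop when the app cache is rebuilt.

```python
import queue

class LoopPool:
    """Toy stand-in for the real event loop pool (illustrative only)."""

    def __init__(self) -> None:
        self._idle: queue.Queue = queue.Queue()
        self.created = 0

    def get(self):
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            self.created += 1
            return f"loop-{self.created}"  # a real pool returns a loop thread

    def release(self, loop) -> None:
        self._idle.put(loop)

class PooledMiddleware:
    def __init__(self, app, pool: LoopPool) -> None:
        # No loop is taken here: __init__ runs once per app build and
        # would otherwise leak one loop per cache refresh.
        self.app = app
        self.pool = pool

    def __call__(self, request):
        # The loop is held only while one call is in flight, so a
        # single loop thread never serves two calls at the same time.
        loop = self.pool.get()
        try:
            return self.app(request, loop)
        finally:
            self.pool.release(loop)
```

Two sequential calls reuse the same pooled loop; only concurrent calls force the pool to grow.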

@lmignon
Contributor

lmignon commented Mar 19, 2025

@lembregtse Do you plan to finalize your great work on this PR?

@lembregtse
Contributor Author

lembregtse commented Mar 19, 2025 via email

@lmignon
Contributor

lmignon commented Mar 20, 2025

I have been sick the last week; I'll apply the requested changes when I'm up and running!

Thank you @lembregtse

@lmignon lmignon changed the title [FIX] fastapi: Forwardport 16.0 pullrequest 486 - Avoid zombie threads [18.0][FIX] fastapi: Forwardport 16.0 pullrequest 486 - Avoid zombie threads Apr 10, 2025
@lmignon
Contributor

lmignon commented May 14, 2025

@lembregtse no news on this one?

@lembregtse
Contributor Author

On it right now! Thanks for the reminder ;-)

…linting

[FIX] fastapi: Apply linting recommendations in 18

[FIX] fastapi: Apply feedback on PR
@lembregtse lembregtse force-pushed the 18.0-fastapi-fwp-16.0-event-loop-lifecycle branch from 56a6d4a to 9cde646 on May 14, 2025 09:35
@lembregtse
Contributor Author

@lmignon I've applied the first and last change requests. I'm not sure if anything needs to be changed based on your discussion with @veryberry. Let me know if anything else is required, thanks!

@lembregtse lembregtse requested a review from lmignon May 14, 2025 09:39
@lmignon
Contributor

lmignon commented May 14, 2025

@lembregtse Thank you for your work. Could you also include the commit from #508? That one is also required to avoid zombie threads...

@sebastienbeau sebastienbeau added this to the 18.0 milestone Jun 3, 2025
@OCA-git-bot OCA-git-bot merged commit 9cde646 into OCA:18.0 Jun 4, 2025
6 of 7 checks passed
@lmignon
Contributor

lmignon commented Jun 4, 2025

Thank you for your hard work. I finalized your work in #532 by including the missing commit.
