-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Only instantiate one celery app per process. #25822
Conversation
Ticket: BOM-2086 Currently there are parts of the LMS that import content from the CMS APP and vice-versa. When this happens, we end up with 2 instances of the celery app and some tasks get registered to the wrong one. The tasks that were getting registered to the wrong one are never able to run and result in lots of production errors on celery workers. The timing of the CMS celery app instantiation is non deterministic so different tasks get lost depending on when it's imported by some code in the LMS. As long as SERVICE_VARIANT is set, this code should prevent the instantiation of both celery apps.
Your PR has finished running tests. There were no failures. |
In #25822 we protect against the case where *any* `cms.**` module is imported by the running LMS process, causing cms.celery to be imported and a second celery to be instantiated (leading to tasks being registered to the wrong one and effectively lost). This commit protects against the more restricted case of `cms.celery` or `lms.celery` being imported explicitly, rather than implicitly by their respective root `__init__.py`.
EdX Release Notice: This PR has been deployed to the staging environment in preparation for a release to production. |
This change was reverted along with the downgrade of super-csv because it was believed that this change was causing issues with celery task registration. It was reverted in https://github.com/edx/edx-platform/pull/25762 The fix turned out to be something different and this and the super-csv library should both be same to upgrade. The library upgrade will come in a follow-on make upgrade with some other constraints that also need to be lifted. Actual fix: https://github.com/edx/edx-platform/pull/25822
This should no longer be necessary now that we have found a root cause fix for the issue. This was removed because it was causing the case of celery tasks not getting registered to happen more often. Undoing https://github.com/edx/edx-platform/pull/25787 Actual Fixes: - https://github.com/edx/edx-platform/pull/25822 - https://github.com/edx/edx-platform/pull/25825
EdX Release Notice: This PR has been deployed to the production environment. |
EdX Release Notice: This PR has been rolled back from the production environment. |
This should prevent the issues we've seen recently where cms modules are imported by the running lms process, resulting in two celery instances being created and tasks intermittently being registered to the wrong instance (and therefore effectively lost.) In commit ab6bf34/PR #25822 we tried to ensure that only one or the other of the instances was created by adding a startup check. Unfortunately, there's an external shared library that refers directly to the lms celery, causing a startup failure in cms, so we had to revert it. Rather than waiting to fix that library, this commit collapses the two instances together so that there is only ever one.
EdX Release Notice: This PR has been deployed to the production environment. |
…25840) This should prevent the issues we've seen recently where cms modules are imported by the running lms process, resulting in two celery instances being created and tasks intermittently being registered to the wrong instance (and therefore effectively lost.) In commit ab6bf34/PR #25822 we tried to ensure that only one or the other of the instances was created by adding a startup check. Unfortunately, there's an external shared library that refers directly to the lms celery, causing a startup failure in cms, so we had to revert it. Rather than waiting to fix that library, this commit collapses the two instances together so that there is only ever one.
This change was reverted along with the downgrade of super-csv because it was believed that this change was causing issues with celery task registration. It was reverted in https://github.com/edx/edx-platform/pull/25762 The fix turned out to be something different and this and the super-csv library should both be same to upgrade. The library upgrade will come in a follow-on make upgrade with some other constraints that also need to be lifted. Actual fix: https://github.com/edx/edx-platform/pull/25822
This should no longer be necessary now that we have found a root cause fix for the issue. This was removed because it was causing the case of celery tasks not getting registered to happen more often. Undoing https://github.com/edx/edx-platform/pull/25787 Actual Fixes: - https://github.com/edx/edx-platform/pull/25822 - https://github.com/edx/edx-platform/pull/25825
…25840) This should prevent the issues we've seen recently where cms modules are imported by the running lms process, resulting in two celery instances being created and tasks intermittently being registered to the wrong instance (and therefore effectively lost.) In commit ab6bf34/PR #25822 we tried to ensure that only one or the other of the instances was created by adding a startup check. Unfortunately, there's an external shared library that refers directly to the lms celery, causing a startup failure in cms, so we had to revert it. Rather than waiting to fix that library, this commit collapses the two instances together so that there is only ever one. (cherry picked from commit 0c57a02)
Ticket: BOM-2086
Currently there are parts of the LMS that import content from the CMS
APP and vice-versa. When this happens, we end up with 2 instances of
the celery app and some tasks get registered to the wrong one. The
tasks that were getting registered to the wrong one are never able to
run and result in lots of production errors on celery workers.
The timing of the CMS celery app instantiation is non deterministic
so different tasks get lost depending on when it's imported by some
code in the LMS.
As long as SERVICE_VARIANT is set, this code should prevent the
instantiation of both celery apps.