Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Geospatial Celery Queue #6430

Merged
merged 3 commits into from
Nov 6, 2024
Merged

Conversation

zandre-eng
Copy link
Contributor

Jira ticket here.

A new Celery queue for the Geospatial feature has been added. Currently, there is only one task planned to be run on this queue, which is related to this PR. This task was originally set to run as a serial task on the background_queue, however it takes very long to process a lot of cases. Given how long it takes, and that it is a serial task, any domains that are queued up for the task are highly likely to fail from getting locked out of the task for too long.

Adding a new Celery queue will resolve the issue mentioned above by allowing the Geospatial Celery task to be run in parallel without the risk of blocking up something like the background_queue. A concurrency of 1 has been chosen since we do not expect more than a few of the Celery tasks to be fired off at any given time. Additionally, this is a task that we expect to be run quite infrequently since it is only run when the Geospatial feature is enabled for a domain.

Environments Affected

This touches all environments.

Copy link
Contributor

@mkangia mkangia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

I appreciate the thoughtful clubbing of queue for different environments.

Good to wait for a go ahead from @gherceg

src/commcare_cloud/environment/schemas/app_processes.py Outdated Show resolved Hide resolved
Copy link
Contributor

@gherceg gherceg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good way to artificially limit the load these migrations can incur on the system at any given time. I do still wonder if it is worth chunking up the celery tasks that run on this queue to avoid long running tasks since that is still a general good practice. Though I recognize that is not in the scope of this PR.

@zandre-eng
Copy link
Contributor Author

I do still wonder if it is worth chunking up the celery tasks that run on this queue to avoid long running tasks since that is still a general good practice.

I completely agree here. I left this change out for now as it will increase the complexity of the current solution which I agree is a bit out of scope for this PR. But this is definitely a great next improvement to make in followup work!

@zandre-eng zandre-eng merged commit 273e7d5 into master Nov 6, 2024
3 checks passed
@zandre-eng zandre-eng deleted the ze/add-geospatial-celery-queue branch November 6, 2024 08:02
@zandre-eng zandre-eng mentioned this pull request Nov 8, 2024
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants