Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Queue retention #16

Closed
RealOrangeOne opened this issue Jun 4, 2024 · 5 comments · Fixed by #60
Closed

Queue retention #16

RealOrangeOne opened this issue Jun 4, 2024 · 5 comments · Fixed by #60
Assignees
Labels
database-backend Issues relating to the database backend
Milestone

Comments

@RealOrangeOne
Copy link
Owner

The task queue doesn't have the ability to remove old entries. This means the table can grow considerably, impacting performance.

"NEW" and "PROCESSING" tasks should never be removed! I suspect retention for complete and failed tasks should be configurable separately (even if failed inherits completed by default).

@RealOrangeOne RealOrangeOne added this to the 1.0 milestone Jun 4, 2024
@jribbens
Copy link

jribbens commented Jun 5, 2024

Removing expired entries from a table is a classic case of something that requires a background worker. Possibly you could look into seeing if there is some sort of background worker system available for Django to carry out this task?

@mgax
Copy link
Contributor

mgax commented Jul 18, 2024

I think an easy solution is to create a management command that removes completed tasks older than a certain interval, sort of like Django's clearsessions command.

./manage.py clear_db_tasks --age=30d

Without an age argument, it might remove all completed tasks.

@RealOrangeOne
Copy link
Owner Author

My current draft implementation does exactly that: #60.

There is however a part of me which wants to add it to the existing worker process. That way, it becomes easier to run other maintenance tasks in that process too. Otherwise clearing, retries, and other maintenance might be other commands. It'd also keep configuration simpler, as it could live in settings.

@mgax
Copy link
Contributor

mgax commented Jul 19, 2024

I guess the goal is to make things easier for new users and not have them run multiple commands? I think it would be more clear if db_worker did just one thing. It's not uncommon for people to set up their production containers to run migrations and then start the web worker, but it's something people decide for themselves, Django won't do that automatically.

A couple of concerns:

  • If multiple workers are running, only one should be doing maintenance tasks, otherwise you might end up with duplicated retries. So I guess you'd enable the extra functionality explicitly, e.g. ./manage.py db_worker --gc.
  • It should still be possible to run maintenance tasks without starting a worker (either because of a deployment decision to keep the maintenance workload separate, or during development, or when fixing an outage).

@RealOrangeOne
Copy link
Owner Author

It should still be possible to run maintenance tasks without starting a worker

This is a good point. I guess at the point that that's a requirement, it probably makes sense to just keep it an external command.

When there's cron (#8 ), perhaps we can add the ability to automatically run it as part of that, but otherwise best keep it simpler for now!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
database-backend Issues relating to the database backend
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants