You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Summary: I found a solution to run only periodic tasks (Beat) which are present in Django Project. All renamed or removed tasks will be deactivated in DB used by Beat.
We got trouble with tasks triggered by Beat. After changing the name of a task the 'old' task entry in DB also was running. Even with removing the task definitions from code all already activated periodic tasks were running (no adjustment with current tasks present in Django Project).
One workaround was to delete all entries used by Beat in DB between shut down and start up of Beat. But I though there must be a better way to solve this.
Environment / Packages
Celery Version: 5.4.0
Celery-Beat Version: 2.6.0
Django-Version: 5.0.6
Python-Version: 3.12.0
OS: Linux (Ubuntu) - but same behave on Windows Platform.
Exact steps to reproduce the issue:
Already initialized Django Project with Celery (celery[redis]).
add 'django-celery-results' to the project.
add 'django-celery-beat' to the project.
add both to the installed apps in settings.
in settings: CELERY_BEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler' to use the DB as storage for Beat.
migrate.
other settings as described in Celery Documentation.
create tasks.
run worker.
run beat.
stops all.
change the names of the tasks.
start worker and beat again.
'old' task entries and 'new' (changed) task entries are running.
Detailed information
I debugged with 'inspect' to understand the flow when Beat is running.
After some minutes I found the solution (... I am joking - after many hours).
My solution is to disable all periodic tasks in DB used by Beat.
After this I enable each task which was found in the Django project.
In DB I have enabled (and currently present tasks from Project) and also
deactivated not running tasks.
I made short tests - in future I will go deeper.
After using CELERY_BEAT_SCHEDULER = 'celery.beat:PersistentScheduler' I recognized the same wrong behave. But I do not use file based storage (maybe I fix this in future - should be the same way).
Here is my change in the original django_celery_beat.schedulers.DatabaseScheduler after debugging (for a useful solution step to the next point):
classDatabaseScheduler(Scheduler):
.
.
.
defsetup_schedule(self):
self.disable_all_periodic_tasks() # newself.install_default_entries(self.schedule)
self.update_from_dict(self.app.conf.beat_schedule)
defdisable_all_periodic_tasks(self): # new method"""disable all periodic tasks to be sure that tasks removed from Django code are deactivated in DB after Beat restart. But disabled tasks are still present and reachable in DB (not deleted)."""current_periodic_tasks=PeriodicTask.objects.all()
forperiodic_taskincurrent_periodic_tasks:
periodic_task.enabled=Falseperiodic_task.save()
.
.
.
defupdate_from_dict(self, mapping):
s= {}
forname, entry_fieldsinmapping.items():
print(f"{name=}{entry_fields=}")
# enable each task found in code of Django.# one exception: "enabled" key is already present # in 'entry_fields' - this means an explicit setting.if"enabled"notinentry_fields: # newentry_fields["enabled"] =True# newtry:
entry=self.Entry.from_entry(name,
app=self.app,
**entry_fields)
ifentry.model.enabled:
s[name] =entryexceptExceptionasexc:
logger.exception(ADD_ENTRY_ERROR, name, exc, entry_fields)
self.schedule.update(s)
.
.
.
Useful Solution
I inherited from the original DatabaseScheduler and overwrote two methods (from above) and added the third. After this I changed my scheduler to use in settings.py to CELERY_BEAT_SCHEDULER = 'myapp.own_scheduler:OwnScheduler'.
project/myapp/own_scheduler.py
fromdjango_celery_beat.schedulersimport (
DatabaseScheduler,
ADD_ENTRY_ERROR,
PeriodicTask,
logger,
)
classOwnScheduler(DatabaseScheduler):
defsetup_schedule(self):
self.disable_all_periodic_tasks()
self.install_default_entries(self.schedule)
self.update_from_dict(self.app.conf.beat_schedule)
defdisable_all_periodic_tasks(self):
"""disable all periodic tasks to be sure that tasks removed from Django code are deactivated in DB after Beat restart. But disabled tasks are still present and reachable in DB (not deleted)."""current_periodic_tasks=PeriodicTask.objects.all()
forperiodic_taskincurrent_periodic_tasks:
periodic_task.enabled=Falseperiodic_task.save()
defupdate_from_dict(self, mapping):
s= {}
forname, entry_fieldsinmapping.items():
# enable each task found in code of Django.# one exception: "enabled" key is already present # in 'entry_fields' - this means an explicit setting.if"enabled"notinentry_fields:
entry_fields["enabled"] =Truetry:
entry=self.Entry.from_entry(name,
app=self.app,
**entry_fields)
ifentry.model.enabled:
s[name] =entryexceptExceptionasexc:
logger.exception(ADD_ENTRY_ERROR, name, exc, entry_fields)
self.schedule.update(s)
Have fun. mrkf
The text was updated successfully, but these errors were encountered:
This is really basic functionality -- if the records in django_celery_beat_periodictask (the DB) don't match what's in beat_schedule (application code), this library is broken =(
On a more productive note, a straightforward way to fix this would be to delete and recreate all the schedule records on startup. For syncing state, delete/recreate is in general slower but easier than diffing. For most use cases, with e.g. ~5-50 scheduled tasks, it won't seriously affect startup time, especially if records are batch inserted
This could be configurable behavior. But yeah, we just ran into this same bug -- we removed a task from the beat schedule, redeployed and restarted beat. Beat kept running the (non-existent) task, raising NotRegistered every time
My interesting was to understand the inner structure of Celery + Beat and the question: did they really forget to implement a way to deactivate (or remove) periodic tasks after a change in code?
The second point was (and is): I want to hold my entries in DB for analysis (also triggered by a desire of my mentor).
And this is the different. I think it is case dependent.
Summary: I found a solution to run only periodic tasks (Beat) which are present in Django Project. All renamed or removed tasks will be deactivated in DB used by Beat.
We got trouble with tasks triggered by Beat. After changing the name of a task the 'old' task entry in DB also was running. Even with removing the task definitions from code all already activated periodic tasks were running (no adjustment with current tasks present in Django Project).
One workaround was to delete all entries used by Beat in DB between shut down and start up of Beat. But I though there must be a better way to solve this.
Environment / Packages
Exact steps to reproduce the issue:
celery[redis]
).Detailed information
I debugged with 'inspect' to understand the flow when Beat is running.
After some minutes I found the solution (... I am joking - after many hours).
My solution is to disable all periodic tasks in DB used by Beat.
After this I enable each task which was found in the Django project.
In DB I have enabled (and currently present tasks from Project) and also
deactivated not running tasks.
I made short tests - in future I will go deeper.
After using CELERY_BEAT_SCHEDULER = 'celery.beat:PersistentScheduler' I recognized the same wrong behave. But I do not use file based storage (maybe I fix this in future - should be the same way).
Here is my change in the original
django_celery_beat.schedulers.DatabaseScheduler
after debugging (for a useful solution step to the next point):lib/python3.12/site-packages/django_celery_beat/schedulers.py
Useful Solution
I inherited from the original
DatabaseScheduler
and overwrote two methods (from above) and added the third. After this I changed my scheduler to use insettings.py
toCELERY_BEAT_SCHEDULER = 'myapp.own_scheduler:OwnScheduler'
.project/myapp/own_scheduler.py
Have fun. mrkf
The text was updated successfully, but these errors were encountered: