Celery Beat does not dispatch tasks properly after a process restart when `TIME_ZONE` setting is not `UTC` and `USE_TZ=False` #798

wencakisa · 2024-08-28T08:58:31Z

Summary:

We've had a production issue: whenever we deploy new code (which restarts all processes, including beat), the scheduled periodic tasks are not dispatched for exactly 1 hour after the restart. After that, they run on schedule with no further delays (until the next deploy, unfortunately).

Celery Version: 5.4.0
Celery-Beat Version: 2.7.0

Exact steps to reproduce the issue:

Set USE_TZ=False in your Django settings
Change the time zone configuration to use Europe/London

Detailed information

We found the exact root cause of this and it is a complex combination of:

The Django timezone settings
The last_run_at field of the PeriodicTask model
The Celery code that determines whether the task "is before the last run"

So, we have the following Django settings:

USE_TZ = False
TIME_ZONE = "Europe/London"
CELERY_TIME_ZONE = TIME_ZONE  # "Europe/London"

This leads to the following:

Datetime objects that are passed within the Django app are timezone-naive
The datetime objects are stored in the DB in the London timezone

Something to note here: London and UTC normally share the same offset, but because of DST London is currently one hour ahead of UTC.
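
To make the offset concrete, here is a small standard-library check (it uses `zoneinfo`, not anything from Celery or Django). The two dates are only illustrative: one is the day this issue was filed, the other an arbitrary winter day.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

london = ZoneInfo("Europe/London")

# A summer date (DST/BST in effect) and a winter date (GMT).
summer = datetime(2024, 8, 28, 12, 0, tzinfo=london)
winter = datetime(2024, 1, 15, 12, 0, tzinfo=london)

print(summer.utcoffset())  # 1:00:00
print(winter.utcoffset())  # 0:00:00
```

This would also explain why the delay only shows up for part of the year.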

I have a periodic task that runs every minute. When I start it for the first time, everything goes as expected and the task is dispatched every minute.

However, if I kill the beat process and start it again, the task is not dispatched until exactly 1 hour and 1 minute later.

We found that this happens because the last_run_at field of the PeriodicTask objects is saved as timezone-naive (which is expected, because USE_TZ is set to False). When checking the last run time, the beat process does not interpret this value in the London timezone; it converts it to UTC instead:

Here's the exact code that does that (https://github.com/celery/celery/blob/f3a2cf45a69b443cac6c79a5c85583c8bd91b0a3/celery/schedules.py#L470-L473):

def is_before_last_run(year, month, day):
    return self.maybe_make_aware(datetime(year, month, day)) < last_run_at

where maybe_make_aware makes the passed datetime object timezone-aware, but defaults to UTC, rather than the specified timezone, which leads to the issue (https://github.com/celery/celery/blob/f3a2cf45a69b443cac6c79a5c85583c8bd91b0a3/celery/utils/time.py#L308):

def maybe_make_aware(dt, tz=None):
    """Convert dt to aware datetime, do nothing if dt is already aware."""
    if is_naive(dt):
        dt = to_utc(dt)
        return localize(
            dt, timezone.utc if tz is None else timezone.tz_or_local(tz),
        )
    return dt
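
The effect can be reproduced without Celery. The sketch below is not the library's code path; it only mimics the behaviour described above (a naive London wall-clock value being tagged as UTC) using the standard library. The timestamps are made up for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

london = ZoneInfo("Europe/London")

# Suppose beat really ran the task at 09:58 UTC (10:58 London wall-clock time).
actual_run = datetime(2024, 8, 28, 9, 58, tzinfo=timezone.utc)

# With USE_TZ=False the value is stored naive, in London local time.
stored_naive = actual_run.astimezone(london).replace(tzinfo=None)

# Tagging the naive value as UTC shifts the instant.
misread = stored_naive.replace(tzinfo=timezone.utc)

# Tagging it as London time recovers the real instant.
correct = stored_naive.replace(tzinfo=london)

skew = misread - actual_run
print(skew)                   # 1:00:00
print(correct == actual_run)  # True
```

Under this reading, last_run_at appears to lie one hour in the future after a restart, which matches the observed delay of 1 hour plus one schedule interval.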

This is most probably the root cause for all of these:

The same issue presents itself in the opposite way when you use a timezone that is behind UTC, for example America/New_York (which is currently 4 hours behind UTC).

If you do that, the task is dispatched immediately after the process starts, even though it is scheduled to run every minute. That makes sense, because the same comparison functions are executed, but against UTC.
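
Both directions can be compared side by side. `apparent_skew` below is a hypothetical helper written for this illustration (it is not part of Celery or django-celery-beat); it measures how far a naive local timestamp drifts when it is re-read as UTC.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def apparent_skew(tz_name: str, run_at_utc: datetime) -> timedelta:
    """Drift of a naive local timestamp when it is re-read as UTC."""
    # What gets stored with USE_TZ=False: local wall-clock time, no tzinfo.
    local_naive = run_at_utc.astimezone(ZoneInfo(tz_name)).replace(tzinfo=None)
    # What the comparison sees if the naive value is tagged as UTC.
    return local_naive.replace(tzinfo=timezone.utc) - run_at_utc


run_at = datetime(2024, 8, 28, 9, 58, tzinfo=timezone.utc)

print(apparent_skew("Europe/London", run_at).total_seconds() / 3600)     # 1.0
print(apparent_skew("America/New_York", run_at).total_seconds() / 3600)  # -4.0
```

A positive skew makes the last run look like it is in the future (tasks are held back); a negative skew makes it look long overdue (tasks fire right away).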

The workaround we found for our case is to reset last_run_at to None each time we do a new deployment. This way there is no past datetime to compare against, so the tasks begin executing as normal. After that, further scheduled executions are correct.

PeriodicTask.objects.update(last_run_at=None)

This is actually what the documentation suggests after timezone configuration changes. In our case, though, we did not change the configuration at all; it has been the same from the start. We need this to trick the Beat process into treating each task as if it had never run before, so that the tasks are dispatched as expected.

If you use USE_TZ=True and TIME_ZONE="UTC" - you won't have this issue.
However, changing our settings to these ☝🏻 default values is impossible at the moment, so I think this deserves careful consideration and possibly a fix.

The fix itself should be relatively easy: when comparing a datetime with last_run_at, respect the configured timezone and make the passed object aware in that timezone, not strictly UTC.
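
A rough sketch of that idea, using only the standard library. `make_aware_in` is a hypothetical name, not an existing Celery function, and a real patch would need to fit into how Celery resolves its configured timezone; this only shows the intended semantics.

```python
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo


def make_aware_in(dt: datetime, tz_name: Optional[str] = None) -> datetime:
    """Hypothetical helper: attach the configured zone to a naive datetime.

    Aware values pass through unchanged; UTC is used only when no zone
    is configured.
    """
    if dt.tzinfo is not None and dt.utcoffset() is not None:
        return dt
    tz = ZoneInfo(tz_name) if tz_name else timezone.utc
    return dt.replace(tzinfo=tz)


# A naive London wall-clock value, as stored with USE_TZ=False.
stored = datetime(2024, 8, 28, 10, 58)
last_run_at = make_aware_in(stored, "Europe/London")

print(last_run_at.astimezone(timezone.utc))  # 2024-08-28 09:58:00+00:00
```

With the zone attached this way, the stored value maps back to the instant at which the task actually ran, so the comparison against the current time would no longer be off by the UTC offset.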


ChanXing2023 · 2024-09-07T13:50:03Z

USE_TZ=True

ThankCat · 2024-09-08T01:05:02Z

I don't understand why developers ignore such questions

Nusnus · 2024-09-08T05:56:43Z

> I don't understand why developers ignore such questions

We do not ignore anything my friend.
We just have other priorities.

Contributing a possible solution is a wonderful method to get more attention/prioritization.

ChanXing2023 · 2024-09-10T03:36:10Z

You can learn from this
#801
