Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cron scheduled methods may be executed a fraction of a second too early #29735

Closed
thuri opened this issue Dec 23, 2022 · 16 comments
Closed

Cron scheduled methods may be executed a fraction of a second too early #29735

thuri opened this issue Dec 23, 2022 · 16 comments
Labels
status: declined A suggestion or change that we don't feel we should currently apply

Comments

@thuri
Copy link

thuri commented Dec 23, 2022

Affects: 6.0.3


Hi Spring Team,

i was just updating a project of mine from spring-boot 2.7.6 to spring-boot 3.0.0 and may have found an issue with cron scheduled beans. I'm quite sure that this has something to do with spring-context and not Spring Boot itself, so i decided to create this issue here.

When the application ran with 2.7.6 the cron expression "0 */1 * * * *" executed the annotated method always some faction of a second AFTER second 0 of each minute.

With 3.0.1 the job may be executed "around" the second 0 of each minute. That means, that a job may be executed at second 59 + some fraction of a second.

See the following log output:

2022-12-23T14:05:59.999+01:00 ERROR 5262 --- [   scheduling-1] n.l.spring6.scheduled.cron.CronJob       : now=2022-12-23T13:05:59.999691839Z now.truncated=2022-12-23T13:05:59Z

In most cases that may not be a problem but in my project i use the current time (java.time.Instant.now) to calculate the next iteration with CronExpression#nextwhich will result in the timestamp of the next minute where the second is 0. This means that the result of next is actually the timestamp on which the current execution should have been started.

I did create a small demo project to show you what I mean and to reproduce the problem if possible: https://github.com/thuri/spring6-scheduled-cron

I had a look at CronExpression source code but didn't find a change that could be the cause of this. Maybe I'll have some time to have a deeper look into this.

Regards,
Michael

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Dec 23, 2022
@simonbasle simonbasle self-assigned this Jan 3, 2023
@simonbasle
Copy link
Contributor

Hi @thuri, I've tried to debug and find the root cause of this issue but without success so far.
It is not as trivially reproduced as I first thought from that repro project.
I've tried to set up a different cron expression (*/5 * * * * *) for a higher chance at reproducing, but again to little success :(

I've also tried to run a test that uses CronTrigger with your cron expression and a randomized start time (seconds and milliseconds) to see if the nextExecutionTime would somehow be inconsistent/too early but it seems to be doing the right thing, so it is unlikely the CronTrigger is the culprit.

I've suspected at some point that the part of org.springframework.scheduling.concurrent.ReschedulingRunnable#schedule which computes an initialDelay Duration was computing a negative duration but haven't been able to observe that.

Just had 1 error with the following logs and the original cron expression from your CronJob, which seems to rather disprove this theory:

59975ms aka 2023-01-06T16:34:00.003298Z vs scheduledExecutionTime = 2023-01-06T16:34:00Z
2023-01-06T17:33:59.994+01:00 ERROR 35409 --- [   scheduling-1] com.example.sprtest.CronJob              : Unexpected seconds 1673022839, now=2023-01-06T16:33:59.989957Z now.truncated=2023-01-06T16:33:59Z

What we see above are a debugger log from the ReschedulingRunnable#schedule method and a second line of classic log from the CronJob.

So: the runnable does compute a delay of 59975ms, which should result in an execution time at 16:34:00.003298, which is congruent with the CronTrigger-computed scheduledExecutionTime of 16:34:00.0. In a nutshell, the ReschedulingRunnable seems to be doing a good job.

So all we're left with is the executor.schedule(...) call 🤔
I wonder if something in Java 17 schedulers has changed that could lead to these "fraction of a second too early" cases...

@thuri
Copy link
Author

thuri commented Jan 7, 2023

Hi @simonbasle , thank you for having a look into this.
Not sure how this would be possible but maybe it's an issue on my machine and setup

  • ubuntu desktop 22.04.1,
  • Core i3
  • OpenJDK 64-Bit Server VM (build 17.0.5+8-Ubuntu-2ubuntu122.04, mixed mode, sharing)
    I'll try another system as soon as possible whether it can be reproduced

But what stands against that is, that the code works when downgraded to spring boot 2.7.6 and run with the same setup (see README in example project). So I'm not sure whether this could be a Java 17 issue.

@thuri
Copy link
Author

thuri commented Jan 8, 2023

I ran the example app on another system and had the same results.

Ubuntu 20.04.5 LTS
OpenJDK 64-Bit Server VM (build 17.0.5+8-Ubuntu-2ubuntu120.04, mixed mode, sharing)

I also switched to a Temerium JVM also with the same issue
OpenJDK 64-Bit Server VM Temurin-17.0.5+8 (build 17.0.5+8, mixed mode, sharing)

@simonbasle
Copy link
Contributor

simonbasle commented Jan 9, 2023

Clock drifting / rolling back is still the prime suspect in my opinion. Instant.now() doesn't use a monotonic clock, but deals with wall clock time (System.currentTimeMillis()). I don't know why it occurs more prominently under Boot 3, but that's the best explanation I have given the fact that the rare times where I observed the CronJob code reporting an "early" Instant.now(), both the CronTrigger-generated Instant and the code in ReschedulingRunnable#schedule where generating congruent Instant at or after the desired instant. i.e. the Spring code was computing a delay in milliseconds after which to execute the CronJob that was correct, but the clock reading in the CronJob would see an earlier "now".

One way to definitely validate this hypothesis is for you to simplify the code as much as possible, eliminating all moving pieces (including Spring) and try to emulate the code in ReschedulingRunnable#schedule to see if a simple executor.schedule can also see an early time, in your environment. Note that it can only validate the hypothesis (we cannot draw any conclusion if the issue doesn't occur, as it doesn't prove it cannot occur).

Nevertheless, I would advise you to try and avoid recomputing your own "next time" in the production code. It seems you're replicating the job of Scheduled/CronTrigger and that clock inconsistency is the kind of trouble that can arise from doing that.
If you can change your production code to use a computed sequence number or use a monotonic "instant" (see System.nanoTime()) that would be ideal... But I understand that your case is of course more complicated than your current reproducer.

(edit: clarify I'm asking the executor.schedule experiment/second reproducer to occur in your environment)

@simonbasle simonbasle added status: waiting-for-feedback We need additional information before we can continue and removed status: waiting-for-triage An issue we've not yet triaged or decided on labels Jan 9, 2023
@simonbasle
Copy link
Contributor

simonbasle commented Jan 10, 2023

this is the test I came up with that reproduces the code ultimately executed in ReschedulingRunnable:

public static void main(String[] args) throws ExecutionException, InterruptedException {
		CronTrigger trigger = new CronTrigger("*/5 * * * * * ");
		ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
		SimpleTriggerContext triggerContext = new SimpleTriggerContext(Clock.systemUTC());
		final List<Instant> instants = new ArrayList<>();
		Callable<Instant> captureNow = () -> {
			Instant now = triggerContext.getClock().instant();
			instants.add(now);
			return now;
		};

		for (int i = 0; i < 50; i++) {

			Instant scheduledExecutionTime = trigger.nextExecution(triggerContext);
			if (scheduledExecutionTime == null) {
				throw new NullPointerException("stopped by null");
			}

			Instant delayBaseNow = triggerContext.getClock().instant();
			Duration delayDuration = Duration.between(delayBaseNow, scheduledExecutionTime);
			System.out.printf("#%d Scheduling cron job at %s\n", i, scheduledExecutionTime);

			if (delayDuration.isNegative()) {
				throw new IllegalStateException("between now and scheduledExecutionTime, duration is negative (" + delayDuration + ")\n");
			}
			else {
				System.out.printf("#%d Expected to run job at %s (duration of %s)\n\n", i, delayBaseNow.plus(delayDuration), delayDuration.toMillis());
			}
			long initialDelay = delayDuration.toMillis();

			var currentFuture = executor.schedule(captureNow, initialDelay, TimeUnit.MILLISECONDS);
			Instant executionTime = currentFuture.get();

			triggerContext.update(scheduledExecutionTime, executionTime, executionTime);

			if (executionTime.isBefore(scheduledExecutionTime)) {
				throw new IllegalStateException("Clock drift in round " + i + " at " + executionTime + " which was scheduled at " + scheduledExecutionTime);
			}
		}

		System.exit(0);
	}

Again, on my machine, it doesn't fail that often. But I just had a failure which proves that the issue occurs at lower levels than Spring, ie. clock drift.

Exception in thread "main" java.lang.IllegalStateException: Clock drift in round 10 at 2023-01-10T15:06:39.999832Z which was scheduled at 2023-01-10T15:06:40Z
	at com.example.sprtest.CronJob.main(CronJob.java:81)

(note that I implemented the reproducer as a main in my local version of your original CronJob for convenience, but that it doesn't actually use any Spring infrastructure, just the CronTrigger and SimpleTriggerContext classes)

@sigfridobarb
Copy link

Hi, I have seen the same problem on my project with spring boot 3.0.1, so as thuri I went back to spring boot 2.7.6.
In my project I save a record every 10 second (cron expression "0/10 * * * * *"), so I checked if the issue ever showed up in the past and I found out that from May 2021 to December 2022 it happened (only) 80 times with spring boot 2.x and java 15.
My (small) project run in a docker container over Centos 7.
All this just to say that this problem exist before spring boot 3.x and java 17.

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Jan 14, 2023
@simonbasle
Copy link
Contributor

thanks for this data point @sigfridobarb. In light of all of that I'm closing the issue.

If anybody is able to dig into the root cause at the JDK or OS level and that knowledge brings an avenue of improvement to light, feel free to submit a PR.

@simonbasle simonbasle closed this as not planned Won't fix, can't repro, duplicate, stale Jan 17, 2023
@simonbasle simonbasle added status: declined A suggestion or change that we don't feel we should currently apply and removed status: feedback-provided Feedback has been provided labels Jan 17, 2023
@thuri
Copy link
Author

thuri commented Jan 21, 2023

Thanks for investigating this thing. The closing is ok for me as I already have a workaround for my issue.

Perhaps it would be a nice feature if metadata about the scheduling could be injected into the method. That metadata could perhaps contain the last and the next (planned) executionTime.

@simonbasle
Copy link
Contributor

@thuri I've looked a bit at this idea of injecting execution context, but the code is a bit too far removed for this too be practical. Also, only the cron parameter is associated with a Trigger and TriggerContext, so it would be even more work to have a consistent experience depending on the @Scheduled parameters

@simonbasle simonbasle removed their assignment Jan 24, 2023
@vuzem
Copy link

vuzem commented Jun 2, 2023

Hi @simonbasle,

I'm experiencing the same issues with Spring Boot 3 now.

I have a task which gets triggered by a CronTrigger every minute, but now it occurs quite regularly to be triggered twice within the very same minute.

For me it looks like a Spring Boot 3/Spring Framework 6 issue, because when I downgrade to Spring Boot 2 again (also running with Java 17), everything is working fine.

Cheers
Alex

@quaff
Copy link
Contributor

quaff commented Jul 3, 2023

ScheduledExecutorService is not realtime, normally it will delay some milliseconds, and it will ahead 1 millisecond occasionally.
Here is my test code:

import java.time.Duration;
import java.time.LocalDateTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class Test {

	private static ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();

	public static void main(String[] args) {
		schedule();
	}

	public static void schedule() {
		LocalDateTime now = LocalDateTime.now();
		LocalDateTime scheduled = now.plus(Duration.ofSeconds(1));
		executor.schedule(() -> {
			long delay = Duration.between(scheduled, LocalDateTime.now()).toMillis();
			System.out.println("delta: " + delay + "ms");
			if (delay < 0) {
				System.exit(1);
			}
			schedule();
		}, Duration.between(now, scheduled).toNanos(), TimeUnit.NANOSECONDS);
	}

}

test output like this:

delay: 6ms
delay: 0ms
delay: 1ms
delay: 0ms
delay: 4ms
delay: 3ms
delay: 4ms

3.2 GHz Intel Core i7 iMac 2019 OpenJDK Runtime Environment Temurin-17.0.5+8 (build 17.0.5+8)

@mutatrum
Copy link

mutatrum commented Jul 6, 2023

The testcode from @simonbasle throws an exception on every run for me. However, if I change it to use nanos instead of millis, it runs fine. So I'm not sure if this should be kicked up to JVM/OS level.

I have a process with a cron on second 0, and that triggers early roughly 50% of the time.

I'm running OpenJDK 19.0.2 on Ubuntu 22.04.02.

@jhoeller
Copy link
Contributor

jhoeller commented Jul 7, 2023

@vuzem note that we are using nanosecond precision for scheduling as of 6.0.10 which should address the regression from 5.x: #30666

@Somnium7
Copy link

We have same cron issue with Spring Boot 2.1.18.

@bclozel
Copy link
Member

bclozel commented Jul 10, 2023

@Somnium7 it can't be the same issue since this is about Spring Boot 3.0+. You are using an unsupported Spring Boot version with unpatched CVEs, please upgrade to a supported Spring Boot version as soon as possible.

@Somnium7
Copy link

@bclozel Well, cause of the issue may be different, I know nothing about Spring Boot internals, but cron jobs definitely still could be executed earlier than they should even on old versions.

Thanks for version reminder though,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: declined A suggestion or change that we don't feel we should currently apply
Projects
None yet
Development

No branches or pull requests

10 participants