Does Quartz's @DisallowConcurrentExecution work with multiple nodes? #761

Closed
joaoportela opened this issue Jan 20, 2022 · 3 comments


joaoportela commented Jan 20, 2022

@DisallowConcurrentExecution documentation doesn't clarify whether or not this works across a cluster.

I found some comments online that indicate that it only works on a per-node basis.

But looking at other sources, it seems that it is expected that it works across cluster nodes.

I'm inclined to believe that it should work in cluster mode, but I would appreciate a final answer on this:

If I have multiple triggers for the same job (same job key), will the @DisallowConcurrentExecution annotation ensure that only one instance is running at a time in the entire cluster?

GaoForGot commented Feb 14, 2022

I ran into a related problem. Here is my scenario:
I set up one job (one JobDetail) annotated with @DisallowConcurrentExecution, running on a cluster with two nodes. The job is triggered by a CronTrigger with the cron expression (30/30 * * * * ? *). The cluster uses the JDBC job store (jdbcjobstore). acquireTriggersWithinLock is set to true.
I expected the two nodes to trigger, run, and finish the job without any concurrent overlap. However, the executions occasionally overlap like this:
node_1 start: 2022-01-25 14:09:25 end: 2022-01-25 14:19:25
node_2 start: 2022-01-25 14:10:00 end: 2022-01-25 14:10:18
node_1's firing was actually misfire handling (the "fire once now" policy, by default). It seems that node_2 didn't check the status of the trigger before it acquired and executed it.
The concurrent runs of this job have caused a lot of unexpected problems. It has been troubling me for a month (I searched everywhere, read the code, and tested it myself, but still have no clue; I'm not that good T-T). Since joaoportela's question is closely related to mine, I am posting it here in the hope of getting some help. Please let me know if I have left out any information that would help solve the problem.
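
For anyone who wants to check their own logs for the same symptom, here is a minimal sketch of an overlap test applied to the two executions reported above. The class and method names (ExecutionOverlap, overlaps) are made up for illustration and are not part of Quartz; only java.time from the JDK is used.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: checks whether two logged job executions overlapped. */
class ExecutionOverlap {
    // Matches the timestamp layout used in the log lines above.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    /** Returns true when the intervals [startA, endA) and [startB, endB) intersect. */
    static boolean overlaps(String startA, String endA, String startB, String endB) {
        LocalDateTime a0 = LocalDateTime.parse(startA, FORMAT);
        LocalDateTime a1 = LocalDateTime.parse(endA, FORMAT);
        LocalDateTime b0 = LocalDateTime.parse(startB, FORMAT);
        LocalDateTime b1 = LocalDateTime.parse(endB, FORMAT);
        // Two half-open intervals intersect when each one starts before the other ends.
        return a0.isBefore(b1) && b0.isBefore(a1);
    }

    public static void main(String[] args) {
        boolean overlap = overlaps(
                "2022-01-25 14:09:25", "2022-01-25 14:19:25",   // node_1
                "2022-01-25 14:10:00", "2022-01-25 14:10:18");  // node_2
        System.out.println("overlap = " + overlap); // prints "overlap = true"
    }
}
```

With the timestamps above, node_2's whole run falls inside node_1's run, so the check reports an overlap.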

@jhouserizer
Contributor

Yes, it works with multiple nodes. It does so by marking all triggers of the job as "blocked" until execution finishes. It should be impossible for this not to happen if locking is enabled; see the block of code that starts with this line:

if (job.isConcurrentExectionDisallowed()) {

If concurrent executions are observed, that would be a bug, and it needs a replication test case and a filed issue.
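
To make the "blocked" idea concrete, here is a toy in-memory model of it. It is only a sketch under simplifying assumptions: the class BlockingModel, its methods, and the trigger names are all hypothetical, the synchronized keyword stands in for the cluster-wide lock, and none of this is the actual Quartz code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of trigger blocking for one job; NOT Quartz's implementation. */
class BlockingModel {
    enum State { WAITING, BLOCKED }

    // Trigger name -> state. All triggers here belong to the same job.
    private final Map<String, State> triggers = new LinkedHashMap<>();

    synchronized void addTrigger(String name) {
        triggers.put(name, State.WAITING);
    }

    /** A node tries to fire a trigger; this succeeds only if the trigger is WAITING. */
    synchronized boolean tryFire(String name) {
        if (triggers.get(name) != State.WAITING) {
            return false; // unknown trigger, or the job is already running somewhere
        }
        // Block every trigger of the job, including the one that just fired.
        triggers.replaceAll((key, old) -> State.BLOCKED);
        return true;
    }

    /** Called when the execution finishes; the job's triggers become eligible again. */
    synchronized void jobCompleted() {
        triggers.replaceAll((key, old) -> State.WAITING);
    }

    public static void main(String[] args) {
        BlockingModel store = new BlockingModel();
        store.addTrigger("triggerA");
        store.addTrigger("triggerB");
        System.out.println(store.tryFire("triggerA")); // true: the job starts
        System.out.println(store.tryFire("triggerB")); // false: blocked while the job runs
        store.jobCompleted();
        System.out.println(store.tryFire("triggerB")); // true: the job has finished
    }
}
```

In this model a second trigger of the same job cannot fire until jobCompleted() runs, regardless of which node asks, because every check and update happens under the same lock.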

ckuehne commented Jul 22, 2022

@jhouserizer
I can see that the triggerFired() method you link to is executed in a TXLock (e.g. via executeInNonManagedTXLock()), but storeTrigger(), when called (transitively) from recoverMisfiredJobs(), is not executed in a TXLock.

Just a guess:
Could there be a race condition? triggerFired() calls updateTriggerStatesForJobFromOtherState for the triggers of a @DisallowConcurrentExecution job while recoverMisfiredJobs() -> storeTrigger() adds a new trigger for the same job, and both are executed concurrently on different machines.
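
To illustrate the kind of interleaving I have in mind, here is a deterministic toy model. Everything in it is an assumption made for illustration: the class MisfireRaceModel, the split of a store into a "decide state" step and an "insert" step, and the trigger names are hypothetical and do not claim to match what Quartz actually does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of the suspected race; it is not Quartz's implementation. */
class MisfireRaceModel {
    enum State { WAITING, BLOCKED }

    private final Map<String, State> triggers = new LinkedHashMap<>();
    private int running = 0;     // executions of the job currently in progress
    private int maxRunning = 0;  // highest concurrency observed

    /** First half of a store: pick the new trigger's state from what is visible now. */
    State decideState() {
        return running > 0 ? State.BLOCKED : State.WAITING;
    }

    /** Second half of a store: write the trigger with the state decided earlier. */
    void insert(String name, State state) {
        triggers.put(name, state);
    }

    /** Fires a WAITING trigger and blocks every trigger of the job that exists right now. */
    boolean triggerFired(String name) {
        if (triggers.get(name) != State.WAITING) {
            return false;
        }
        triggers.replaceAll((key, old) -> State.BLOCKED);
        running++;
        maxRunning = Math.max(maxRunning, running);
        return true;
    }

    /** The store on node 1 is interleaved with a firing on node 2. */
    static int unlockedInterleaving() {
        MisfireRaceModel store = new MisfireRaceModel();
        store.insert("cronTrigger", State.WAITING);
        State stale = store.decideState();       // node 1: job looks idle, so WAITING
        store.triggerFired("cronTrigger");       // node 2: fires and blocks known triggers
        store.insert("recoveryTrigger", stale);  // node 1: writes the stale WAITING state
        store.triggerFired("recoveryTrigger");   // a second execution starts
        return store.maxRunning;
    }

    /** Same steps, but the store runs as one unit after the firing. */
    static int lockedInterleaving() {
        MisfireRaceModel store = new MisfireRaceModel();
        store.insert("cronTrigger", State.WAITING);
        store.triggerFired("cronTrigger");
        store.insert("recoveryTrigger", store.decideState()); // sees the running job
        store.triggerFired("recoveryTrigger");                 // rejected: BLOCKED
        return store.maxRunning;
    }

    public static void main(String[] args) {
        System.out.println("unlocked: " + unlockedInterleaving()); // prints "unlocked: 2"
        System.out.println("locked: " + lockedInterleaving());     // prints "locked: 1"
    }
}
```

In the unlocked ordering the new trigger is written after the blocking update has already run, so it stays WAITING and a second execution can start. When the store cannot interleave with the firing, the model never exceeds one running execution.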
