Job Retry
When creating job objects you can configure the `timeout`, `retryMax`, and `retryDelay` options. These options determine what happens if a job fails, either because a Node.js process stops responding or because the job process itself fails. See the Job Options document for more detail.
Every job has a property called `dateEnable` which is used to determine if the job is ready for processing. The `dateEnable` value is set when the job is created and when it is retrieved from the database for processing. The retrieval query will not return jobs where the `dateEnable` value is greater than the current date time value.
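The retrieval rule above can be illustrated with a minimal in-memory sketch. The `isReadyForProcessing` and `retrieveReadyJobs` helpers are hypothetical and are not part of the library; they model only the `dateEnable` condition, not the actual database query or any other conditions it may apply.

```javascript
// Hypothetical stand-in for the retrieval query. It models only the
// dateEnable comparison described above.
function isReadyForProcessing(job, now = new Date()) {
  // A job is skipped while its dateEnable is still in the future.
  return job.dateEnable.getTime() <= now.getTime();
}

function retrieveReadyJobs(jobs, now = new Date()) {
  return jobs.filter((job) => isReadyForProcessing(job, now));
}

// Illustrative data: the ids and dates are made up for this example.
const now = new Date('2020-01-01T00:10:00Z');
const jobs = [
  { id: 'a', dateEnable: new Date('2020-01-01T00:00:00Z') }, // already enabled
  { id: 'b', dateEnable: new Date('2020-01-01T00:20:00Z') }, // enabled later
];

const readyIds = retrieveReadyJobs(jobs, now).map((job) => job.id);
console.log(readyIds); // [ 'a' ]
```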
Currently the formula used to set the `dateEnable` value during the job retrieval process is:
now() + job.timeout + ( job.retryDelay * job.retryCount )
The plan in the future is to move this to an exponential formula once RethinkDB has a power function.
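As you can see, to disable the retry delay and make jobs retry as soon as possible, simply set `retryDelay` to zero. The `timeout` term still applies, so a retrieved job is not eligible for another retrieval until its timeout has elapsed.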
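As a worked example of the retrieval formula, the sketch below computes the `dateEnable` offset for each retry count using the default values quoted later in this document (`timeout = 300000`, `retryDelay = 600000`). The `nextDateEnable` helper is a hypothetical name used for illustration only; it is not part of the library API.

```javascript
// Sketch of the documented formula:
//   now() + job.timeout + (job.retryDelay * job.retryCount)
// nextDateEnable is a hypothetical helper, not a library function.
function nextDateEnable(job, nowMs = Date.now()) {
  return new Date(nowMs + job.timeout + job.retryDelay * job.retryCount);
}

// A fixed "now" keeps the example deterministic.
const nowMs = Date.UTC(2020, 0, 1);
const counts = [0, 1, 2, 3];

// Offsets from "now" in milliseconds with the default retryDelay.
const offsets = counts.map(
  (retryCount) =>
    nextDateEnable({ timeout: 300000, retryDelay: 600000, retryCount }, nowMs).getTime() - nowMs
);
console.log(offsets); // [ 300000, 900000, 1500000, 2100000 ]

// With retryDelay set to zero only the timeout term remains.
const noDelayOffsets = counts.map(
  (retryCount) =>
    nextDateEnable({ timeout: 300000, retryDelay: 0, retryCount }, nowMs).getTime() - nowMs
);
console.log(noDelayOffsets); // [ 300000, 300000, 300000, 300000 ]
```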
Note: In the step through below I state that the queue review process re-activates the jobs after failure. This may not be the case with an active queue. If the Queue object is constantly working on jobs, then failed jobs will be retrieved again as soon as their `dateEnable` value has passed.
If we take the job default properties and the Queue Master default 'masterInterval' value, which is 310000 milliseconds, then the following sequence of events will occur:
- The job has never been processed and has default properties. It has been added to the queue.
  - `status = 'waiting'`
  - `timeout = 300000`
  - `retryCount = 0`
  - `retryMax = 3`
  - `retryDelay = 600000`
  - `dateEnable = dateCreated`
- The job is retrieved from the database, setting the `dateEnable` value.
  - `status = 'active'`
  - `timeout = 300000`
  - `retryCount = 0`
  - `retryMax = 3`
  - `retryDelay = 600000`
  - `dateEnable = now() + timeout`
- The job fails for some reason.
  - `status = 'failed'`
  - `timeout = 300000`
  - `retryCount = 1`
  - `retryMax = 3`
  - `retryDelay = 600000`
  - `dateEnable = now() + (retryDelay * retryCount)`
- The job remains inactive within the database until after `dateEnable`, or approximately 600000 milliseconds.
- The Queue Master database review is initiated and the job is retrieved from the database for the first retry.
  - `status = 'active'`
  - `timeout = 300000`
  - `retryCount = 1`
  - `retryMax = 3`
  - `retryDelay = 600000`
  - `dateEnable = now() + timeout + (retryDelay * retryCount)`
- The job fails again for some reason.
  - `status = 'failed'`
  - `timeout = 300000`
  - `retryCount = 2`
  - `retryMax = 3`
  - `retryDelay = 600000`
  - `dateEnable = now() + (retryDelay * retryCount)`
- The job remains inactive within the database until after `dateEnable`, or approximately 1200000 milliseconds.
- The Queue Master database review is initiated and the job is retrieved from the database for the second retry.
  - `status = 'active'`
  - `timeout = 300000`
  - `retryCount = 2`
  - `retryMax = 3`
  - `retryDelay = 600000`
  - `dateEnable = now() + timeout + (retryDelay * retryCount)`
- The job fails again. What is wrong with this job?
  - `status = 'failed'`
  - `timeout = 300000`
  - `retryCount = 3`
  - `retryMax = 3`
  - `retryDelay = 600000`
  - `dateEnable = now() + (retryDelay * retryCount)`
- The job remains inactive within the database until after `dateEnable`, or approximately 1800000 milliseconds.
- The Queue Master database review is initiated and the job is retrieved from the database for the third and final retry.
  - `status = 'active'`
  - `timeout = 300000`
  - `retryCount = 3`
  - `retryMax = 3`
  - `retryDelay = 600000`
  - `dateEnable = now() + timeout + (retryDelay * retryCount)` _this is redundant, however it is still set_
- The job fails for the last time.
  - `status = 'terminated'`
- Because the job status is set to `terminated` it will no longer be retrieved from the database.
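The sequence above can be reproduced with a small simulation in which every attempt fails. This is only a sketch of the documented state changes: `simulateRetries` is a hypothetical function, it does not touch a database or the library, and it assumes `retryCount` is left unchanged once the job is terminated, which the walkthrough does not state.

```javascript
// Simulates a job that fails on every attempt, following the steps above.
// Hypothetical helper for illustration; not part of the library.
function simulateRetries({ retryMax = 3, retryDelay = 600000 } = {}) {
  const failures = [];
  let retryCount = 0;
  let status = 'waiting';

  while (status !== 'terminated') {
    status = 'active'; // the job is retrieved for processing

    // The attempt fails. If all retries are used up the job is terminated,
    // otherwise it is marked failed and waits for its next dateEnable.
    if (retryCount >= retryMax) {
      status = 'terminated';
      failures.push({ status, retryCount, waitMs: null });
    } else {
      retryCount += 1;
      status = 'failed';
      // On failure: dateEnable = now() + (retryDelay * retryCount)
      failures.push({ status, retryCount, waitMs: retryDelay * retryCount });
    }
  }
  return failures;
}

const failures = simulateRetries();
for (const failure of failures) {
  console.log(failure.status, failure.retryCount, failure.waitMs);
}
// failed 1 600000
// failed 2 1200000
// failed 3 1800000
// terminated 3 null
```

With the defaults the job is processed four times in total: the initial attempt plus `retryMax` retries.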
As a final note, please review the Job.updateProgress document. This document explains how your job handling function can report progress updates. These progress updates will extend the job timeout counter and update the `dateEnable` property. This will prevent the job from failing due to extended job processing time.