-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#209 - Add Support for Sliding Invisibility Timeouts #353
#209 - Add Support for Sliding Invisibility Timeouts #353
Conversation
This uses the same fetching as normal invisibility timeout except with an added background update to update the FetchedAt time regularly This is disabled by default and must be set in PostgreSqlStorageOptions SlidingInvisibilityTimeout to enable it's usage. Fixed two issues with tests where tests were asserting against record.FetchedAt but the record returned is the database column names so should be record.fetchedat
… timeouts are enabled to reduce server load if not enabled Added additional documentation to SlidingInvisibilityTimeout to indicate that this and IsLightweightServer are not compatible
One thing I don't really like right now overall is that it's a separate property in storage options. With current API, it's not possible to "unset" the "regular" invisibility timeout without introducing possibly breaking changes (making the existing property nullable). Can we have a boolean instead |
That makes alot of sense. I was trying to keep it as separate as possible from the existing settings and also align the naming with the SQL Server Storage options but perhaps better to keep it simple. That makes the changes more straight forward as well since same timeouts settings. I'll add the |
…ityTimeout to enable/disable using sliding timeout and maintain existing InvisibilityTimeout setting for both sliding and existing invisibility timeout options.
Thank you! |
This adds the sliding invisibility timeouts functionality (from #209) which is ported from SQL Server storage.
(This also ports a change where when removing from queue or requeuing, it will include the fetchedat in the where clause. See this commit (Check FetchedAt has expected value to prevent prolonging other's work).)
As noted in the original ticket, sliding invisibility timeouts is better than invisibility timeouts since as long as the worker instance is running, it will never reassign the job to another worker. This is much better than setting the invisibility timeout to long timeouts for long running jobs.
The SQL Server storage also implements transaction dequeuing (different to what is implemented here now with the native transaction setting) where it holds an open transaction while the job is running. This has advantages over using the sliding invisibility timeout (which is why it is the default in SQL Server storage) but it has limitations for very long running jobs since it can hold a transaction for a long period of time which has caused issues with backups on SQL Server (see this ticket #864).
Like SQL Server storage, it makes sense to allow both options (and longer term remove the older non-sliding invisibility timeout) rather than implementing just the newer open transaction dequeuing so users can choose the implementation that suits their use case (ie. short running jobs or long running jobs).
This PR implements sliding invisibility timeouts since this easily extends the current dequeuing functionality and is easily ported across from SQL Server storage.
How this works is it utilizes the existing dequeue functionality but adds a second configurable timeout setting
SlidingInvisibilityTimeout
which when set to a value enables using sliding invisibility timeout. When a job is fetched, it adds the job to the background storage heartbeat process, called from the background server which updates the fetchedat for each job.This works with both native transactions enabled and disabled.
In detail, given this is basically ported from SQL Server storage it works in basically the same fashion:
ExecuteKeepAliveQueryIfRequired is run 5 times within the sliding invisibility timeout range.
It should be noted that setting the SlidingInvisibilityTimeout to lower than 10 seconds gets very unstable since sometimes it cannot update the fetchedat quick enough before the job is fetched and so duplicate jobs start running. Ideally this should never be set lower than 1 minute (in my opinion).
The other caveat is that if the worker process has
IsLightweightServer
set to true, then sliding invisibility timeouts will not work at all because it relies on the background storage processes running. There doesn't seem to be any easy way around this without a change to Hangfire.Core. Hopefully longer term, it will allow storage libraries to either add background processes directly or provide more information to allow them to decide which processes should run when IsLightweightServer is set.