Error Handling & Retries
When a Job throws an uncaught exception during execution, Goose treats that as a failure. There are multiple configurations to handle a Failed Job as it goes through these steps:
- Job's error-handler is called
- If retries are remaining, Job is scheduled for retry with backoff
  - Failed jobs can be executed from a different retry-queue as well
- If retries are exhausted, death-handler is called & Job is marked as dead
- Upon death, Job is stored in Dead Jobs queue
  - Storage of Dead Jobs can be skipped as well using a config
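
The steps above can be sketched as a small decision function. This is only an illustration of the flow as described here, not Goose's implementation: `next-step` and the keywords it returns are hypothetical, and the exact comparison between the retry count and `:max-retries` is an assumption.

```clojure
(defn next-step
  "Describes what happens to a failed job, as a vector of keywords.
  Hypothetical helper for illustration only; not part of Goose."
  [{:keys [retry-count max-retries skip-dead-queue]}]
  (if (< retry-count max-retries)
    ;; Retries remain: report the error, then schedule a retry with backoff.
    [:call-error-handler :schedule-retry]
    ;; Retries exhausted: report the death, mark the job dead,
    ;; and store it in the Dead Jobs queue unless that is skipped.
    (cond-> [:call-death-handler :mark-dead]
      (not skip-dead-queue) (conj :store-in-dead-queue))))

;; A job on its first failure is retried; a job out of retries dies.
(next-step {:retry-count 0 :max-retries 27 :skip-dead-queue false})
(next-step {:retry-count 27 :max-retries 27 :skip-dead-queue true})
```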
Since Goose jobs run in the background, it is good practice to integrate error services like Sentry, Honeybadger, etc. as error handlers.
Error & Death handlers are fully-qualified function symbols that accept config, job & exception. They must be configured by Client & will be called during Job failure by Worker.
- :error-handler-fn-sym: Called when a job has failed and will be scheduled for retry
- :death-handler-fn-sym: Called when a job has exhausted retries and won't be executed again

(ns honeybadger
  (:require
    [goose.client :as c]
    [goose.retry :as retry]
    [goose.worker :as w]
    [honeybadger.core :as hb]))

(defn hb-error-handler
  [cfg job ex]
  (hb/notify cfg ex job))

; Add Honeybadger error handler.
; `client-opts` & `worker-opts` are assumed to be defined elsewhere.
(let [retry-opts (assoc retry/default-opts :error-handler-fn-sym `hb-error-handler)
      client-opts (assoc client-opts :retry-opts retry-opts)]
  (c/perform-async client-opts `my-failing-fn :foo))

; Inject Honeybadger config in Worker.
(let [hb-config {:api-key "d34db33f"
                 :env "development"}]
  (w/start (assoc worker-opts :error-service-config hb-config)))

(ns sentry
  (:require
    [goose.client :as c]
    [goose.retry :as retry]
    [goose.worker :as w]
    [sentry-clj.core :as sentry]))

; Ignore first arg as Sentry is pre-initialized.
(defn sentry-death-handler
  [_ job ex]
  (sentry/send-event
    {:message (str "Job died: " (:id job))
     :throwable ex}))

; Add Sentry as death handler.
(let [retry-opts (assoc retry/default-opts :death-handler-fn-sym `sentry-death-handler)
      client-opts (assoc client-opts :retry-opts retry-opts)]
  (c/perform-async client-opts `my-dying-fn :foo))

; Init Sentry config.
(sentry/init! "https://public:private@sentry.io/1")

; No need to inject sentry config in worker.
(w/start w/default-opts)

To prevent the main queue from getting clogged by failed jobs, a different queue can be configured using the :retry-queue option.
By default, Goose retries a job 27 times; this can be changed with the :max-retries option.
Goose will retry failures with an exponential backoff using the formula (retry_count ** 4) + 20 + (rand(20) * (retry_count + 1)) (e.g. 28, 51, 66, 177, ... seconds). Goose will perform 27 retries over approximately 30 days. Assuming new code gets deployed & the bug gets fixed within that time, the job will get automatically retried and successfully processed. After 27 retries, Goose will move that job to the Dead Jobs queue, assuming that it will need manual intervention to work.
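
The backoff formula above can be written out in Clojure to see how the delays grow. This is a sketch of the arithmetic only, not Goose's own code: `backoff-sec` and `backoff-bounds` are hypothetical names, and it assumes the retry count starts at 0 and that rand(20) yields an integer from 0 to 19.

```clojure
(defn- pow4
  "n raised to the 4th power."
  [n]
  (* n n n n))

(defn backoff-sec
  "Delay in seconds before the next retry, following the formula above.
  Assumes `retry-count` starts at 0 for the first retry."
  [retry-count]
  (+ (pow4 retry-count)                      ; retry_count ** 4
     20
     (* (rand-int 20) (inc retry-count))))   ; rand(20) * (retry_count + 1)

(defn backoff-bounds
  "Smallest and largest delay `backoff-sec` can return for `retry-count`."
  [retry-count]
  (let [base (+ (pow4 retry-count) 20)]
    [base (+ base (* 19 (inc retry-count)))]))

;; Minimum total wait across all 27 default retries, in days.
(/ (reduce + (map (comp first backoff-bounds) (range 27)))
   86400.0)
```

Under these assumptions the first retry waits between 20 and 39 seconds and the fourth between 101 and 177 seconds. The minimum total over 27 retries is 2,611,161 seconds, a little over 30 days, which agrees with the figure quoted above.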
Retry delay can be modified using the :retry-delay-sec-fn-sym option.
When :skip-dead-queue is set to true, a Job whose retries are exhausted won't be stored in the Dead Jobs queue.
Dead jobs can be deleted or replayed using the API.
Sometimes, workers might crash abruptly & in-progress Jobs might not be completed. Such abandoned Jobs are called orphan-jobs & will be picked up by another worker process. Jobs must be idempotent for such scenarios.
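
As an illustration of what idempotent means here, the sketch below applies a credit at most once per transaction id, so running the same job again after a crash leaves the result unchanged. Everything in it is hypothetical (`ledger`, `apply-credit`, the in-memory atom); a real job would keep this state in a database or another durable store.

```clojure
;; Hypothetical in-memory ledger; stands in for a durable store.
(defonce ledger (atom {:applied #{} :balances {}}))

(defn apply-credit
  "Credits `amount` to `account` once per `txn-id`.
  Re-running with the same `txn-id` is a no-op, so the job is idempotent."
  [txn-id account amount]
  (swap! ledger
         (fn [{:keys [applied] :as state}]
           (if (contains? applied txn-id)
             state ; Already processed: change nothing.
             (-> state
                 (update :applied conj txn-id)
                 (update-in [:balances account] (fnil + 0) amount)))))
  :ok)

;; Simulate an orphaned job being picked up and executed a second time.
(apply-credit "txn-1" "acct-42" 100)
(apply-credit "txn-1" "acct-42" 100)
(get-in @ledger [:balances "acct-42"])
```

Keeping the "already applied" check and the update in a single `swap!` makes the two steps atomic within one process; across workers, the same guarantee would have to come from the backing store.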

(ns error-handling
  (:require
    [goose.client :as c]
    [goose.worker :as w]
    [clojure.tools.logging :as log]))

(def error-service-config {:my :config})

(defn my-error-handler [cfg job ex]
  (log/error cfg job ex))

(defn my-death-handler [cfg job ex]
  (log/error cfg job ex))

(defn my-retry-delay
  [retry-count]
  (+ (* (rand-int 30) (inc retry-count))
     (reduce * (repeat 2 retry-count)))) ; retry-count^2

(let [retry-opts {:max-retries 10
                  :retry-delay-sec-fn-sym `my-retry-delay
                  :retry-queue "my-retry-queue"
                  :error-handler-fn-sym `my-error-handler
                  :death-handler-fn-sym `my-death-handler
                  :skip-dead-queue false}
      client-opts (assoc client-opts :retry-opts retry-opts)]
  ;; Retry options can be configured for scheduled jobs too.
  (c/perform-in-sec client-opts 300 `my-failing-fn :foo))

(let [worker-opts (assoc worker-opts :error-service-config error-service-config)
      worker (w/start worker-opts)
      failed-jobs-worker (w/start (assoc worker-opts :threads 2
                                                     :queue "my-retry-queue"))])