Troubleshooting
akshat edited this page Nov 7, 2022 · 4 revisions
When running Goose in Production, you might run into some issues. Here's a playbook for them:
- Latency is defined as the time taken to transmit data to/from the processing component
  - `execution.latency`: time between enqueue -> start of execution
  - `scheduled.latency`: time between theoretical schedule time -> start of execution
- When latency goes up, first check system-level metrics of Message Brokers & verify their performance
- If Message Brokers are fine, check Job `Enqueue rate`, `Failure rate` & `Execution time`
  - Consider scaling up the number of workers if Jobs are getting enqueued at a higher rate than normal
  - If `Failure rate` or `Execution times` are unusual, investigate issues with code/third-party APIs
- If scheduling latency is high, consider lowering `scheduler-polling-interval-sec` in Redis
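
The latency definitions and the order of checks above can be sketched as a small decision helper. This is only an illustration, not part of Goose: the `Snapshot` fields, the function names and the returned strings are all hypothetical, and deciding what counts as "high" or "unusual" is left to your own baselines.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    """Hypothetical readings, each already compared against a normal baseline."""
    broker_healthy: bool           # system-level broker metrics look normal
    enqueue_rate_high: bool        # Jobs are enqueued faster than normal
    failure_rate_unusual: bool
    execution_time_unusual: bool
    scheduled_latency_high: bool


def execution_latency(enqueued_at: float, started_at: float) -> float:
    """Time between enqueue and start of execution."""
    return started_at - enqueued_at


def scheduled_latency(scheduled_for: float, started_at: float) -> float:
    """Time between theoretical schedule time and start of execution."""
    return started_at - scheduled_for


def triage(s: Snapshot) -> List[str]:
    """Return the playbook steps that apply, in the order listed above."""
    # Brokers come first: nothing else is worth checking until they are fine.
    if not s.broker_healthy:
        return ["investigate message broker performance"]
    steps = []
    if s.enqueue_rate_high:
        steps.append("scale up the number of workers")
    if s.failure_rate_unusual or s.execution_time_unusual:
        steps.append("investigate job code and third-party APIs")
    if s.scheduled_latency_high:
        steps.append("lower scheduler-polling-interval-sec")
    return steps
```

For example, `triage(Snapshot(True, True, False, False, False))` suggests only scaling up workers, while an unhealthy broker short-circuits every other check.
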
- A Job might be causing process crashes
  - To find the Poison Job, track the `jobs.recovered` metric & look for the `:function` tag
  - If a Job causes workers to crash repeatedly, it'll be recovered & tagged by Goose
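
As a rough sketch of the Poison Job search, the snippet below counts `jobs.recovered` events per function tag and lists the functions that were recovered repeatedly. The event shape (a dict with `name` and `tags` keys), the `min_recoveries` threshold and the example function names are assumptions made for illustration. The exact metric name and tag format will depend on your metrics backend.

```python
from collections import Counter


def poison_job_candidates(events, min_recoveries=2):
    """Return (function, recovery_count) pairs, most-recovered first.

    `events` is a hypothetical list of metric events, e.g.
    {"name": "jobs.recovered", "tags": {"function": "my.app/send-email"}}.
    """
    counts = Counter(
        e["tags"]["function"]
        for e in events
        # Only recovery events that carry a function tag are counted.
        if e.get("name") == "jobs.recovered" and "function" in e.get("tags", {})
    )
    return [(fn, n) for fn, n in counts.most_common() if n >= min_recoveries]
```

A function that shows up here far more often than the rest is a likely Poison Job and a good place to start investigating.
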
Need help? Open an issue or ping us on #goose @Clojurians slack.