1.3.0: Better handling of failure cases
What's Changed
Better handling of errors in subscribers
We were not handling subscriber errors aggressively enough. The idea was that, because publish and subscribe are decoupled, a failure in a subscriber should NOT be a critical error in the app; it should be a failure of just that subscriber. We only logged the error as a convenience.
This makes sense in theory, in a truly distributed pub/sub system, but in practice, clients depend on subscriptions to publish properly; for example, if model events fail to publish (because, say, they enqueue a job into Redis and Redis is unavailable), many critical components of a system will not work.
Instead, the default behavior should be to raise an error if any subscriber fails. This would cause code that fails to publish to raise an error on the publish call, returning a 500, failing the job, etc.
To continue the old behavior, you can use Amigo.on_publish_error = proc {}.
This also improves the logging so that we log the full event details, rather than just the representation of the object. This means that, if something critical fails to publish, there is still a record of the event for future use in repairs.
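The raise-by-default behavior and the opt-out described above can be sketched roughly as follows. This is a minimal illustration, not Amigo's implementation: MiniBus, its Event struct, subscribe, and reset! are hypothetical names invented for the example, and the signature of the error callback is an assumption.

```ruby
# Hypothetical stand-in for a pub/sub bus. NOT the Amigo implementation.
module MiniBus
  Event = Struct.new(:name, :payload)

  class << self
    # Called with the error whenever a subscriber raises (assumed signature).
    attr_accessor :on_publish_error

    def reset!
      @subscribers = []
      # Default: re-raise, so the failure surfaces at the publish call.
      @on_publish_error = proc { |error| raise error }
    end

    def subscribe(&block)
      @subscribers << block
    end

    def publish(name, *payload)
      event = Event.new(name, payload)
      @subscribers.each do |subscriber|
        begin
          subscriber.call(event)
        rescue StandardError => e
          # Log the full event details so there is a record for later repairs.
          warn "subscriber failed: #{e.class}: #{e.message} event=#{event.to_h}"
          on_publish_error.call(e)
        end
      end
      event
    end
  end
end

MiniBus.reset!
MiniBus.subscribe { |_event| raise IOError, "redis is down" }

begin
  MiniBus.publish("model.created", 1)
rescue IOError => e
  puts "publish raised: #{e.message}"
end

# Opting back into the old log-and-continue behavior with a no-op proc.
MiniBus.on_publish_error = proc {}
MiniBus.publish("model.created", 1)
```

With the default callback, the caller of publish sees the subscriber's error; with the no-op proc, the error is only logged and publish returns normally.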
Handle audit logger perform_async failure
If the audit logger job could not perform_async, we would get an error in the subscriber. This is not great, since it means we don't get the event audited.
Instead, if we cannot .perform_async we can .new.perform. This ensures we still get the event audited, even if Redis is down.
See #7
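The fallback from .perform_async to .new.perform might look roughly like the sketch below. The AuditJob class, the audit helper, and the in-memory AUDITED list are hypothetical and exist only for illustration; the perform_async here simply raises to simulate Redis being unreachable, and the actual change may catch a narrower set of errors.

```ruby
# Hypothetical job; the class name and in-memory audit store are illustrative.
class AuditJob
  AUDITED = []

  # Stands in for an enqueue call that fails because Redis is unreachable.
  def self.perform_async(*_args)
    raise IOError, "cannot reach Redis"
  end

  # Running the job body directly needs no queue.
  def perform(event_name)
    AUDITED << event_name
  end
end

# Try to enqueue; if that fails, run the job inline so the event is still audited.
def audit(job_class, *args)
  job_class.perform_async(*args)
  :enqueued
rescue StandardError
  job_class.new.perform(*args)
  :performed_inline
end

result = audit(AuditJob, "model.created")
puts "audit result: #{result}"
```

The trade-off is that the inline path runs synchronously in the publishing process, which is slower but keeps the audit record.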
Full Changelog: v1.2.2...v1.3.0