Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combine blocking and non-blocking error handling #2116

Closed
ghost opened this issue Feb 18, 2022 · 7 comments
Closed

Combine blocking and non-blocking error handling #2116

ghost opened this issue Feb 18, 2022 · 7 comments

Comments

@ghost
Copy link

ghost commented Feb 18, 2022

As before discussed here on Github.

Expected Behavior

A combination of blocking and non-blocking retry should be possible to handle three different types of exceptions:

  • non-blocking (but not fatal) exceptions should be retried.

  • non-blocking (but fatal, e.g. DeserializationException) exceptions should be routed to the DLT directly.

  • blocking exceptions (e.g. a database is currently not available) should not be routed to a retry topic since all following messages would be routed there as well. The consumer should remain at the current offset.

Current Behavior
It is possible to use @RetryableTopic annotation to satisfy the first two bullets. It is also possible to use DefaultErrorHandler to implement the last bullet. But a combination of both approaches cannot be configured at the moment.

Context
@garyrussell mentioned a workaround by stopping the container using the stopImmediate container property. Nevertheless, there are a few blog articles that address the different handling of error types, e.g., this article. Therefore, having a proper way to configure this behavior would be a useful possibility.

@tomazfernandes
Copy link
Contributor

Thanks for the suggestion @mariecuriiee, that's a good article.

I'm currently working on a couple of issues that should pave the way for this feature to work, by leveraging the existing DefaultErrorHandler we already set for the RetryableTopics. So hopefully we'll have this implemented soon.

tomazfernandes added a commit to tomazfernandes/spring-kafka that referenced this issue Feb 20, 2022
Before we hardcoded a no-ops back off in the DefaultErrorHandler used in the Retryable Topics feature.
Adds a setter to let the user provide their own back off policy and configure blocking retries in conjunction with RT.
@tomazfernandes
Copy link
Contributor

Hi @mariecuriiee, I've added a PR with this feature, maybe you want to take a look:
#2124

LMWYT, thanks

tomazfernandes added a commit to tomazfernandes/spring-kafka that referenced this issue Feb 21, 2022
Before we hardcoded a no-ops back off in the DefaultErrorHandler used in the Retryable Topics feature.
Adds a setter to let the user provide their own back off policy and configure blocking retries in conjunction with RT.
@ghost
Copy link
Author

ghost commented Feb 22, 2022

Hey @tomazfernandes, thanks for implementing this so quickly! If I understand the implementation correctly, I only have to set the backoff using setBlockingRetryBackoff and add a classification behavior using setErrorHandlerCustomizer? And with setErrorHandlerCustomizer I could then define the behavior depending on the exception, e.g., for some exceptions the records go directly to the retry topics, for other exceptions, the records are handled using blocking retries or for other exceptions, the records are routed directly to the DLT?

@tomazfernandes
Copy link
Contributor

Hi @mariecuriiee, thanks for looking into the solution. We’re currently reviewing it to improve the classification for blocking retries and make it allowlistable instead of the current denylist only behavior.

Either way, you’d have to configure the classification for each type of retry, i.e. define which exceptions are fatal for the blocking retries, and which are fatal for the non-blocking retries.

So if you want a non-default fatal exception to go straight to the DLT, you’d have to add it to both classifications. With this you can have any behavior you want - both retries, only one or another, or no retries.

We’re also in the process of reviewing the configuration API for the retryable topics feature as a whole, so when that comes out we’ll probably have a nicer API for configuring this.

As a side note, we’ve just introduced a way of setting global fatal exceptions for the non-blocking retries, maybe that’s something you’ll want to check out too.

@ghost
Copy link
Author

ghost commented Feb 22, 2022

@tomazfernandes that makes sense, thanks for clarification!
I thought about Gary's comment again and I agree that it would be more intuitive like he described. Even though this means that this feature won't be released in 2.8.3 😢

But as soon as it is released, I will use it in my project 😄 I also had a quick look at the global fatal exceptions - will test that too!

tomazfernandes added a commit to tomazfernandes/spring-kafka that referenced this issue Feb 23, 2022
Before we hardcoded a no-ops back off in the DefaultErrorHandler used in the Retryable Topics feature.
Adds a setter to let the user provide their own back off policy and configure blocking retries in conjunction with RT.
garyrussell pushed a commit that referenced this issue Feb 24, 2022
* GH-2116: Add blocking retries to RT

Before we hardcoded a no-ops back off in the DefaultErrorHandler used in the Retryable Topics feature.
Adds a setter to let the user provide their own back off policy and configure blocking retries in conjunction with RT.

* Change DHE in LCFC to defaultFalse

With this we no longer need a no ops back off.
Some minor adjustments were needed to maintain behavior when the logic gets to DLPR.

* Change DHE in LCFC to defaultFalse

With this we no longer need a no ops back off.
Some minor adjustments were needed to maintain behavior when the logic gets to DLPR.

* Improve API and docs

Now retryable exceptions can be set directly in the lcfc class.
Improved the docs on how to combine blocking and non-blocking behaviors.
Added what's new entry for this feature.

* Improve ExceptionClassifier JavaDoc

Also add assertions to the LCFC new methods to warn the user if they already set the blocking configurations.
garyrussell pushed a commit that referenced this issue Feb 24, 2022
* GH-2116: Add blocking retries to RT

Before we hardcoded a no-ops back off in the DefaultErrorHandler used in the Retryable Topics feature.
Adds a setter to let the user provide their own back off policy and configure blocking retries in conjunction with RT.

* Change DHE in LCFC to defaultFalse

With this we no longer need a no ops back off.
Some minor adjustments were needed to maintain behavior when the logic gets to DLPR.

* Change DHE in LCFC to defaultFalse

With this we no longer need a no ops back off.
Some minor adjustments were needed to maintain behavior when the logic gets to DLPR.

* Improve API and docs

Now retryable exceptions can be set directly in the lcfc class.
Improved the docs on how to combine blocking and non-blocking behaviors.
Added what's new entry for this feature.

* Improve ExceptionClassifier JavaDoc

Also add assertions to the LCFC new methods to warn the user if they already set the blocking configurations.
@tomazfernandes
Copy link
Contributor

@mariecuriiee, we’ve implemented the solution, don’t know if you had the chance to take a look. It’s on 2.8.4-SNAPSHOT, and referenced in the docs: https://docs.spring.io/spring-kafka/docs/2.8.4-SNAPSHOT/reference/html/#retry-topic-combine-blocking

Thanks again for the suggestion, and feel free to provide any feedback.

@garyrussell, DYT there’s any more work to be done regarding this issue, or maybe we could close it?

Thanks

@garyrussell
Copy link
Contributor

Thanks @tomazfernandes ; closing; resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants