[FEATURE REQ] Dead-letter and retry mechanism for Event Hub #18344

Closed
kacheng2018 opened this issue Dec 24, 2020 · 4 comments
Labels: Client, customer-reported, Event Hubs, feature-request, Service Attention

Comments

@kacheng2018

The customer experiences exceptions and application aborts when sending or receiving messages through Event Hubs fails, and messages will overflow if the application cannot recover in time. Currently, Event Hubs does not provide a mechanism for dead-letter handling or a retry policy, which creates a risk of message loss. Requesting that a dead-letter and retry mechanism for Event Hubs be added to the SDK.

@ghost added the needs-triage, customer-reported, and question labels Dec 24, 2020
@alzimmermsft added the Client, Event Hubs, and feature-request labels Dec 28, 2020
@ghost removed the needs-triage label Dec 28, 2020
@conniey added the Service Attention label Feb 20, 2021
@ghost

ghost commented Feb 20, 2021

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jfggdl.


@conniey
Member

conniey commented Feb 20, 2021

This is a feature that AFAIK isn't available in the Azure Event Hubs service. @JamesBirdsall would be able to provide more accurate information.

@JamesBirdsall
Contributor

Event Hubs does not have the concept of dead-lettering because messages are not consumed destructively. Messages only become unavailable when they exceed the retention period and are purged. At any time, an application can open a receiver to consume any message that has not expired, and can re-open receivers to re-receive a particular message as many times as necessary until it has been processed properly.
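
To make the non-destructive model concrete, here is a minimal sketch in plain Python. `PartitionLog` and `Event` are hypothetical in-memory stand-ins invented for illustration, not SDK types; the only point is that reading never removes an event, and only the retention period does.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    sequence_number: int
    enqueued_at: float  # seconds on an arbitrary clock
    body: str


@dataclass
class PartitionLog:
    """Hypothetical in-memory model of one partition (not an SDK class)."""
    retention_seconds: float
    events: list = field(default_factory=list)
    next_sequence: int = 0

    def append(self, body, now):
        self.events.append(Event(self.next_sequence, now, body))
        self.next_sequence += 1

    def purge_expired(self, now):
        # Retention is the only thing that makes an event unavailable.
        self.events = [
            e for e in self.events
            if now - e.enqueued_at <= self.retention_seconds
        ]

    def read_from(self, sequence_number):
        # Reading returns a copy of the matching events and deletes nothing,
        # so the same events can be received again as often as needed.
        return [e for e in self.events if e.sequence_number >= sequence_number]
```

Reading the same range twice yields the same events, which is why a separate dead-letter queue is not needed to get a second attempt at a message.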

Speaking very generally, checkpointing is a common pattern, in which an application persistently stores the offset (or sequence number) of the last message processed successfully for each partition of the event hub. Normally it is not necessary for the application to re-receive the message because it can simply keep trying to process the copy of the message that it already has, but if it is necessary to restart the receiver for any reason (application shutdown and restart, application crash and restart, VM crash and restart, etc.) then the checkpointed offset indicates the position where the restarted receiver should begin in the partition.
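
The checkpointing pattern described above might look roughly like the sketch below. It is SDK-free on purpose: `FileCheckpointStore` and `process_partition` are hypothetical names, the JSON file is an assumed storage choice, and `events` is assumed to be an ordered list of `(sequence_number, body)` pairs from one partition.

```python
import json
import os
import tempfile


class FileCheckpointStore:
    """Hypothetical store: a JSON file mapping partition id to the
    sequence number of the last successfully processed event."""

    def __init__(self, path):
        self.path = path

    def load(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)

    def save(self, partition_id, sequence_number):
        checkpoints = self.load()
        checkpoints[partition_id] = sequence_number
        # Write to a temp file and rename it, so a crash mid-write
        # cannot leave a half-written checkpoint file behind.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(checkpoints, f)
        os.replace(tmp_path, self.path)


def process_partition(partition_id, events, store, handler, max_attempts=3):
    """Resume after the checkpoint and retry the in-memory copy of each event."""
    last_done = store.load().get(partition_id, -1)
    for sequence_number, body in events:
        if sequence_number <= last_done:
            continue  # already processed before the restart
        for attempt in range(max_attempts):
            try:
                handler(body)
                break
            except Exception:
                if attempt == max_attempts - 1:
                    raise  # give up without advancing the checkpoint
        store.save(partition_id, sequence_number)
```

Because the checkpoint only advances after the handler succeeds, a restart re-delivers the failed event rather than skipping it. The handler should therefore tolerate seeing the same event more than once.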

The producing side is simpler. If the send to the event hub succeeds, then our service has persisted the message and the producing application can move on to the next message. If the send fails, our SDKs will automatically retry if the error is transient, up to the timeout specified. If it is still failing, the SDK will throw and what to do then is up to your code, but you probably want to retry at that level too. If it keeps failing, you should have some sort of application monitoring so you can open a support ticket.
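
For the "retry at that level too" part, an application-level wrapper could be sketched as follows. Everything here is an assumption for illustration: `send` stands for whatever function your code uses to call the SDK, `TransientSendError` is a hypothetical exception your code would raise or map to for failures worth retrying, and the backoff numbers are arbitrary.

```python
import random
import time


class TransientSendError(Exception):
    """Hypothetical marker for send failures that are worth retrying."""


def send_with_retry(send, event, max_attempts=5, base_delay=0.5,
                    max_delay=30.0, sleep=time.sleep):
    """Call send(event), retrying with exponential backoff and jitter.

    Re-raises the last error once max_attempts is exhausted so that
    monitoring or alerting further up can see the failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send(event)
        except TransientSendError:
            if attempt == max_attempts:
                raise
            # Double the delay on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter keeps many producers from retrying in lockstep.
            sleep(delay * random.uniform(0.5, 1.0))
```

The `sleep` parameter is injectable only so the wrapper can be exercised without real waiting. When the final attempt fails the exception propagates, which is the point where application monitoring should take over.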

@joshfree removed the question label Feb 22, 2021
@joshfree added this to the Backlog milestone Feb 22, 2021
@ramya-rao-a
Contributor

Thanks for the clarification @JamesBirdsall

@kacheng2018 We hope that has answered your queries.
Please do open a new issue if you have any other feedback for us.
