Don't checkpoint when Event Hubs Listener is closing #1752
Comments
I agree this looks like an issue. Here's an example sequence:
@mathewc Can you see any mistakes in the above reasoning?
Agreed, this seems like an issue. I recommend we create a simple repro that demonstrates the problem, which we can then use to verify the fix.
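As a starting point for such a repro, here is a minimal sketch in Python rather than against the real SDK. It assumes a deterministic in-memory model that forces the problematic interleaving instead of relying on timing. `Partition`, `Listener`, `receive_batch`, and `find_lost_events` are hypothetical names invented for illustration and are not part of the Event Hubs or WebJobs APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Hypothetical in-memory stand-in for one Event Hubs partition."""
    events: list
    checkpoint: int = 0  # index of the next event a new owner would read


@dataclass
class Listener:
    """Hypothetical listener that checkpoints on close, as discussed here."""
    partition: Partition
    processed: list = field(default_factory=list)
    delivered: int = 0  # index just past the last event handed to us

    def receive_batch(self, size, cancel_after=None):
        start = self.partition.checkpoint
        batch = self.partition.events[start:start + size]
        self.delivered = start + len(batch)
        for i, event in enumerate(batch):
            if cancel_after is not None and i >= cancel_after:
                return False  # cancelled: the rest of the batch is unprocessed
            self.processed.append(event)
        self.partition.checkpoint = self.delivered  # normal post-batch checkpoint
        return True

    def close(self):
        # Models the questioned behaviour: checkpoint the delivered position
        # on shutdown, whether or not processing actually finished.
        self.partition.checkpoint = self.delivered


def find_lost_events():
    partition = Partition(events=list(range(10)))
    first = Listener(partition)
    first.receive_batch(size=5, cancel_after=2)  # cancelled mid-batch
    first.close()                                # checkpoints past events 2..4
    second = Listener(partition)                 # new owner of the partition
    while partition.checkpoint < len(partition.events):
        second.receive_batch(size=5)
    seen = set(first.processed) | set(second.processed)
    return sorted(set(partition.events) - seen)


print(find_lost_events())
```

In this model the second listener resumes after the checkpoint written on close, so the events that were delivered but cancelled before processing are never seen by either listener. A real repro would still need to confirm that the SDK can reach this interleaving.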
It is tricky to repro in some cases, as it relies on a race condition. That said, I'm not sure how to simulate a
Possibly related: ICM 73233348. Also, I chatted with @MikeStall on this code, and his recollection was that it comes from sample listener code that he got from EH samples, e.g. as in this issue. As you can see, the SimpleEventProcessor in that issue checkpoints on close. However, that simple sample only closes the listener when the app is shutting down and all events are processed. It also does no checkpointing in the message processor. So that simple sample won't checkpoint any unprocessed events. Our case is different: with our cancellation token and execution model, we CAN checkpoint for cancelled/unprocessed events in rare cases.
Thanks for looking into this. Yes, that ICM is what first made me inspect the code and find a potential gap. I don't think this is necessarily what they were hitting, but it is a possibility: the logs from their incident make clear that the data loss coincided with a transition of partition listeners between instances (one instance took a partition lock away from another), so it is possibly related.
Here we do a checkpoint if an EventHubListener is closing because of a server shutdown:
azure-webjobs-sdk/src/Microsoft.Azure.WebJobs.Extensions.EventHubs/EventHubListener.cs
Lines 118 to 121 in 7009477
I'm not sure why we would want to checkpoint in this case. Since this appears to be invokable in parallel with an execution that may still be in progress (and not yet completed), it seems it could result in an invalid checkpoint for in-flight data.
It seems we shouldn't have these lines.
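To make the proposal concrete, here is a minimal asyncio sketch in Python rather than the SDK's own code. It assumes a hypothetical `Processor` whose `close` never checkpoints and whose `process_events` checkpoints only after a whole batch completes. The class, method names, and the `"Shutdown"` reason string are illustrative assumptions, not the actual `EventHubListener` implementation.

```python
import asyncio


class Processor:
    """Hypothetical processor; names are illustrative only."""

    def __init__(self):
        self.checkpointed = None  # last offset safely recorded
        self.handled = []         # offsets whose handler ran to completion

    async def process_events(self, batch, handler):
        for offset, body in batch:
            await handler(body)
            self.handled.append(offset)
        # Reached only if no handler was cancelled or raised.
        self.checkpointed = batch[-1][0]

    async def close(self, reason):
        # Deliberately no checkpoint, even when reason == "Shutdown":
        # an invocation may still be in flight or may have been cancelled.
        return None


async def demo():
    processor = Processor()
    gate = asyncio.Event()  # never set, so the handler blocks on event "b"

    async def handler(body):
        if body == "b":
            await gate.wait()

    batch = [(1, "a"), (2, "b"), (3, "c")]
    task = asyncio.create_task(processor.process_events(batch, handler))
    await asyncio.sleep(0)  # let the task handle "a" and block on "b"
    task.cancel()           # simulate host shutdown cancelling the invocation
    await processor.close("Shutdown")
    try:
        await task
    except asyncio.CancelledError:
        pass
    return processor.handled, processor.checkpointed


handled, checkpointed = asyncio.run(demo())
print(handled, checkpointed)
```

Because nothing is checkpointed on close, the next owner of the partition would re-read the whole batch. Already-handled events can then be delivered again, which trades duplicates for not skipping in-flight data.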