[WIP] Issues with seqNo and Eventhub Retention #408
Closed
It looks like we're having some issues with Event Hub retention. The problem shows up when starting a Spark job that reads from a consumer group in which some messages have already been pruned because they fell out of the retention window.
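To illustrate the failure mode, here is a minimal, hypothetical sketch (the names `PartitionRuntimeInfo` and `resolveStartingSeqNo` are illustrative, not the connector's actual API): if the requested starting seqNo has already been pruned by retention, the reader has to fall back to the earliest sequence number that is still available instead of asking for events that no longer exist.

```scala
// Hypothetical sketch: clamp a requested starting sequence number to the
// earliest sequence number still retained in the partition.
case class PartitionRuntimeInfo(beginSequenceNumber: Long, lastEnqueuedSequenceNumber: Long)

def resolveStartingSeqNo(requested: Long, info: PartitionRuntimeInfo): Long = {
  if (requested < info.beginSequenceNumber) {
    // The requested events fell out of retention; start from the earliest retained event.
    info.beginSequenceNumber
  } else {
    requested
  }
}
```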
First I need to fix another issue with the Simulator, which I'll do in a separate PR to keep everything nicely separated. Currently, the latest seqNo is implemented by taking the size() of the messages within a given partition. This is not right: we want the max of the seqNos of the actual messages (see the sketch below).

azure-event-hubs-spark/core/src/main/scala/org/apache/spark/eventhubs/utils/SimulatedEventHubs.scala, line 252 in 61aba38
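A minimal sketch of the intended change, with simplified types (`SimulatedEvent` and `SimulatedPartition` are stand-ins for the real structures in SimulatedEventHubs.scala):

```scala
// Simplified stand-in for an event held by the simulator.
case class SimulatedEvent(seqNo: Long, body: String)

class SimulatedPartition(events: Seq[SimulatedEvent]) {
  // Current behavior: uses the number of stored messages, which only matches
  // the latest seqNo if seqNos start at 0 and nothing has ever been pruned.
  def latestSeqNoBySize: Long = events.size.toLong

  // Intended behavior: take the max seqNo actually present in the partition,
  // which stays correct even after earlier events have been pruned.
  def latestSeqNo: Long =
    if (events.isEmpty) 0L else events.map(_.seqNo).max
}
```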
Thanks for contributing! We appreciate it :)
For a Pull Request to be accepted, you must:
- Format your code using the .scalafmt.conf present in this project
- Make sure mvn clean test passes

Just in case, here are some tips that could prove useful when opening a pull request: