Open

erik-stephens opened this issue on Jan 11 · 1 comment

Labels: enhancement, priority:high
erik-stephens commented on Jan 11
The new kafka slicer will need to know the offsets up to which the worker has consumed. There needs to be a way for the worker to communicate that back to the slicer. Does such a mechanism already exist? I chatted with @jsnoble and he didn't think so.
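For discussion, here is a rough sketch of what that worker → slicer hand-off could look like: the worker reports the highest offset it consumed per partition as part of its slice result, and the slicer folds that into its view of committed offsets. This is not an existing Teraslice API; all names here (`SliceResult`, `OffsetTracker`, etc.) are hypothetical and only illustrate the shape of the mechanism being asked about.

```typescript
// Hypothetical types -- not actual Teraslice interfaces.
interface SliceResult {
  sliceId: string;
  // Highest offset the worker consumed, keyed by partition.
  offsets: Record<number, number>;
}

// Slicer-side tracker that folds worker results into committed offsets.
class OffsetTracker {
  private committed = new Map<number, number>();

  // Called when a worker reports a completed slice.
  recordSliceCompleted(result: SliceResult): void {
    for (const [partition, offset] of Object.entries(result.offsets)) {
      const p = Number(partition);
      const current = this.committed.get(p) ?? -1;
      if (offset > current) {
        this.committed.set(p, offset);
      }
    }
  }

  // Where the slicer should resume a partition from.
  nextOffset(partition: number): number {
    return (this.committed.get(partition) ?? -1) + 1;
  }
}
```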
kstaken added the enhancement and priority:high labels on May 30

kstaken commented on May 30
We need a solution for storing offsets for the new kafka reader.
I think there are a couple of possible solutions:

1. Store the committed offsets in the state record. On the surface this looks good, however state records are associated with a particular execution, so stopping/restarting a job would result in the offsets being lost. It would require a job _recover to restore them, which is not really viable operationally. A positive attribute, though, is that this brings the ability to run-once jobs and to recover offsets.
2. Use a general state storage mechanism to store the offsets. This would build on the general state storage mechanism we've been discussing, keeping the offsets in external storage (likely ES). With this approach the job could recover automatically, but you lose the recovery benefits of storing explicit offsets. See the sketch below.
3. Maybe a combination of both mechanisms. The downside is that these would not be atomic update operations, so the risk of inconsistency would be high.

kstaken changed the title from "Communicate slice metadata to slicer" to "Offset storage mechanism for new kafka reader" on May 30
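A rough sketch of option 2 above, assuming a generic state-storage interface rather than the actual Teraslice state-storage API (which doesn't exist yet); `StateStorage`, `KafkaOffsetStore`, and the key scheme are all hypothetical. An ES index would be one possible backend for the interface.

```typescript
// Hypothetical interfaces -- not the actual Teraslice state-storage API.
// The backend would likely be ES in practice; here it is abstracted away.
interface StateStorage {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

interface TopicOffsets {
  [partition: number]: number; // last committed offset per partition
}

// Persists kafka offsets keyed by job (not execution), so a stopped and
// restarted job can pick up where it left off without a job _recover.
class KafkaOffsetStore {
  constructor(
    private storage: StateStorage,
    private jobId: string,
    private topic: string,
  ) {}

  private key(): string {
    return `kafka-offsets:${this.jobId}:${this.topic}`;
  }

  async load(): Promise<TopicOffsets> {
    const raw = await this.storage.get(this.key());
    return raw ? (JSON.parse(raw) as TopicOffsets) : {};
  }

  // Read-modify-write, so concurrent writers could still race -- this is
  // the non-atomic-update concern raised in option 3.
  async commit(offsets: TopicOffsets): Promise<void> {
    const current = await this.load();
    await this.storage.set(this.key(), JSON.stringify({ ...current, ...offsets }));
  }
}
```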