patchwork01 changed the title from "Clarify behaviour when dispatcher encounters a partially assigned batch" to "Clarify behaviour when compaction dispatcher encounters a partially assigned batch" on Nov 18, 2024
patchwork01 changed the title from "Clarify behaviour when compaction dispatcher encounters a partially assigned batch" to "Handle partially assigned batches in compaction dispatcher" on Nov 19, 2024
Background
Split from:
When the dispatcher receives a batch of compaction jobs, it checks the state store to see whether every job has been assigned to its input files. At the time of writing, it only sends the batch if all of the jobs have been assigned successfully.
It may be possible to put the system into a state where some of the jobs have been assigned to their input files and some have not.
With the transaction log state store, file assignments are done for a whole batch in a single transaction, which will either succeed or fail. This means that we can't get to a state where a batch is partially assigned to its input files.
With the DynamoDB state store, or any state store that splits file assignment into multiple transactions, the file assignment for a single batch may be done gradually. The dispatcher may see a state where some of the jobs have had their files assigned successfully but others have not. The file assignments may still be in progress, or may have failed completely.
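To make the difference concrete, here is a minimal sketch of the two assignment styles against a toy in-memory map. It is not Sleeper's StateStore API: AssignmentSketch, Job, assignBatchAtomically and assignBatchPerJob are all hypothetical names, and a real per-job implementation would also be observable part-way through the batch, which this single-threaded model does not show.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model of file assignment. All names are hypothetical,
// none of this is Sleeper's real StateStore API.
final class AssignmentSketch {
    // Maps each input file to the ID of the job it is assigned to.
    final Map<String, String> jobIdByFile = new HashMap<>();

    // A job is an ID plus the input files it wants to claim.
    static final class Job {
        final String id;
        final List<String> files;

        Job(String id, List<String> files) {
            this.id = id;
            this.files = files;
        }
    }

    // Single-transaction style: apply the batch to a copy and only
    // publish the copy if every job in the batch could be assigned.
    boolean assignBatchAtomically(List<Job> batch) {
        Map<String, String> working = new HashMap<>(jobIdByFile);
        for (Job job : batch) {
            if (!tryAssign(working, job)) {
                return false; // nothing was published, so no partial state
            }
        }
        jobIdByFile.clear();
        jobIdByFile.putAll(working);
        return true;
    }

    // Per-job style: each job is its own transaction, so earlier jobs
    // stay assigned even when a later one fails.
    int assignBatchPerJob(List<Job> batch) {
        int assigned = 0;
        for (Job job : batch) {
            if (tryAssign(jobIdByFile, job)) {
                assigned++;
            }
        }
        return assigned;
    }

    // Assigns all of a job's files, or none if any file is already taken.
    private static boolean tryAssign(Map<String, String> state, Job job) {
        for (String file : job.files) {
            if (state.containsKey(file)) {
                return false;
            }
        }
        for (String file : job.files) {
            state.put(file, job.id);
        }
        return true;
    }
}
```

With one file in the batch already claimed by another job, the atomic version leaves the state untouched, while the per-job version leaves the rest of the batch assigned. That second outcome is the partially assigned state this issue is about.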
Description
We'd like to define how the system should behave when a pending batch of compaction jobs has been partially assigned to its input files.
Analysis
At the time of writing, CompactionJobDispatcher calls StateStore.isAssigned, which throws an exception if any of the file references are assigned to a different job or have been removed. We probably need to change this so that the dispatcher can handle that case explicitly. At a minimum, we can write a unit test for CompactionJobDispatcher to show its current behaviour in this case.
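One possible direction is a check that reports a status per job instead of throwing. The sketch below is only an illustration of that idea. The Status enum, statusOf, and the map used to stand in for the state store are all hypothetical and do not reflect the real StateStore interface. A missing map entry models a removed file reference, and an empty Optional models a file that exists but has not been assigned yet.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical non-throwing assignment check. Not Sleeper's real API.
final class JobAssignmentStatusSketch {
    enum Status { ASSIGNED, NOT_YET_ASSIGNED, FAILED }

    // assignedJobByFile stands in for the state store:
    //   key absent      -> the file reference has been removed
    //   Optional.empty  -> the file exists but is not assigned to any job
    //   Optional.of(id) -> the file is assigned to job `id`
    static Status statusOf(String jobId, List<String> inputFiles,
                           Map<String, Optional<String>> assignedJobByFile) {
        boolean allAssigned = true;
        for (String file : inputFiles) {
            Optional<String> assignedJob = assignedJobByFile.get(file);
            if (assignedJob == null) {
                return Status.FAILED; // file reference removed
            }
            if (!assignedJob.isPresent()) {
                allAssigned = false; // may still be in progress
            } else if (!assignedJob.get().equals(jobId)) {
                return Status.FAILED; // claimed by a different job
            }
        }
        return allAssigned ? Status.ASSIGNED : Status.NOT_YET_ASSIGNED;
    }
}
```

The two conditions that make isAssigned throw today both map to FAILED here, which keeps them distinguishable from a job whose assignment may simply not have happened yet.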
We could consider some extra handling so that when we get a batch with some jobs assigned and some not, we send just the jobs that have been assigned. We'd need to decide what to do with the rest of the batch, as we don't know whether its file assignments are still in progress or have failed completely.
We could wait until all the input files have definite assignments before we proceed with the batch. We could return it to the pending queue with a delay when we see a partially assigned state. If we see a state where all assignments are known but not all the jobs are assigned successfully, we could just send the successfully assigned jobs.
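The wait-then-send option could look roughly like the sketch below. It assumes each job in the batch has already been classified as assigned, not yet assigned, or failed. That classification, and every name here (Status, Action, Decision, decide), is hypothetical rather than existing dispatcher code. What to do with the failed jobs is still an open question, so the sketch simply leaves them out of the send list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical decision logic for a pending batch. Not existing Sleeper code.
final class BatchDecisionSketch {
    enum Status { ASSIGNED, NOT_YET_ASSIGNED, FAILED }
    enum Action { SEND, RETRY_LATER }

    static final class Decision {
        final Action action;
        final List<String> jobIdsToSend;

        Decision(Action action, List<String> jobIdsToSend) {
            this.action = action;
            this.jobIdsToSend = jobIdsToSend;
        }
    }

    static Decision decide(Map<String, Status> statusByJobId) {
        List<String> assigned = new ArrayList<>();
        for (Map.Entry<String, Status> entry : statusByJobId.entrySet()) {
            if (entry.getValue() == Status.NOT_YET_ASSIGNED) {
                // At least one assignment is still unknown: return the whole
                // batch to the pending queue with a delay and look again later.
                return new Decision(Action.RETRY_LATER, Collections.emptyList());
            }
            if (entry.getValue() == Status.ASSIGNED) {
                assigned.add(entry.getKey());
            }
        }
        // Every assignment is now known: send only the jobs that succeeded.
        // Failed jobs are left out (assumption: handling them is undecided).
        return new Decision(Action.SEND, assigned);
    }
}
```

One consequence of this shape is that a batch where every job failed produces a SEND decision with an empty list, so the caller would need to treat that as a no-op. A retry limit would also be needed so that a batch whose assignments never complete does not cycle through the pending queue forever.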