Conversation

2010YOUY01 (Contributor)

Which issue does this PR close?

Rationale for this change

Background for memory-limited sort execution

See figures in https://github.com/apache/datafusion/blob/main/datafusion/physical-plan/src/sorts/sort.rs

Current limitation for reading back spill files

Let's say the memory buffer can only hold 10 batches, but there are 900 spilled files: the current implementation will try to merge all 900 files at once and fail the query.
However, this scenario can still make progress if we merge only 10 files at a time and re-spill, repeating until fewer than 10 spill files remain in total, and finally read them back and merge them as the output.

High-level approach of this PR

Added one configuration option for the max spill merge degree (not implemented yet; in this POC it is a hard-coded const MAX_SPILL_MERGE_DEGREE for simplicity)
At the final stage of external sort, there are initial spill files to merge; perform a multi-pass read-merge-respill operation, where the number of merged spill files produced by the next pass is decided by the closest power of MAX_SPILL_MERGE_DEGREE

Example:
Initial spill files to merge: 900
max merge degree: 10

pass 1: merge 900 files into 100 files (the closest power of 10)
pass 2: 100 -> 10
pass 3: 10 -> final single output

Inside each pass, the number of files to merge in each step is split as evenly as possible while always staying <= the max merge degree; see details in the implementation.
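
The grouping arithmetic described above can be sketched roughly as follows (illustrative code only, with hypothetical function names; this is not the PR's actual implementation):

    fn next_pass_file_count(num_files: usize, max_degree: usize) -> usize {
        // Largest power of `max_degree` strictly smaller than `num_files`,
        // i.e. the number of output files the next pass should produce.
        let mut target = 1;
        while target * max_degree < num_files {
            target *= max_degree;
        }
        target
    }

    fn split_into_groups(num_files: usize, num_groups: usize) -> Vec<usize> {
        // Split `num_files` into `num_groups` merge steps as evenly as possible;
        // every group ends up with at most `max_degree` files.
        let base = num_files / num_groups;
        let remainder = num_files % num_groups;
        (0..num_groups)
            .map(|i| if i < remainder { base + 1 } else { base })
            .collect()
    }

    fn main() {
        let max_degree = 10;
        let mut files = 900;
        // Prints the pass sequence 900 -> 100 -> 10 -> 1 from the example above.
        while files > 1 {
            let target = next_pass_file_count(files, max_degree);
            let groups = split_into_groups(files, target);
            println!(
                "merge {files} files into {target} (largest group: {} files)",
                groups.iter().max().unwrap()
            );
            files = target;
        }
    }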

What changes are included in this PR?

Updated sort.rs for multi-pass reading of spills. The entry point for the described logic is the function merge_spilled_files_multi_pass().

Are these changes tested?

To be done. I think the plan is to add a varying max_spill_merge_degree to the sort fuzzer.

Are there any user-facing changes?

No

github-actions bot added the core (Core DataFusion crate) label on Apr 7, 2025
@rluvaton (Member) commented Apr 7, 2025

BTW, row_hash uses the sort-preserving merge stream as well and has a similar problem; I think the solution should live outside the sort exec

}

/// Maximum number of spill files to merge in a single pass
const MAX_SPILL_MERGE_DEGREE: usize = 8;
Member:

This should be configurable based on the number of available tokio blocking tasks I think

Contributor (Author):

Yeah, it's only for POC.

@rluvaton (Member) commented Apr 7, 2025

Also, to have a fully working larger-than-memory sort, you need to spill in

.try_grow(get_record_batch_memory_size(&batch))?;

in case the memory reservation fails.

@2010YOUY01 (Contributor, Author)

This PR and #15608 both implement multi-level merge for SortExec, but for different purposes:

This PR

  • This PR lets memory-limited sort queries run even if the memory budget is very tight (i.e. num-spill-files * batch-size > memory limit)
  • Always re-spills after each merge step

#15608

  • Reduces merge degree for performance (reading spills will stall for a shorter amount of time)
  • Never re-spills

I think we should refine the existing PR to:

  1. Prioritize stable execution of memory-limited queries over performance.
    • I think the optimizations mentioned below are somewhat complex. We should first resolve the remaining known correctness issues in external sort, strengthen the tests, and then proceed with later optimizations more confidently.
  2. Be extensible for future performance optimizations

To summarize, I think this PR needs to be restructured to make future optimizations easier to implement. I don’t have a solid idea yet, so I’ll keep thinking and also wait to hear more opinions.

@2010YOUY01 (Contributor, Author)

BTW, row_hash uses the sort-preserving merge stream as well and has a similar problem; I think the solution should live outside the sort exec

I think the spilling-related problem in external aggregation is still essentially larger-than-memory sort; the current aggregation implementation tries to re-implement the sort spilling logic that is already done in ExternalSorter. So this implementation is reusable by row_hash (with some modifications).

@2010YOUY01 (Contributor, Author)

Also, to have a fully working larger-than-memory sort, you need to spill in

.try_grow(get_record_batch_memory_size(&batch))?;

in case the memory reservation fails.

Could you elaborate? I don't get it.

@alamb mentioned this pull request on Apr 7, 2025
// Recursively merge spilled files
// ──────────────────────────────────────────────────────────────────────
let spill_files = std::mem::take(&mut self.finished_spill_files);
let spill_files = self.recursively_merge_spill_files(spill_files).await?;
Contributor:

Maybe we can avoid recursion here if we don't have to use it?

The maximum number of passes of multi-pass external merge sort is "Total Passes = 1 (initial run) + ⌈log_d (number of runs)⌉" for a d-way merge. We can use this information to convert the recursion into a for loop (recursion has several performance disadvantages compared to a loop).
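
A minimal illustration of that bound (assumed helper, not the PR's code): since the pass count is at most ⌈log_d(number of runs)⌉, the merge driver can be a plain loop with a precomputed upper bound.

    fn max_merge_passes(num_runs: usize, degree: usize) -> u32 {
        // Integer computation of ceil(log_degree(num_runs)).
        assert!(degree >= 2);
        let mut passes = 0;
        let mut capacity = 1usize;
        while capacity < num_runs {
            capacity *= degree;
            passes += 1;
        }
        passes
    }

    fn main() {
        assert_eq!(max_merge_passes(900, 10), 3); // 900 spilled runs, 10-way merge
        assert_eq!(max_merge_passes(10, 10), 1);
        assert_eq!(max_merge_passes(1, 10), 0);
    }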

}

/// Recursively merges and re-spills files until the number of spill files is ≤ MAX_SPILL_MERGE_DEGREE
async fn recursively_merge_spill_files(
@qstommyshu (Contributor) commented Apr 9, 2025:

I find this name misleading; it looks like this is not actually a recursive function. Maybe we can change it to something more descriptive?

Contributor (Author):

Good point, updated. I changed my mind midway through implementing this function.

let sort_exprs: LexOrdering = self.expr.iter().cloned().collect();

// ==== Doing sort-preserving merge on input partially sorted streams ====
let spm_stream = StreamingMergeBuilder::new()
Contributor:

I wonder if this StreamingMergeBuilder uses a heap under the hood; using a heap is a common way to optimize external merge sort performance.

Contributor (Author):

Yes, it's an in-house implementation of a loser-tree heap. If we don't limit the merge degree, this step is currently the bottleneck for large sort queries; maybe there is some room to optimize inside 🤔

Contributor:

It is already pretty well optimized (not that it couldn't be made better) but there isn't a lot of low hanging fruit in my opinion
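
For readers unfamiliar with the technique, here is a generic k-way merge over sorted inputs using std::collections::BinaryHeap. It only illustrates the idea; DataFusion's StreamingMergeBuilder uses its own loser tree over row cursors and is not this code.

    use std::cmp::Reverse;
    use std::collections::BinaryHeap;

    fn k_way_merge(inputs: Vec<Vec<i32>>) -> Vec<i32> {
        // Min-heap of (value, input index, position within that input).
        let mut heap = BinaryHeap::new();
        for (i, input) in inputs.iter().enumerate() {
            if let Some(&v) = input.first() {
                heap.push(Reverse((v, i, 0usize)));
            }
        }
        let mut out = Vec::new();
        while let Some(Reverse((v, i, pos))) = heap.pop() {
            out.push(v);
            if let Some(&next) = inputs[i].get(pos + 1) {
                heap.push(Reverse((next, i, pos + 1)));
            }
        }
        out
    }

    fn main() {
        let merged = k_way_merge(vec![vec![1, 4, 7], vec![2, 5, 8], vec![3, 6, 9]]);
        assert_eq!(merged, (1..=9).collect::<Vec<_>>());
    }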

@qstommyshu (Contributor) left a comment:

Just some minor issues; the algorithm itself looks good to me in general. I can take a closer look at the details if needed (I'm not familiar with this part of the codebase yet, but I'll try my best to provide good review comments).

And some other thoughts:

  1. This is a pretty complicated program; maybe we should write some unit tests to make sure future modifications don't break it?
  2. One idea to improve performance is to dynamically calculate the optimal merge degree based on file size and memory size, or maybe to multi-thread the merge phase (not sure if that is feasible)

let mut sorted_stream =
let sorted_stream =
self.in_mem_sort_stream(self.metrics.baseline.intermediate())?;
debug!("SPM stream is constructed");
Contributor:

Clean up debug logs if they are not needed?

@alamb (Contributor) left a comment:

This is very cool -- thank you @2010YOUY01 and @rluvaton and @qstommyshu

I think in general (this can be done as a follow-on PR) we will need to introduce some parallelism as well to really maximize performance.

Specifically, the merge operation is fundamentally single-threaded (the hot loop in the merge). Thus I would expect merging one set of 10 files to likely be CPU-bottlenecked.

So we probably would need to try to merge multiple sets of 10 files in parallel (to multiple different output files) before we become bottlenecked on either CPU or I/O throughput.
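
A rough sketch of that parallelism idea (not part of this PR; merge_group here is a placeholder for a single sort-preserving merge plus re-spill, and real code would use tokio blocking tasks on spill files rather than threads on vectors):

    use std::thread;

    fn merge_group(group: Vec<Vec<i32>>) -> Vec<i32> {
        // Placeholder for a streaming k-way merge of one group of sorted runs.
        let mut out: Vec<i32> = group.into_iter().flatten().collect();
        out.sort();
        out
    }

    fn merge_groups_in_parallel(groups: Vec<Vec<Vec<i32>>>) -> Vec<Vec<i32>> {
        // Merge each group on its own thread, producing one output run per group.
        thread::scope(|s| {
            let handles: Vec<_> = groups
                .into_iter()
                .map(|group| s.spawn(move || merge_group(group)))
                .collect();
            handles.into_iter().map(|h| h.join().unwrap()).collect()
        })
    }

    fn main() {
        let groups = vec![
            vec![vec![1, 3], vec![2, 4]],
            vec![vec![5, 7], vec![6, 8]],
        ];
        assert_eq!(
            merge_groups_in_parallel(groups),
            vec![vec![1, 2, 3, 4], vec![5, 6, 7, 8]]
        );
    }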

What I think would really help make progress in this area is a benchmark. I filed a ticket to track this issue:

@2010YOUY01 (Contributor, Author)

Thank you all for the review!

@qstommyshu I agree with the implementation-level feedback. I will address it in the refactor.

@alamb Regarding parallel merging: I was thinking that if max_spill_merge_degree is configured to 10, then memory is so limited that each partition can only hold 10 batches at the same time, so parallel merging is not possible in that case.
However, @rluvaton's PR has made me realize that it's possible for each operator to hold 100 batches under the memory limit at the same time, yet we might still want to merge them only 10 at a time for performance.

I think the next steps are

  1. Contribute benchmarks for external sort.
  2. Refactor this PR to avoid always re-spilling, and also do parallel merging when possible.

2010YOUY01 marked this pull request as draft on April 10, 2025
@2010YOUY01 (Contributor, Author)

And some other thoughts:

  1. This is a pretty complicated program; maybe we should write some unit tests to make sure future modifications don't break it?

I'll try to do most of the testing and cover edge cases in integration tests at https://github.com/apache/datafusion/blob/main/datafusion/core/tests/fuzz_cases/sort_fuzz.rs and https://github.com/apache/datafusion/blob/main/datafusion/core/tests/fuzz_cases/sort_query_fuzz.rs, instead of doing extensive UTs.

I think we should promote tests to a higher level (SQL) when possible, because that API is much more stable and easier to manage. If a feature is tested mostly through unit tests, and someone later refactors the component away, those tests are likely to get lost, since the person refactoring might assume the feature is already covered by integration tests.

I first heard this idea in a talk by the DuckDB developers https://youtu.be/BgC79Zt2fPs?si=WiziGqJ8Dlz6-MMW

@alamb (Contributor) commented Apr 10, 2025

Yes, I totally agree that, when possible, SQL (or dataframe) is a better level to test at (and it is also the API that most users care about, not the internal details).

github-actions bot added the documentation, sqllogictest (.slt), common, and execution labels on Apr 12, 2025
| datafusion.execution.skip_physical_aggregate_schema_check | false | When set to true, skips verifying that the schema produced by planning the input of `LogicalPlan::Aggregate` exactly matches the schema of the input plan. When set to false, if the schema does not match exactly (including nullability and metadata), a planning error will be raised. This is used to workaround bugs in the planner that are now caught by the new schema verification step. |
| datafusion.execution.sort_spill_reservation_bytes | 10485760 | Specifies the reserved memory for each spillable sort operation to facilitate an in-memory merge. When a sort operation spills to disk, the in-memory data must be sorted and merged before being written to a file. This setting reserves a specific amount of memory for that in-memory sort/merge process. Note: This setting is irrelevant if the sort operation cannot spill (i.e., if there's no `DiskManager` configured). |
| datafusion.execution.sort_in_place_threshold_bytes | 1048576 | When sorting, below what size should data be concatenated and sorted in a single RecordBatch rather than sorted in batches and merged. |
| datafusion.execution.sort_max_spill_merge_degree | 16 | When doing external sorting, the maximum number of spilled files to read back at once. Those read files in the same merge step will be sort-preserving-merged and re-spilled, and the step will be repeated to reduce the number of spilled files in multiple passes, until a final sorted run can be produced. |
Contributor:

Great attention to detail in updating the user guide!
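
For reference, on this PR's branch a user could tune the new option via the string key shown in the table above (a sketch; the key does not exist on main, and set_usize mirrors the configuration snippet that appears later in this thread):

    use datafusion::prelude::SessionConfig;

    fn session_config_with_merge_degree() -> SessionConfig {
        // Limit each merge step to reading back at most 8 spill files.
        SessionConfig::new()
            .set_usize("datafusion.execution.sort_max_spill_merge_degree", 8)
    }

As with other execution options, it should also be settable per session with a SQL SET statement on that key.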

@qstommyshu (Contributor)

Intended optimization

If the memory pool is large enough to hold more batches at a time (while sort_max_spill_merge_degree is still limited to 4, in case a too-large merge degree hurts performance in some cases), one additional config sort_buffer_batch_capacity is introduced and set to 16 in the above example; the execution will look like: ...

Thanks for the clear explanation; that's a lot of great work, and it looks really cool!

@alamb (Contributor) commented Apr 13, 2025

I plan to re-review this tomorrow

/// preserving-merged and re-spilled, and the step will be repeated to reduce
/// the number of spilled files in multiple passes, until a final sorted run
/// can be produced.
pub sort_max_spill_merge_degree: usize, default = 16
@rluvaton (Member) commented Apr 13, 2025:

I have a concern about this: there can still be memory issues if the batches from all streams together exceed the memory limit.

I have an implementation for this that is completely memory safe, and I will try to create a PR for it as inspiration.

The way to decide on the degree is actually to store, for each spill file, the largest amount of memory a single record batch has taken, and then when deciding on the degree, you simply grow until you can no longer.

The reason why I'm picky about this is that it is a new configuration that will be hard to deprecate or change
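
A sketch of that degree-selection idea (illustrative only, not #15700's code; the per-file max_batch_bytes values and the try_grow callback are assumptions standing in for spill-file statistics and a MemoryReservation):

    fn choose_merge_degree(
        max_batch_bytes_per_file: &[usize],
        mut try_grow: impl FnMut(usize) -> bool, // stand-in for MemoryReservation::try_grow
        hard_cap: usize,
    ) -> usize {
        // Grow the reservation one spill file at a time until it refuses.
        let mut degree = 0;
        for &bytes in max_batch_bytes_per_file {
            if degree >= hard_cap || !try_grow(bytes) {
                break;
            }
            degree += 1;
        }
        // Merge at least two files so each pass makes progress.
        degree.max(2)
    }

    fn main() {
        // 100 MB budget, each file's largest batch is 30 MB -> merge degree 3.
        let mut remaining = 100 * 1024 * 1024;
        let files = vec![30 * 1024 * 1024; 10];
        let degree = choose_merge_degree(
            &files,
            |bytes| {
                if bytes <= remaining {
                    remaining -= bytes;
                    true
                } else {
                    false
                }
            },
            usize::MAX,
        );
        assert_eq!(degree, 3);
    }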

Member:

@2010YOUY01 and @alamb, I hope you will look at #15700 before merging this PR, to see what I mean.

Contributor (Author):

The reason why I'm picky about this is that it is a new configuration that will be hard to deprecate or change

This is a solid point. This option is intended to be manually set, and it has to ensure (max_batch_size * per_partition_merge_degree * partition_count) < total_memory_limit. If it's set correctly for a query, then the query should succeed.
The problem is the ever-growing number of configurations in DataFusion; it seems impossible to set them all correctly. Enabling the parallel-merging optimization would require introducing yet another configuration, which I'm also trying to avoid (though the too-many-configs problem might be a harsh reality we must accept).
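
For a concrete (hypothetical) example of that constraint: with an 8 MB max batch size, the default merge degree of 16, and 8 partitions, the user would need to budget at least 8 MB × 16 × 8 = 1 GB for the merge phase alone, and that budget has to be re-derived whenever any of the three factors changes.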

Member:

I'm keeping my alternative approach open, as it seems to work in limited-memory environments as well (I tested it locally with more data).

@rluvaton (Member) commented Apr 13, 2025

Also, to have a fully working larger-than-memory sort, you need to spill in

.try_grow(get_record_batch_memory_size(&batch))?;

in case the memory reservation fails.

Could you elaborate? I don't get it.

Maybe the description for #15700 might help

@2010YOUY01 (Contributor, Author)

Also, to have a fully working larger-than-memory sort, you need to spill in

.try_grow(get_record_batch_memory_size(&batch))?;

in case the memory reservation fails.

Could you elaborate? I don't get it.

Maybe the description for #15700 might help

Thank you for providing an alternative approach.

I described my primary concern in #15700 (comment): I think it is not realistic to precisely determine a batch's memory size after a spilling round trip, due to the implementation complexity. In such cases, if the estimation is off by a factor of 2, the actual memory usage could also increase by a factor of 2, which is not ideal.

@rluvaton (Member)

Thank you. Can you please take the fuzz test that I created in my PR and add it to yours, making sure it will pass? (It will require you to update the row_hash.rs file.)

@alamb (Contributor) commented Apr 15, 2025

Benchmark results: (I think there is no significant regression for an extra round of re-spill, if it's running on a machine with fast SSDs)

It seems to me that there is a 30% regression in performance compared to main when there is enough memory, right?

Result

Main (1.2G):
Q7 avg time: 8680.47 ms

PR (1.2G):
Q7 avg time: 11808.71 ms

But this PR is significantly better in that it can complete with only 500M of memory

Is there any way to regain the performance (maybe by choosing how many merge phases to do based on available memory rather than a fixed size)?

@alamb (Contributor) commented Apr 15, 2025

Thank you. Can you please take the fuzz test that I created in my PR and add it to yours, making sure it will pass? (It will require you to update the row_hash.rs file.)

@rluvaton, is there any way to make a PR with only the fuzz test in it (perhaps with some comments on what would pass/fail once we have this multi-pass algorithm)?

@rluvaton (Member)

Thank you. Can you please take the fuzz test that I created in my PR and add it to yours, making sure it will pass? (It will require you to update the row_hash.rs file.)

@rluvaton, is there any way to make a PR with only the fuzz test in it (perhaps with some comments on what would pass/fail once we have this multi-pass algorithm)?

@alamb created #15727

@rluvaton (Member) commented Apr 15, 2025

I tested my fuzz tests with this PR and all of them are currently failing.

@2010YOUY01 (Contributor, Author)

Benchmark results: (I think there is no significant regression for an extra round of re-spill, if it's running on a machine with fast SSDs)

It seems to me that there is a 30% regression in performance compared to main when there is enough memory, right?

Result

Main (1.2G):
Q7 avg time: 8680.47 ms
PR (1.2G):
Q7 avg time: 11808.71 ms

But this PR is significantly better in that it can complete with only 500M of memory

Is there any way to regain the performance (maybe by choosing how many merge phases to do based on available memory rather than a fixed size)?

If we manually set this max merge degree to a larger value, the merging behavior will be equivalent to the current implementation:

Q7 iteration 0 took 7242.8 ms and returned 59986052 rows
Q7 iteration 1 took 7203.4 ms and returned 59986052 rows
Q7 iteration 2 took 9812.6 ms and returned 59986052 rows
Q7 avg time: 8086.24 ms

I think auto-tuning is possible, and it is also a good future optimization, but it requires some work to extend the memory pool to estimate the available memory for the current reservation.

@2010YOUY01 (Contributor, Author)

Thank you. Can you please take the fuzz test that I created in my PR and add it to yours, making sure it will pass? (It will require you to update the row_hash.rs file.)

Those tests are great, but I think they're outside the scope of this PR: external aggregation currently uses a different path for handling spills, so the failures are not a regression caused by this PR.
It makes more sense to me to do it as a follow-up: 1. reuse the spill handling code inside external aggregation, 2. make sure those tests pass.

@rluvaton (Member) commented Apr 17, 2025

The PR that I created for fuzz tests also contains tests on sort, which fail here as well.

@2010YOUY01 (Contributor, Author)

I tested my fuzz tests with this PR and all of them are currently failing.

Update: I think the failure is not due to this PR's implementation; instead, it's caused by FairSpillPool's limitation.

After manually setting max_spill_merge_degree to 2, the first 3 tests passed; the 4th one failed with

Error: ResourcesExhausted("Additional allocation failed with top memory consumers (across reservations) as: mock_memory_consumer#2(can spill: false) consumed 1695480 bytes, ExternalSorterMerge[0]#1(can spill: false) consumed 401664 bytes. Error: Failed to allocate additional 297024 bytes for ExternalSorterMerge[0] with 0 bytes already allocated for this reservation - 8 bytes remain available for the total pool

I believe it's a limitation of FairSpillPool that non-spillable consumers are not able to back off, and they can block spilling consumers from normal execution. (And this specific test could also pass, due to some complex interactions, if the runtime memory consumers were set up differently.)

I'll try to come up with a minimal reproducer later.
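
For illustration, the kind of interaction described above could look roughly like this (assumed use of the public memory-pool API; a sketch, not a verified reproducer of the failing fuzz test):

    use std::sync::Arc;
    use datafusion::execution::memory_pool::{FairSpillPool, MemoryConsumer, MemoryPool};

    fn main() {
        // 1 MiB pool shared by an unspillable consumer and a spillable merge consumer.
        let pool: Arc<dyn MemoryPool> = Arc::new(FairSpillPool::new(1024 * 1024));

        // The unspillable consumer grabs most of the budget and never backs off.
        let mut unspillable = MemoryConsumer::new("mock_memory_consumer").register(&pool);
        unspillable.grow(1000 * 1024);

        // The merge reservation now cannot grow enough to hold even one batch.
        let mut merge = MemoryConsumer::new("ExternalSorterMerge")
            .with_can_spill(true)
            .register(&pool);
        assert!(merge.try_grow(512 * 1024).is_err());
    }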

2010YOUY01 force-pushed the cascade-merge-spill branch from 2945466 to 1895b0e on April 18, 2025
@LogicFan

I've tried to use this branch to sort data larger than memory. For a 24 GB Parquet file, it produces the error Error: ArrowError(IoError("No space left on device (os error 28)", Os { code: 28, kind: StorageFull, message: "No space left on device" }), None) (I think this suggests it uses up the entire 100 GB default DiskManager limit?)

Here is the configuration I use:

    let cfg = SessionConfig::new()
        .with_sort_max_spill_merge_degree(2)
        .with_sort_spill_reservation_bytes(1 << 10)
        .with_sort_in_place_threshold_bytes(1 << 10)
        .set_usize("datafusion.execution.batch_size", 16)
        .set_usize("datafusion.execution.soft_max_rows_per_output_file", 2048)
        .set_usize("datafusion.execution.minimum_parallel_output_files", 32);

    let memory_pool = Arc::new(TrackConsumersPool::new(
        FairSpillPool::new(16 * (1 << 30)),
        NonZeroUsize::new(5).unwrap(),
    ));

@alamb (Contributor) commented May 11, 2025

I've tried to use this branch to sort data larger than memory. For a 24 GB Parquet file, it produces the error Error: ArrowError(IoError("No space left on device (os error 28)", Os { code: 28, kind: StorageFull, message: "No space left on device" }), None) (I think this suggests it uses up the entire 100 GB default DiskManager limit?)

We probably need to enable compression for such datasets (parquet is quite a bit more compressed than Arrow)

@ding-young (Contributor)

This PR has a significant strength in that it works reliably even under a fairly conservative memory limit, which is impressive. I also learned a lot while reviewing it :). However, I do have the following concern about the approach based on user-configured MAX_SPILL_MERGE_DEGREE:

I don’t think adding a user-configurable merge degree is inherently a bad idea. The problem is that the appropriate value for MAX_SPILL_MERGE_DEGREE can vary significantly depending on the query being executed and the characteristics of the spilled RecordBatches. For example, merging many thin batches is very different from merging a few wide batches in terms of memory consumption, and the optimal degree will differ accordingly.

Unless the degree is defined in a way that consistently reflects some notion of aggressiveness or is based on actual memory consumption (e.g., in bytes), it would be hard to offer meaningful tuning guidance to users. In the end, as @rluvaton suggested, I think we do need to estimate something like max_memory_bytes per batch—even if the estimate isn’t very reliable.

Though relying on memory estimation may not be very reliable, it seems unavoidable in the context of multi-pass merge to estimate the memory size of the RecordBatches to be read from each spill file. This is because we need at least a rough estimate to decide how many spill files (or streams) can be read and merged simultaneously, so that we could automatically perform multi-pass merge without user's manual debugging.

What do you think about this? 🤔

Of course, I’m also currently trying to reproduce the case you pointed out in #15700, so if I discover a significant issue there, my opinion may very well change.

@rluvaton (Member) commented Jul 3, 2025

Of course, I’m also currently trying to reproduce the case you pointed out in #15700, so if I discover a significant issue there, my opinion may very well change.

@ding-young I already created a PR that reproduces the issues I outlined:

@2010YOUY01 (Contributor, Author)

This PR has a significant strength in that it works reliably even under a fairly conservative memory limit, which is impressive. I also learned a lot while reviewing it :). However, I do have the following concern about the approach based on user-configured MAX_SPILL_MERGE_DEGREE: [...]

What do you think about this? 🤔

I agree this knob would be very hard to tune manually.

I was thinking the next step could be adding an auto option to the config and automatically determining this max merge degree from stats (average batch size and memory budget). If the estimate is wrong and the memory pool reports OOM, maybe the executor can retry with a small constant merge degree as the fallback path.
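
A sketch of what that auto mode might compute (hypothetical, not implemented in this PR):

    const FALLBACK_MERGE_DEGREE: usize = 4;

    fn auto_merge_degree(memory_budget_bytes: usize, avg_batch_bytes: Option<usize>) -> usize {
        match avg_batch_bytes {
            // Fit as many spill streams as the per-partition budget allows.
            Some(avg) if avg > 0 => (memory_budget_bytes / avg).clamp(2, 1024),
            // No spill statistics available: fall back to a small constant degree.
            _ => FALLBACK_MERGE_DEGREE,
        }
    }

    fn main() {
        // 64 MB budget with ~4 MB average batches -> merge 16 files at a time.
        assert_eq!(auto_merge_degree(64 * 1024 * 1024, Some(4 * 1024 * 1024)), 16);
        assert_eq!(auto_merge_degree(64 * 1024 * 1024, None), 4);
    }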

@2010YOUY01 (Contributor, Author)

I tested my fuzz tests with this PR and all of them are currently failing.

Update: I think the failure is not due to this PR's implementation; instead, it's caused by FairSpillPool's limitation. [...]

I spotted a critical issue in this PR:

This approach works if the memory pool is only used by a single query, since we can guarantee that during the different stages of the sort operator (partial sort -> SPM) the memory limit for a partition stays roughly the same.

However, if a memory pool is shared among many queries (new queries join the pool, finish, and leave), a given partition's memory budget can vary over its lifecycle. For example, a sort operator might have a 2 GB memory budget while doing the partial sort, but only 1 GB when it moves to the SPM phase, because a new memory-consuming neighbor has joined in the meantime.

As a result, SPM has to be inherently spillable.

I plan to close this PR and help get #15700 merged. I have a rough idea for a patch addressing my previous concern, and I'll share my ideas there.

cc @rluvaton @ding-young

Successfully merging this pull request may close these issues:

A complete solution for stable and safe sort with spill