Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quadratic-time run scanning #8

Open
mlochbaum opened this issue Feb 19, 2023 · 4 comments
Open

Quadratic-time run scanning #8

mlochbaum opened this issue Feb 19, 2023 · 4 comments

Comments

@mlochbaum
Copy link

mlochbaum commented Feb 19, 2023

The minimum run length commit seems to introduce quadratic behavior for runs somewhat shorter than sqrt(n / 2), because the run is repeatedly followed and discarded. If so, this would cause worst-case performance of O(n^(3/2)) by multiplying O(n) time to get past each run by O(sqrt(n)) runs that fit in a length-n array. I took the following timings on a length 1e8 array to confirm that this has a practical impact; the input data is just 0, 1, ... r-1 repeated, for run length r. sqrt(1e8 / 2) is about 7071; strangely, performance improves gradually from about 6000 to that number instead of sharply as I'd expected. The "% create" here is a loose estimate from perf top of fraction of time spent in LogicalRun<B,T>::create, and "Time create" is that multiplied by total time.

Run Time (s) % create Time create
500 2.44 0.20 0.49
1000 2.96 0.25 0.74
2000 3.74 0.38 1.42
4000 4.48 0.45 2.02
@orlp
Copy link
Owner

orlp commented Feb 19, 2023

Ah, I forgot to change the skipping logic in that commit.

If I reject the run I should still skip all those elements, instead of just SMALL_SORT elements. Will fix soon, thanks for reporting.

@mlochbaum
Copy link
Author

mlochbaum commented Feb 19, 2023

Maybe it would be better to do this filtering in logical_merge? That is, if one of the halves of an unsorted/not-unsorted pair has length less than the threshold, it should be treated as unsorted. To match current behavior better you'd also avoid doing a physical merge if the result will be shorter than the threshold, but I'd think that behavior is actually unwanted, or at least the threshold is too high.

@orlp
Copy link
Owner

orlp commented Feb 20, 2023

Hmm, yes, actually I think that's better in general.

@dumblob
Copy link

dumblob commented Jun 21, 2023

It seems the patch did not make it into master yet. Any plans to do so? Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants