Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bounds checks in SDL_qsort #10066

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

aikawayataro
Copy link
Contributor

Description

Updating qsort implementation fixed only part of the non-transitive compare issue.
Using such a compare function should be considered a user code bug, but I believe it's better not to crash the whole program.
I set up a fuzzer and found a few unchecked memory reads and writes. With the proposed changes, qsort will not crash with invalid compare functions.
Fuzzer source code: fuzzer.zip

@sezero sezero requested review from icculus and slouken June 20, 2024 01:49
@madebr
Copy link
Contributor

madebr commented Jun 20, 2024

These checks are insufficient.
When adding extra tests to testqsort.c (madebr@1ea94b0), ci fails:
https://github.com/madebr/SDL/actions/runs/9597768806/job/26467538455

Feel free to use the commit in this pr to verify the fixes.

@aikawayataro
Copy link
Contributor Author

@madebr You're accessing a zero-sized allocated block at line 61 when arraylen=0 😁 there's no memory corruptions from qsort.
As for your test of non-transitive compare, this usecase, which is considered invalid and should not be tested.

@DanielGibson
Copy link
Contributor

As for your test of non-transitive compare, this usecase, which is considered invalid and should not be tested.

It's not valid, but testing it to make sure it at least doesn't crash makes sense (that's what these changes are about, after all).
But you can't expect the data to be sorted afterwards, so verifying any order doesn't make sense (even the "incorrect" order after running qsort with invalid comparator on a given array might not be deterministic in the long term, when the qsort implementation is updated again; and it might even be different depending on wordsize and whatever potentially platform-dependent special cases that implementation handles).

(No idea what exactly goes wrong in the qsort test, unless I missed something the log only mentions an incorrect exit code?)

@madebr
Copy link
Contributor

madebr commented Jun 20, 2024

@madebr You're accessing a zero-sized allocated block at line 61 when arraylen=0 😁 there's no memory corruptions from qsort. As for your test of non-transitive compare, this usecase, which is considered invalid and should not be tested.

Whoops! Good catch.
I think my changes still make sense, with a fix for the bug you noticed.

    if (arraylen > 0) {
        prev = nums[0];
    }

@aikawayataro
Copy link
Contributor Author

aikawayataro commented Jun 20, 2024

It's not valid, but testing it to make sure it at least doesn't crash makes sense (that's what these changes are about, after all).
But you can't expect the data to be sorted afterwards, so verifying any order doesn't make sense (even the "incorrect" order after running qsort with invalid comparator on a given array might not be deterministic in the long term, when the qsort implementation is updated again; and it might even be different depending on wordsize and whatever potentially platform-dependent special cases that implementation handles).

That's what I meant to say, we shouldn't test the order, just the function.

(No idea what exactly goes wrong in the qsort test, unless I missed something the log only mentions an incorrect exit code?)

Because of a bug in the test itself, there's a segfault.

I think my changes still make sense, with a fix for the bug you noticed.

Test runs just fine with your fix (there was a note about a non-existent bug I found, apologies).
Also your arraylen with invalid compare function is too big because it takes crazy amounts of cpu compared to test with valid one. I guess I will add other arraylen values that will cover the whole qsort without very large values. We should also test aligned and unaligned branches. I'll add a test with these remarks in mind.

@aikawayataro
Copy link
Contributor Author

I refactored the test, but honestly I think it does not look reasonable. The qsort used is in fact 3 qsorts for different cases. To test them all requires a lot of hackery (ignore failing build I can't figure out right pointer type for const array pointer).
What I believe is that we should not use qsort_aligned and qsort_words but use only plain unaligned version, which works in any case.

@slouken
Copy link
Collaborator

slouken commented Aug 5, 2024

What's the status of this PR? Is it something we still need?

@aikawayataro
Copy link
Contributor Author

@slouken
The current state is more like a draft. I've introduced tests to test all three implementations of sorting, but it looks crude to me.
What I propose is to get rid of two other qsort implementations and keep only qsort_r_nonaligned. It will be easier to test a single implementation, and I really don't see reason to keep qsort_r_aligned and qsort_r_words.

@slouken slouken marked this pull request as draft August 6, 2024 15:11
@slouken
Copy link
Collaborator

slouken commented Oct 6, 2024

@slouken The current state is more like a draft. I've introduced tests to test all three implementations of sorting, but it looks crude to me. What I propose is to get rid of two other qsort implementations and keep only qsort_r_nonaligned. It will be easier to test a single implementation, and I really don't see reason to keep qsort_r_aligned and qsort_r_words.

I would tend to agree, but before we do that, we should test performance in release mode on a modern processor to see if we get significant speedup from those modes.

@slouken slouken modified the milestones: 3.2.0, 3.x Oct 6, 2024
@aikawayataro
Copy link
Contributor Author

Ok, I will set up a benchmark for this

@aikawayataro
Copy link
Contributor Author

I benchmarked it and here are the results:

gcc 14.2.1 -O3
items=5000
rounds=50000
================================
qsort_r_words vs qsort_r_nonaligned for int
qsort_r_words took 171652
qsort_r_nonaligned took 179765
diff=8113, 8.113000ms
qsort_r_words is faster
================================
qsort_r_aligned vs qsort_r_nonaligned for big_struct sizeof=128
qsort_r_aligned took 213069
qsort_r_nonaligned took 202701
diff=10368, 10.368000ms
qsort_r_nonaligned is faster
gcc 14.2.1 -O2
items=5000
rounds=50000
================================
qsort_r_words vs qsort_r_nonaligned for int
qsort_r_words took 165439
qsort_r_nonaligned took 217981
diff=52542, 52.542000ms
qsort_r_words is faster
================================
qsort_r_aligned vs qsort_r_nonaligned for big_struct sizeof=128
qsort_r_aligned took 332536
qsort_r_nonaligned took 874643
diff=542107, 542.107000ms
qsort_r_aligned is faster

CPU: 12th Gen Intel(R) Core(TM) i7-12700H

Well, it makes things harder, I guess.
It can be observed that the nonaligned version performs better under O3 than the aligned version, but it is much slower under O2 (>x2 slower). words version is slightly faster than the nonaligned version in both cases.

I only compared these 2 pairs because
qsort_r_words only useful when size of item is sizeof(int) and when items buffer aligned as int
qsort_r_aligned only useful when item size is sizeof(X) % sizeof(int) == 0 and items buffer aligned as int (almost always due to padding)
qsort_r_nonaligned will always work
So it narrows everything down to these two "competing" cases.

Benchmark code bench.c.tar.gz

@slouken
Copy link
Collaborator

slouken commented Oct 8, 2024

So it sounds like it's worthwhile keeping all 3 cases. Thanks for the investigation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants