GH-49104: [C++] Fix Segfault in SparseCSFIndex::Equals with mismatched dimensions #49105
Thanks for opening a pull request! If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format? or See also: |
Could you add a test for this case? |
Force-pushed from d60ea08 to 3e1cbd6
@kou On your request, I have added a new test case, Test details: Fix summary: |
Could you fix the lint failure? |
…imit and style guide
I have fixed the lint failures by reformatting SparseCSFIndex::Equals() to comply with the 90-character line limit and Arrow's style guide. All functionality remains unchanged. Please take a look. |
The TEST(TestSparseCSFIndex, EqualsMismatchedDimensions) test created SparseCSFIndex objects with empty tensors (nullptr buffers, 0-length shape), causing segfaults during validation on ASAN/UBSAN and 'front() called on empty vector' errors on MSVC. The typed test TestEqualityMismatchedDimensions already properly validates the fix with valid CSF index structures.
Note: Some packaging/JNI tests are failing due to Docker image naming with my fork. The core C++ tests should be passing. |
Rationale for This Change
The SparseCSFIndex::Equals method can crash when comparing two sparse indices that have a different number of dimensions. The method iterates over the indices() and indptr() vectors of the current object and accesses the corresponding elements in the other object without first verifying that both objects have matching vector sizes. This can lead to out-of-bounds access and a segmentation fault when the dimension counts differ.
What Changes Are Included in This PR?
This change adds explicit size equality checks for the indices() and indptr() vectors at the beginning of the SparseCSFIndex::Equals method. If the dimensions do not match, the method now safely returns false instead of attempting invalid memory access.
Are These Changes Tested?
Yes. The fix has been validated through targeted reproduction of the crash scenario using mismatched dimension counts, ensuring the method behaves safely and deterministically.
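For readers without the diff at hand, the guard pattern described in this PR can be sketched roughly as follows. This is a simplified illustration, not the actual Arrow code: ToyCsfIndex and ToyEquals are hypothetical stand-ins that use plain std::vector buffers instead of Arrow tensors, and the real SparseCSFIndex::Equals may differ in its details.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a CSF index (NOT Arrow's real class): one
// indices buffer per dimension and one indptr buffer per dimension boundary.
struct ToyCsfIndex {
  std::vector<std::vector<int64_t>> indptr;
  std::vector<std::vector<int64_t>> indices;
};

// Hypothetical equality check illustrating the guard: compare the vector
// sizes first so the loops below never index past the end of `b`.
bool ToyEquals(const ToyCsfIndex& a, const ToyCsfIndex& b) {
  if (a.indices.size() != b.indices.size()) return false;
  if (a.indptr.size() != b.indptr.size()) return false;

  for (std::size_t i = 0; i < a.indices.size(); ++i) {
    if (a.indices[i] != b.indices[i]) return false;
  }
  for (std::size_t i = 0; i < a.indptr.size(); ++i) {
    if (a.indptr[i] != b.indptr[i]) return false;
  }
  return true;
}
```

Without the two size checks, comparing an index with two indices buffers against one with three would read b.indices[i] or a.indices[i] out of bounds, depending on which side drives the loop. With them, the mismatched case returns false in either argument order.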
Are There Any User-Facing Changes?
No. This change improves internal safety and robustness without altering public APIs or observable user behavior.