[SPARK-9360][SQL] Support BinaryType in PrefixComparators for UnsafeExternalSort #7676
Conversation
|
Test build #38475 has finished for PR 7676 at commit
|
|
thanks, I'll fix it. |
b9827d6 to
ecf3ac5
Compare
|
retest this please |
|
Test build #197 has finished for PR 7676 at commit
|
|
Test build #39667 has finished for PR 7676 at commit
|
|
Test build #39663 has finished for PR 7676 at commit
|
|
Test build #39698 has finished for PR 7676 at commit
|
|
retest this please |
|
Test build #209 has finished for PR 7676 at commit
|
|
Test build #39718 has finished for PR 7676 at commit
|
Can we get a word directly, similar to how we do it in UTF8String?
I think that is hard to do.
The byte-by-byte access is needed to map signed bytes to unsigned ones. With that mapping, UnsignedLongs#compare orders the prefixes in an order-preserving way, consistent with TypeUtils#compareBinary.
If we used direct word access here, BinaryPrefixComparator#compare would have to fall back to the less efficient TypeUtils#compareBinary.
Why use 1 as the initial value? I think using 0 would be better:
for (int i = 0; i < minLen; i++) {
  p <<= 8;
  p |= 128L + UNSAFE.getByte(bytes, BYTE_ARRAY_OFFSET + i);
}
p <<= (8 - minLen) * 8;
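For readers following the thread, here is a minimal, self-contained sketch of the idea discussed above. It is not the code merged in this PR: the class `BinaryPrefixSketch` and its methods are hypothetical names, plain array indexing stands in for `UNSAFE.getByte`, `Long.compareUnsigned` stands in for `UnsignedLongs#compare`, and `compareSignedBytes` assumes, as the earlier reply implies, that `TypeUtils#compareBinary` compares bytes as signed values.

```java
final class BinaryPrefixSketch {
    // Hypothetical helper: packs up to the first 8 bytes into a long prefix.
    static long computePrefix(byte[] bytes) {
        int minLen = Math.min(bytes.length, 8);
        long p = 0L;
        for (int i = 0; i < minLen; i++) {
            p <<= 8;
            // Shift the signed range [-128, 127] to [0, 255] so that signed
            // byte order matches the unsigned order of the packed word.
            p |= 128L + bytes[i];
        }
        // Left-align short inputs. For minLen == 0 the shift count is 64,
        // which Java reduces to 0; p is already 0 in that case.
        p <<= (8 - minLen) * 8;
        return p;
    }

    // Stand-in for UnsignedLongs#compare (assumption: unsigned 64-bit order).
    static int comparePrefixes(long a, long b) {
        return Long.compareUnsigned(a, b);
    }

    // Assumed reference ordering: signed byte-wise, then by length.
    static int compareSignedBytes(byte[] x, byte[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int c = Byte.compare(x[i], y[i]);
            if (c != 0) return c;
        }
        return Integer.compare(x.length, y.length);
    }

    public static void main(String[] args) {
        byte[] a = {-1, 5};
        byte[] b = {1};
        // Both comparisons should report the same sign for these inputs.
        System.out.println(Integer.signum(comparePrefixes(computePrefix(a), computePrefix(b))));
        System.out.println(Integer.signum(compareSignedBytes(a, b)));
    }
}
```

Equal prefixes do not imply equal values (an empty array and a single byte of -128 both yield prefix 0), so a tie on the prefix still needs the full comparison.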
fixed
|
@rxin The tests failed though, ISTM the |
|
cc @davies for review |
I think we don't need the cast here.
fixed
These are not used here.
Actually, they were: they're needed for BYTE_ARRAY_OFFSET.
This broke the build, so I'm hotfixing now.
Sorry, just missed that, thanks!
|
LGTM, merging this into master and 1.5. |
…ExternalSort

The current implementation of UnsafeExternalSort uses NoOpPrefixComparator for binary-typed data. So, we need to add BinaryPrefixComparator in PrefixComparators.

Author: Takeshi YAMAMURO <linguin.m.s@gmail.com>

Closes #7676 from maropu/BinaryTypePrefixComparator and squashes the following commits:
fe6f31b [Takeshi YAMAMURO] Apply comments
d943c04 [Takeshi YAMAMURO] Add a codegen'd entry for BinaryType in SortPrefix
ecf3ac5 [Takeshi YAMAMURO] Support BinaryType in PrefixComparator

(cherry picked from commit 6d8a6e4)
Signed-off-by: Davies Liu <davies.liu@gmail.com>
|
Test build #39842 has finished for PR 7676 at commit
|
The current implementation of UnsafeExternalSort uses NoOpPrefixComparator for binary-typed data.
So, we need to add BinaryPrefixComparator in PrefixComparators.