Endianness was incorrectly assumed for GroupWord #20
@@ -29,7 +29,7 @@ impl GroupQuery {
         // has pretty much the same effect as a hash collision, something
         // that we need to deal with in any case anyway.

-        let group = GroupWord::from_le_bytes(*group);
+        let group = GroupWord::from_ne_bytes(*group);
Shouldn't the place where the original bytes are written be changed from native endian to little endian instead? This change asserts that the serialization format is endianness dependent, when I think it should use a fixed endianness.
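To make the distinction in this comment concrete, here is a minimal sketch that is not taken from odht. The helpers `encode_fixed`, `decode_fixed`, and `encode_native` are hypothetical names; they only show that the `_le_` conversions give the same byte layout on every host, while the `_ne_` conversions follow the host's byte order.

```rust
/// Hypothetical helper: serialize a word with a fixed (little-endian) byte
/// order. The resulting bytes are identical on every host.
fn encode_fixed(word: u64) -> [u8; 8] {
    word.to_le_bytes()
}

/// Hypothetical helper: inverse of `encode_fixed`.
fn decode_fixed(bytes: [u8; 8]) -> u64 {
    u64::from_le_bytes(bytes)
}

/// Hypothetical helper: serialize with the host's native byte order.
/// The resulting bytes differ between little-endian and big-endian hosts.
fn encode_native(word: u64) -> [u8; 8] {
    word.to_ne_bytes()
}
```

On a little-endian host `encode_native` and `encode_fixed` agree; on a big-endian host such as s390x they produce reversed byte sequences.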
As far as I can tell from some smaller tests, this does not change how the hash table is serialized. That said, there may be a better solution: as far as I can tell, this issue comes from some assumptions about the layout of arrays in memory. The simplest failing test is probably this one: https://github.com/rust-lang/odht/blob/main/src/swisstable_group_query/mod.rs#L58..L73
I believe the underlying format is byte oriented, so endianness doesn't matter in the serialization format.
However, this `GroupWord` is batching the `u8` control words to try to efficiently scan for matches or empties. It's using the bit-oriented `trailing_zeros()` to produce byte-oriented `usize` indexes into the control words. It makes sense that we would want `from_le_bytes` for that, to ensure low bytes are loaded at the correct "trailing" end. So I don't really understand what's happening here, why `from_ne_bytes` is fixing anything.
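The reasoning in this comment can be sketched as follows. This is an illustration under stated assumptions, not odht's actual implementation: `first_matching_byte` is a hypothetical function, the group width of 8 bytes is assumed, and the zero-byte detection is the generic SWAR trick rather than whatever odht uses. It shows why loading with `from_le_bytes` lets `trailing_zeros() / 8` be used directly as an array index on any host.

```rust
/// Hypothetical helper: index of the first byte in `group` equal to `needle`.
fn first_matching_byte(group: [u8; 8], needle: u8) -> Option<usize> {
    // Load so that group[0] always lands in the least significant byte,
    // regardless of the host's endianness.
    let word = u64::from_le_bytes(group);

    // Every byte of `repeated` is `needle`, so the byte order is irrelevant.
    let repeated = u64::from_ne_bytes([needle; 8]);

    // Bytes equal to `needle` become zero after the XOR.
    let cmp = word ^ repeated;

    // Generic SWAR zero-byte detection: sets the high bit of each zero byte.
    // It can report false positives above the first true match, but the
    // lowest set bit is always exact, which is all this function uses.
    let mask = cmp.wrapping_sub(0x0101_0101_0101_0101) & !cmp & 0x8080_8080_8080_8080;

    if mask == 0 {
        None
    } else {
        // Bit offset from the "trailing" end divided by 8 is the byte index.
        Some((mask.trailing_zeros() / 8) as usize)
    }
}
```

For example, `first_matching_byte([1, 2, 3, 7, 5, 7, 0, 0], 7)` selects index 3, the first of the two matching bytes.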
Aha! The values `eq_mask` and `empty_mask` also have a `to_le()` on their construction, so they're effectively flipped twice. We can either flip with `from_le_bytes(..)` or these `to_le()`, but should not do both.
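The double flip can be modeled on any machine with the sketch below. It is only a model under stated assumptions, not odht's code: `to_le_on` is a hypothetical stand-in for `u64::to_le()` that takes the host endianness as an explicit parameter, and the mask is assumed to mark control byte 3 of a word that was already loaded with `from_le_bytes`.

```rust
/// Hypothetical model of `u64::to_le()` for a host of the given endianness:
/// a no-op on little-endian hosts, a byte swap on big-endian hosts.
fn to_le_on(host_is_big_endian: bool, x: u64) -> u64 {
    if host_is_big_endian {
        x.swap_bytes()
    } else {
        x
    }
}

/// Byte index of the lowest set bit in `mask`.
fn first_index(mask: u64) -> usize {
    (mask.trailing_zeros() / 8) as usize
}

/// A mask whose only set bit is the high bit of byte `index`, as it would
/// look for a word loaded with `from_le_bytes` (assumed 8-byte group).
fn mask_for_byte(index: usize) -> u64 {
    0x80u64 << (8 * index)
}
```

With `mask_for_byte(3)`, `first_index` gives 3 directly. Applying the modeled `to_le` for a little-endian host still gives 3, but for a big-endian host the extra swap moves the marker to byte 4, so the wrong slot would be selected.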
That's interesting. Thanks for the PR and bug report, @Erk-! I'll need to take some time to actually understand what's going on here.
I can confirm via qemu that a lot of tests fail before this change:
and all tests pass with this pull request.
I've now tested and confirmed the fix on native s390x hardware as well. It also works if we leave the |
I finally had some time to take a closer look. I concur with @cuviper's diagnosis that this is a mismatch between working with addresses (which depend on endianness) and bit-offsets within a I think the correct solution is to keep Once we have converted the raw bytes into a @Erk-, would you mind updating the PR accordingly?
Once this is merged, I'll add Miri-based regression tests (see #19).
Thanks, @Erk-!
@michaelwoerister, will you be publishing this fix? We should get this updated in rust master and beta to avoid shipping a regression in rust-lang/rust#90123.
…imulacrum Update odht crate to 0.3.1 (big-endian bugfix) Update `odht` to 0.3.1 in order to get rust-lang/odht#20 which fixes issue rust-lang#90123.
@cuviper, the fix was merged into rustc in rust-lang/rust#90403.
Thanks!
This assumption caused incorrect slots to be chosen, which would make the hash table unusable.
Likely also the cause of rust-lang/rust#90123, though I have not tested whether this resolves it.
Tested on Linux on Z, which is big-endian; I do not know whether this happens on other BE systems.