-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add bounds checked get_unchecked, use it everywhere. #231
Merged
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
1368ee9
Add bounds checked get_unchecked, use it everywhere.
5225225 97a0392
Fix unchecked indexing in from_slice_with_buffers
5225225 b85e280
Add debug assertion
5225225 65d30a6
Use correct input to find structural bits
5225225 0fdcdae
Add some more debug asserts, fix misaligned reads
5225225 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to guard this behind the same
if cfg!(debug_assertions)
that is inget_kinda_unchecked
? It doesn't really seem like it's needed during runtime, just during tests to validate theget
s. OTOH, it's probably worth benchmarking if it makes a measurable difference and is worth guarding or fine to just leave in.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because if you just write one byte, you get a use-of-uninitialized-value from memory sanitizer, and I'm sure we're reading uninit bytes into a
u64
(say, read_true_atom run on[t r u e \x00 <uninit> <uninit> <uninit>]
Simply reading uninit bytes into a
u64
is UB (see: rust-lang/unsafe-code-guidelines#71), and memory sanitizer complaining means there's definitely UB here (memory sanitizer is a very conservative check, only checking for branching on uninit bytes).Upstream seems to... just ignore the problem (they run memory sanitizer, but ignore this).
We presumably only need to ensure that
input.len() + SIMDJSON_PADDING
bytes are zero filled, so the user could re-use buffers manually (which would remove both the allocation and the zeroing). As in, the write_bytes could only be done if we're using a buffer that isn't long enough (inlen
, notcapacity
).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They ignore it because it's not a real problem, it's a "UB by decree" not because any behavior is actually undefined, the discussion you linked says this quite clearly.
There are no possible invalid combinations of bits in a u64. Rust has the tendency to be overly careful with safety, which is great, but sometimes it's worth understanding what happens under the hood - that's why unsafe exists, after all, the compiler can't always make the right call as it misses context, like that' we're not reading a u64 but a SIMD register.
But it's not worth arguing over this, criterion says the change has no significant impact on performance, some tests are marginally faster others marginally slower - it's with run on run varriance. So there is no reason to keep potentially UB behaviour just for the sake of it, that'd be silly, if we can be compiler-agreed-safe and fast that's definitely the route to take :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not commenting here to argue with you, but to note that you should be significantly more careful with uninitialized integers since rust-lang/rust#106294