You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
That call does not require the pointer to be aligned at all. But it could be replaced with _mm_load_si128, which requires alignment to a 16-byte boundary.
On line 139, the ptr is aligned to the VECTOR_ALIGN mask, which aligns it to size_of::<__m128i>(), or to 16 bytes.
Hmmm I think actually there's probably a bug here. We probably should be making use of unaligned loads here to avoid the slow byte-at-a-time loop. Like this:
(Although that interestingly does use an unaligned load just before when it could be an aligned load.)
Aligned versus unaligned rarely makes a difference on x86-64 in my experience anyway, and especially here where this is just the "cleanup" aspect of the implementation.
But good catch. I agree that as the code is currently written, it can be safely an aligned load.
On line 186, there is a call to
_mm_loadu_si128
:bstr/src/ascii.rs
Line 186 in 3dc5939
That call does not require the pointer to be aligned at all. But it could be replaced with
_mm_load_si128
, which requires alignment to a 16-byte boundary.On line 139, the ptr is aligned to the
VECTOR_ALIGN
mask, which aligns it tosize_of::<__m128i>()
, or to 16 bytes.bstr/src/ascii.rs
Lines 120 to 121 in 3dc5939
https://github.com/BurntSushi/bstr/blob/3dc5939f30daa1a8a6e5cc346bb77841f19ea415/src/ascii.rs#L139C9-L139C12
The
ptr
is always advanced byVECTOR_LOOP_SIZE
in a loop, which is a multiple of 16 bytes:bstr/src/ascii.rs
Lines 120 to 122 in 3dc5939
https://github.com/BurntSushi/bstr/blob/3dc5939f30daa1a8a6e5cc346bb77841f19ea415/src/ascii.rs#L180C36-L180C52
And then further advanced by
VECTOR_SIZE
which is 16 bytes:bstr/src/ascii.rs
Line 120 in 3dc5939
bstr/src/ascii.rs
Line 191 in 3dc5939
So in the loop at L186, the pointer is always aligned to 16 bytes:
bstr/src/ascii.rs
Line 186 in 3dc5939
The aligned version of the function was pointed out by @anforowicz during unsafe code audit: https://chromium-review.googlesource.com/c/chromium/src/+/5925797/comment/f08dc00c_1b24061c/
The text was updated successfully, but these errors were encountered: