Proposed fix to enforce byte alignment for static_array_backend feature #86
Conversation
Edit: fixing the imp_windows.rs
Thanks, I'll take a look at this within the next couple of days.
Thanks for the in-depth analysis @nand-nor! That really helps, and I appreciate the effort that went into it.
I have some inline comments below suggesting some improvements, and also some questions. Thanks!
wee_alloc/src/imp_static_array.rs
static mut SCRATCH_HEAP: [u8; SCRATCH_LEN_BYTES] = [0; SCRATCH_LEN_BYTES];
static mut OFFSET: Mutex<usize> = Mutex::new(0);

- pub(crate) unsafe fn alloc_pages(pages: Pages) -> Result<NonNull<u8>, AllocErr> {
+ pub(crate) unsafe fn alloc_pages<B: Into<Bytes>>(
We should take `Bytes` directly, rather than `B: Into<Bytes>`. We don't want multiple, monomorphized copies of this function, so that we can keep code size small.
Fixed in upcoming push; I see that the `into()` capability is superfluous compared to accessing `align.0`.
wee_alloc/src/imp_static_array.rs
loop {
    *offset += 1;
    end += 1;
    ptr = SCRATCH_HEAP[*offset..end].as_mut_ptr() as *mut u8;
I don't think there is anything here that is ensuring that `offset` and `end` are still in bounds of `SCRATCH_HEAP` as they are pushed forward while we are searching for the alignment? That could lead to out-of-bounds accesses and memory unsafety.
Additionally, we don't need to do a one-byte-at-a-time loop while searching for alignment. Instead, we can round up to the next aligned offset with:
fn round_up_to_alignment(n: usize, align: usize) -> usize {
extra_assert!(align > 0);
extra_assert!(align.is_power_of_two());
(n + align - 1) & !(align - 1)
}
and then check that the aligned version is within bounds:
let scratch_heap_start = SCRATCH_HEAP.as_mut_ptr() as usize;
let scratch_heap_end = scratch_heap_start + SCRATCH_HEAP.len();
let unaligned_ptr = scratch_heap_start + *offset;
let aligned_ptr = round_up_to_alignment(unaligned_ptr, align);
// NB: `unaligned_ptr <= aligned_ptr` handles potential overflow in `round_up_to_alignment`.
if unaligned_ptr <= aligned_ptr && aligned_ptr < scratch_heap_end
&& aligned_ptr.checked_add(bytes.0).ok_or(AllocErr)? < scratch_heap_end
{
// success!
} else {
// error!
}
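For anyone following along, the rounding helper suggested above can be exercised standalone. This is a hedged sketch, with `extra_assert!` swapped for plain `assert!` so it compiles outside wee_alloc:

```rust
// Standalone version of the suggested rounding helper. `extra_assert!` is a
// wee_alloc-internal macro; plain `assert!` stands in for it here.
fn round_up_to_alignment(n: usize, align: usize) -> usize {
    assert!(align > 0);
    assert!(align.is_power_of_two());
    // Add align - 1, then clear the low bits: one step, no per-byte loop.
    (n + align - 1) & !(align - 1)
}

fn main() {
    // Already-aligned values are unchanged.
    assert_eq!(round_up_to_alignment(4096, 4096), 4096);
    // Unaligned values jump straight to the next boundary.
    assert_eq!(round_up_to_alignment(4097, 4096), 8192);
    assert_eq!(round_up_to_alignment(13, 8), 16);
}
```

The bit trick relies on `align` being a power of two, which the assertions enforce; that invariant holds for all Rust allocation requests.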
Ah, great catch. I foolishly got rid of an additional loop where `end` was incremented to be byte aligned before the `end < SCRATCH_HEAP_LEN` check, in an attempt to make it cleaner. That was a mistake, and this looks much nicer. Will add this better fix to the next push in just a moment.
This specific change causes a large number of tests to fail. After confirming that the cause was neither the alignment-enforcement change nor the `Into` trait removal, I made some small adjustments. Tests are now all passing (including tests on the system where I encountered the error) and I have now pushed. I have pasted an edited-for-brevity test log below to show where the failures were emitted.
+ export WEE_ALLOC_STATIC_ARRAY_BACKEND_BYTES=1073741824
+ WEE_ALLOC_STATIC_ARRAY_BACKEND_BYTES=1073741824
+ cargo test --release --features 'static_array_backend extra_assertions size_classes'
Compiling...
running 21 tests
test allocate_size_zero ... ok
...
failures:
allocate_many_large
allocate_many_small
multi_threaded_quickchecks
quickchecks_0
quickchecks_1
regression_test_0
regression_test_1
regression_test_2
regression_test_3
single_allocation_with_size_and_align
smoke
test_trace_cpp_demangle
test_trace_dogfood
test_trace_ffmpeg
test_trace_find
test_trace_gcc_hello
test_trace_grep_random_data
test_trace_grep_recursive
test_trace_ls
test_trace_source_map
test result: FAILED. 1 passed; 20 failed; 0 ignored; 0 measured; 0 filtered out
wee_alloc/src/imp_static_array.rs
const SCRATCH_LEN_BYTES: usize = include!(concat!(
    env!("OUT_DIR"),
    "/wee_alloc_static_array_backend_size_bytes.txt"
));
static mut SCRATCH_HEAP: [u8; SCRATCH_LEN_BYTES] = [0; SCRATCH_LEN_BYTES];
Another thing we should do to avoid most scenarios where we don't have proper alignment is to force an alignment on the `SCRATCH_HEAP` via

#[repr(align(4096))]
struct ScratchHeap([u8; SCRATCH_LEN_BYTES]);
static mut SCRATCH_HEAP: ScratchHeap = ScratchHeap([0; SCRATCH_LEN_BYTES]);
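A self-contained sketch of this suggestion follows; the 8192-byte length is an arbitrary stand-in, since the real `SCRATCH_LEN_BYTES` is generated by the build script:

```rust
use core::ptr::addr_of;

// Stand-in for the build-script-generated value in OUT_DIR.
const SCRATCH_LEN_BYTES: usize = 8192;

// repr(align(4096)) forces the whole static onto a page boundary, so
// offset 0 into the heap is aligned for any request up to 4096 bytes.
#[repr(align(4096))]
struct ScratchHeap([u8; SCRATCH_LEN_BYTES]);

static mut SCRATCH_HEAP: ScratchHeap = ScratchHeap([0; SCRATCH_LEN_BYTES]);

fn main() {
    // addr_of! avoids creating a reference to a static mut.
    let base = unsafe { addr_of!(SCRATCH_HEAP) as usize };
    assert_eq!(base % 4096, 0);
}
```

Without the wrapper struct, a `[u8; N]` static only has alignment 1, so the linker is free to place it at any address; the wrapper is what makes the base address predictable.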
In fact, is just this change enough to completely fix the problem, since the only call to `alloc_pages` ensures that the number of pages is guaranteed to be big enough to contain a cell of the requested alignment? Which extra assertion in particular is failing right now? Is it `assert_is_word_aligned`? If so, then I think just this change is good enough.
I have added this as well to the upcoming push, and it does seem to be the ultimate fix, but I have one possibly unfounded reservation. Once I implemented the above rough alignment fix, no extra assertions failed. Prior to the fix, the `extra_assertions` feature forced failure at around the first allocation attempt, I believe somewhere in `alloc_first_fit` in the call to

assert_aligned_to(allocated.data(), align);

Looking at `assert_is_word_aligned`, it is a wrapper for `assert_aligned_to`, so yes, potentially just this change is good enough.

However, I had instrumented your code to speed up problem-source discovery, and I recall noticing that somewhere in the allocation process, the resulting size request was not always a multiple of the byte alignment requirement. I (also foolishly) reverted all of that instrumentation to submit a nice clean PR and do not exactly recall where, so potentially this is not an issue. But, for instance, it may have been when, in `new_cell_for_free_list`, the size of the requested allocation is compared to and changed based on whether the value ensures the minimum cell size is left over after allocation (e.g. in `new_cell_for_free_list` in lib.rs:780: `let size: Bytes = cmp::max(size.into(), (align + Self::MIN_CELL_SIZE) * Words(2));`). Edit: the size I observed over many allocations may have been in pages rather than bytes, so again this may be a non-issue.

I'm not sure without instrumenting the code again, but is it possible that the resulting size request may cause `*offset` to become unaligned, so that the next allocation will be unaligned even if the `SCRATCH_HEAP` is forced to align at a page boundary? If this is possible, then I'm not sure just this fix would solve the issue. I can look into this at some point if you would like?
This code should be ensuring that the requested number of pages can always satisfy the allocation request, if `alloc_pages` succeeds: https://github.com/rustwasm/wee_alloc/blob/master/wee_alloc/src/lib.rs#L776-L783
The allocation doesn't have to be a multiple of the alignment; it just has to be big enough that an aligned chunk of the requested size must exist within it somewhere.
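That guarantee is just arithmetic: if a region is over-sized by `align - 1` bytes, rounding its start up to the alignment always leaves room for the requested size. A quick standalone check (the helper name is borrowed from the suggestion earlier in this thread):

```rust
// Same one-step rounding helper as suggested earlier in the review.
fn round_up_to_alignment(n: usize, align: usize) -> usize {
    (n + align - 1) & !(align - 1)
}

fn main() {
    let (size, align) = (100usize, 64usize);
    // Wherever the region starts, a region of size + align - 1 bytes
    // always contains an aligned chunk of `size` bytes.
    for start in 0..512usize {
        let region_end = start + size + align - 1;
        let aligned = round_up_to_alignment(start, align);
        assert!(aligned + size <= region_end);
    }
}
```

This is why the caller can over-request pages without caring where the backend places them; the alignment search inside the region can never run off the end.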
…ll push and attempt to resolve
…nction for enforcing byte alignment. All tests now passing. Additionally, build.sh and check.sh all complete
FYI: I've split out just the alignment of the scratch heap into #87
Great! That was what I was hoping would happen, but I didn't want to close this out until you confirmed that :)
Yep, I think we can close this PR. If you run into anything else, please file an issue!
Problem statement
The `wee_alloc` implementation of the `static_array_backend` feature erroneously causes an out-of-memory error. The context where the problem arises is compilation for a bare-metal embedded system (OS independent; the target is something like `armv7-unknown-linux-gnueabihf`), using the `static_array_backend` feature and assigning `wee_alloc` as the global allocator.

The root cause of the resulting error is the implementation of `imp_static_array::alloc_pages()`. The function creates a pointer to a slice of the `SCRATCH_HEAP` of the requested size, which is used to create the `FreeCell` objects that compose the free list. The function does not enforce that the slice pointer maintain any byte-boundary alignment, resulting in unaligned `FreeCell` objects. This ultimately creates an edge-case error where allocation fails despite adequate memory left in the `SCRATCH_HEAP`. The error occurs regardless of the configured static backend array size exported at compile time. Additionally, the failure arises regardless of size policy (e.g. whether or not the `size_classes` feature is enabled).

Detailed description of the control flow in which the problem arises
The problem is immediately detectable by enabling the `extra_assertions` feature. With this feature enabled, the first call to allocate memory from the free list results in an alignment assertion failure because, as described above, the `FreeCell` objects created from the `SCRATCH_HEAP` object do not have the proper alignment. However, when the `extra_assertions` feature is not enabled, the edge case in the example system occurs when the first 256 KB of the free list are exhausted, i.e. when a full walk of the initial free list results in `alloc_with_refill()` executing the code branch to create and insert a new `FreeCell`.

Upon an allocation request, the allocator attempts to allocate from the `SCRATCH_HEAP`. This is done via a call to the `GlobalAlloc` trait's implementation of `alloc()`, which calls `alloc_impl()`, which then calls `alloc_with_refill()` within a function closure passed to `self.with_free_list_and_policy_for_size`. The `alloc_with_refill` function first calls `alloc_first_fit()`, which walks the free list to find a `FreeCell` of adequate size. After the 256 KB exhaustion point, `alloc_first_fit` fails to find a `FreeCell` of adequate size. At this point, the branch for creating and inserting a new `FreeCell` is executed via calls to `new_cell_for_free_list()` and `insert_into_free_list()`. Control flow then returns to `alloc_with_refill`, which calls `alloc_first_fit` again, which in turn calls `try_alloc()` on the newly inserted `FreeCell` object, which is the point of failure. The checks in `try_alloc()` assert that the newly inserted `FreeCell` is large enough to hold the requested allocation size, but that it is not big enough to be split in two. This leads to the final check in `try_alloc()`, which determines whether the `FreeCell` is appropriately aligned and which, because of how `alloc_pages()` is implemented for the `imp_static_array` mod, results in a failure to allocate.

Steps to reproduce problem
Proposed problem fix
To address this error, the `imp_static_array::alloc_pages()` function should enforce alignment requirements such that pointers to slices of the `SCRATCH_HEAP` are always aligned to the needed byte boundary. The resulting `FreeCell` objects will therefore always meet alignment checks, whether or not the `extra_assertions` feature is enabled.
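One possible shape for such a fix, sketched as a self-contained bump allocator rather than the actual wee_alloc patch (`BumpHeap` and its fields are illustrative names, and `align` is assumed to be a power of two, as it is for Rust allocation requests):

```rust
const HEAP_LEN: usize = 4096;

// Illustrative stand-in for SCRATCH_HEAP plus its OFFSET cursor.
struct BumpHeap {
    heap: [u8; HEAP_LEN],
    offset: usize,
}

impl BumpHeap {
    // Carve an aligned slice out of the buffer, or fail cleanly.
    fn alloc(&mut self, size: usize, align: usize) -> Option<*mut u8> {
        let start = self.heap.as_ptr() as usize;
        let unaligned = start + self.offset;
        // Round the cursor up to the requested alignment in one step;
        // checked_add guards against address-space overflow.
        let aligned = unaligned.checked_add(align - 1)? & !(align - 1);
        let end = aligned.checked_add(size)?;
        // Refuse (rather than hand out out-of-bounds memory) when exhausted.
        if end > start + HEAP_LEN {
            return None;
        }
        self.offset = end - start;
        Some(aligned as *mut u8)
    }
}

fn main() {
    let mut h = BumpHeap { heap: [0; HEAP_LEN], offset: 0 };
    let p = h.alloc(100, 64).expect("first allocation fits");
    assert_eq!(p as usize % 64, 0);
    // The second allocation is aligned too, even though 100 is not a
    // multiple of 64: the cursor is re-rounded on every call.
    let q = h.alloc(32, 64).expect("second allocation fits");
    assert_eq!(q as usize % 64, 0);
    assert!(q as usize >= p as usize + 100);
}
```

The key property, mirrored from the discussion above, is that alignment is enforced at every allocation, so a preceding request whose size is not a multiple of the alignment cannot leave the cursor in a state that breaks the next one.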