Commit 2a4beef

Refactor resetting memory on MemoryImageSlot drop (#11510) (#11537)
* Refactor resetting memory on `MemoryImageSlot` drop

This commit refactors the behavior of dropping a `MemoryImageSlot` so that it no longer maps anonymous memory into the slot. That behavior was originally implemented because when a `MemoryImageSlot` is dropped the state of the slot is unknown, and a reset was performed to prevent any sort of data leakage. The reset operation is fallible, however, because it calls `mmap`, and `mmap` can fail with `ENOMEM` if, for example, the process has reached its VMA limit. This means that if a process is in a near-OOM condition, a failure to allocate a memory image could panic the process due to the `unwrap()` in the destructor of `MemoryImageSlot`. The purpose of this commit is to avoid that `unwrap()` by moving the reset behavior to a location where an error can be propagated.

This commit removes the clear-on-drop behavior of `MemoryImageSlot`. It was already disabled everywhere except the pooling allocator. The pooling allocator now maintains an extra bit of per-slot state: instead of storing `Option<MemoryImageSlot>` it effectively stores one additional variant, "unknown". On reuse of an "unknown" slot the memory is reset back to an anonymous mapping, and this is all done in a context where an error can be propagated.

Two tests are added in this commit to confirm this behavior:

* The first test passes both before and after this commit. It performs a failed allocation of a memory slot, then makes a successful allocation to ensure that the previous image is not present and that zeroed memory is present. This test fails before the commit if the clear-on-drop behavior is removed, and it fails with this commit if the clear-on-reusing-unknown behavior is removed. Effectively, it ensures that the clear-on-unknown-state logic is present.

* The second test panicked before this commit and passes afterwards. It exhausts all VMAs in the current process, or at least most of them, and then tries to allocate some instances with an image. Instance allocation eventually fails, causing the erroneous path to be executed. This previously unwrapped an `ENOMEM` failure; now it can be handled gracefully by the embedder.

* Skip the new test on QEMU, as it fails on CI

* Only run the test on Linux
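The slot-state machine described above can be sketched in isolation. This is a simplified, hypothetical model (the `SlotState`, `Slot`, and `take_slot` names below are stand-ins, not Wasmtime's actual API): instead of `Option<MemoryImageSlot>` plus a panicking `Drop`, each pool entry records one of three states, and resetting an "unknown" slot happens where the error can be propagated to the caller.

```rust
/// Simplified model of the per-slot state introduced by this commit.
#[derive(Debug)]
enum SlotState {
    /// Never used; guaranteed to be unmapped.
    Unmapped,
    /// In use, or a prior allocation/deallocation failed partway through.
    Unknown,
    /// Cleanly deallocated; retains knowledge of its previous configuration.
    PreviouslyUsed(Slot),
}

/// Stand-in for `MemoryImageSlot`; in the real code this tracks a mapping.
#[derive(Debug)]
struct Slot {
    id: usize,
}

impl Slot {
    /// Stand-in for the fallible `reset_with_anon_memory` (an `mmap` call
    /// in the real code, which can fail with ENOMEM).
    fn reset(&mut self) -> Result<(), String> {
        Ok(())
    }
}

/// Take ownership of a slot, leaving `Unknown` behind. Fallible because an
/// `Unknown` slot must first be wiped with anonymous memory.
fn take_slot(state: &mut SlotState, id: usize) -> Result<Slot, String> {
    let (maybe_slot, needs_reset) = match std::mem::replace(state, SlotState::Unknown) {
        SlotState::Unmapped => (None, false),
        SlotState::Unknown => (None, true),
        SlotState::PreviouslyUsed(slot) => (Some(slot), false),
    };
    let mut slot = maybe_slot.unwrap_or(Slot { id });
    if needs_reset {
        // The error propagates to the caller instead of panicking in Drop.
        slot.reset()?;
    }
    Ok(slot)
}

fn main() {
    let mut state = SlotState::Unmapped;
    let slot = take_slot(&mut state, 0).unwrap();
    // While the slot is checked out, the pool records `Unknown`.
    assert!(matches!(state, SlotState::Unknown));
    // Returning the slot cleanly marks it reusable without a reset.
    state = SlotState::PreviouslyUsed(slot);
    assert!(matches!(state, SlotState::PreviouslyUsed(_)));
}
```

If anything fails while a slot is checked out, the `Unknown` left behind by `mem::replace` guarantees the next user resets the memory before reuse, which is the property the first new test checks.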
1 parent ebce5d4 commit 2a4beef

File tree

7 files changed: +282 −115 lines changed


RELEASES.md

Lines changed: 12 additions & 0 deletions

```diff
@@ -1,3 +1,15 @@
+## 36.0.2
+
+Released 2025-08-26.
+
+### Fixed
+
+* Wasmtime will no longer panic in the pooling allocator when in near-OOM
+  conditions related to resetting the linear memory of a slot.
+  [#11510](https://github.com/bytecodealliance/wasmtime/pull/11510)
+
+--------------------------------------------------------------------------------
+
 ## 36.0.1
 
 Released 2025-08-21.
```

crates/wasmtime/Cargo.toml

Lines changed: 4 additions & 0 deletions

```diff
@@ -113,6 +113,10 @@ wasmtime-versioned-export-macros = { workspace = true, optional = true }
 name = "host_segfault"
 harness = false
 
+[[test]]
+name = "pooling_alloc_near_oom"
+harness = false
+
 # =============================================================================
 #
 # Features for the Wasmtime crate.
```

crates/wasmtime/src/runtime/vm/cow.rs

Lines changed: 1 addition & 59 deletions

```diff
@@ -317,12 +317,6 @@ pub struct MemoryImageSlot {
     /// initial image content, as appropriate. Everything between
     /// `self.accessible` and `self.static_size` is inaccessible.
     dirty: bool,
-
-    /// Whether this MemoryImageSlot is responsible for mapping anonymous
-    /// memory (to hold the reservation while overwriting mappings
-    /// specific to this slot) in place when it is dropped. Default
-    /// on, unless the caller knows what they are doing.
-    clear_on_drop: bool,
 }
 
 impl MemoryImageSlot {
@@ -345,18 +339,9 @@ impl MemoryImageSlot {
             accessible,
             image: None,
             dirty: false,
-            clear_on_drop: true,
         }
     }
 
-    /// Inform the MemoryImageSlot that it should *not* clear the underlying
-    /// address space when dropped. This should be used only when the
-    /// caller will clear or reuse the address space in some other
-    /// way.
-    pub(crate) fn no_clear_on_drop(&mut self) {
-        self.clear_on_drop = false;
-    }
-
     pub(crate) fn set_heap_limit(&mut self, size_bytes: usize) -> Result<()> {
         let size_bytes_aligned = HostAlignedByteCount::new_rounded_up(size_bytes)?;
         assert!(size_bytes <= self.static_size);
@@ -747,7 +732,7 @@ impl MemoryImageSlot {
 
     /// Map anonymous zeroed memory across the whole slot,
     /// inaccessible. Used both during instantiate and during drop.
-    fn reset_with_anon_memory(&mut self) -> Result<()> {
+    pub(crate) fn reset_with_anon_memory(&mut self) -> Result<()> {
         if self.static_size == 0 {
             assert!(self.image.is_none());
             assert_eq!(self.accessible, 0);
@@ -765,45 +750,6 @@ impl MemoryImageSlot {
     }
 }
 
-impl Drop for MemoryImageSlot {
-    fn drop(&mut self) {
-        // The MemoryImageSlot may be dropped if there is an error during
-        // instantiation: for example, if a memory-growth limiter
-        // disallows a guest from having a memory of a certain size,
-        // after we've already initialized the MemoryImageSlot.
-        //
-        // We need to return this region of the large pool mmap to a
-        // safe state (with no module-specific mappings). The
-        // MemoryImageSlot will not be returned to the MemoryPool, so a new
-        // MemoryImageSlot will be created and overwrite the mappings anyway
-        // on the slot's next use; but for safety and to avoid
-        // resource leaks it's better not to have stale mappings to a
-        // possibly-otherwise-dead module's image.
-        //
-        // To "wipe the slate clean", let's do a mmap of anonymous
-        // memory over the whole region, with PROT_NONE. Note that we
-        // *can't* simply munmap, because that leaves a hole in the
-        // middle of the pooling allocator's big memory area that some
-        // other random mmap may swoop in and take, to be trampled
-        // over by the next MemoryImageSlot later.
-        //
-        // Since we're in drop(), we can't sanely return an error if
-        // this mmap fails. Instead though the result is unwrapped here to
-        // trigger a panic if something goes wrong. Otherwise if this
-        // reset-the-mapping fails then on reuse it might be possible, depending
-        // on precisely where errors happened, that stale memory could get
-        // leaked through.
-        //
-        // The exception to all of this is if the `clear_on_drop` flag
-        // (which is set by default) is false. If so, the owner of
-        // this MemoryImageSlot has indicated that it will clean up in some
-        // other way.
-        if self.clear_on_drop {
-            self.reset_with_anon_memory().unwrap();
-        }
-    }
-}
-
 #[cfg(all(test, target_os = "linux", not(miri)))]
 mod test {
     use super::*;
@@ -878,7 +824,6 @@ mod test {
         // Create a MemoryImageSlot on top of it
         let mut memfd =
             MemoryImageSlot::create(mmap.zero_offset(), HostAlignedByteCount::ZERO, 4 << 20);
-        memfd.no_clear_on_drop();
         assert!(!memfd.is_dirty());
         // instantiate with 64 KiB initial size
         memfd.instantiate(64 << 10, None, &ty, &tunables).unwrap();
@@ -926,7 +871,6 @@ mod test {
         // Create a MemoryImageSlot on top of it
         let mut memfd =
             MemoryImageSlot::create(mmap.zero_offset(), HostAlignedByteCount::ZERO, 4 << 20);
-        memfd.no_clear_on_drop();
         // Create an image with some data.
         let image = Arc::new(create_memfd_with_data(page_size, &[1, 2, 3, 4]).unwrap());
         // Instantiate with this image
@@ -1016,7 +960,6 @@ mod test {
         let mmap = mmap_4mib_inaccessible();
         let mut memfd =
             MemoryImageSlot::create(mmap.zero_offset(), HostAlignedByteCount::ZERO, 4 << 20);
-        memfd.no_clear_on_drop();
 
         // Test basics with the image
         for image_off in [0, page_size, page_size * 2] {
@@ -1083,7 +1026,6 @@ mod test {
        let mmap = mmap_4mib_inaccessible();
        let mut memfd =
            MemoryImageSlot::create(mmap.zero_offset(), HostAlignedByteCount::ZERO, 4 << 20);
-       memfd.no_clear_on_drop();
        let image = Arc::new(create_memfd_with_data(page_size, &[1, 2, 3, 4]).unwrap());
        let initial = 64 << 10;
 
```
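The deleted `Drop` impl is the crux of the fix: `Drop::drop` returns `()`, so a fallible cleanup in a destructor can only `unwrap()` and panic. A minimal sketch of the alternative (the `FakeSlot` type and its method are hypothetical stand-ins, not Wasmtime's API): move the reset into an explicit fallible method so the caller sees the error as a value.

```rust
/// Hypothetical stand-in for a pooled memory slot.
#[derive(Debug)]
struct FakeSlot {
    reset: bool,
}

impl FakeSlot {
    /// Stand-in for `reset_with_anon_memory`: in the real code this calls
    /// `mmap`, which can fail with ENOMEM near the process VMA limit.
    fn reset_with_anon_memory(&mut self, simulate_enomem: bool) -> Result<(), &'static str> {
        if simulate_enomem {
            return Err("mmap failed: ENOMEM");
        }
        self.reset = true;
        Ok(())
    }
}

fn main() {
    let mut slot = FakeSlot { reset: false };

    // Explicit reset in a fallible context: the error is a value the
    // embedder can handle, not a panic inside `drop()`.
    let err = slot.reset_with_anon_memory(true).unwrap_err();
    assert_eq!(err, "mmap failed: ENOMEM");
    assert!(!slot.reset);

    // A later retry (or a fresh slot) can still succeed.
    slot.reset_with_anon_memory(false).unwrap();
    assert!(slot.reset);
}
```

This is why `reset_with_anon_memory` becomes `pub(crate)` above: the pooling allocator calls it at slot-acquisition time, where a `Result` can be returned, rather than in a destructor where it cannot.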

crates/wasmtime/src/runtime/vm/instance/allocator/pooling/memory_pool.rs

Lines changed: 89 additions & 33 deletions

```diff
@@ -64,6 +64,7 @@ use crate::{
     runtime::vm::mpk::{self, ProtectionKey, ProtectionMask},
     vm::HostAlignedByteCount,
 };
+use std::mem;
 use std::sync::atomic::{AtomicUsize, Ordering};
 use std::sync::{Arc, Mutex};
 use wasmtime_environ::{DefinedMemoryIndex, Module, Tunables};
@@ -112,7 +113,7 @@ pub struct MemoryPool {
 
     /// If using a copy-on-write allocation scheme, the slot management. We
     /// dynamically transfer ownership of a slot to a Memory when in use.
-    image_slots: Vec<Mutex<Option<MemoryImageSlot>>>,
+    image_slots: Vec<Mutex<ImageSlot>>,
 
     /// A description of the various memory sizes used in allocating the
     /// `mapping` slab.
@@ -139,6 +140,37 @@ pub struct MemoryPool {
     next_available_pkey: AtomicUsize,
 }
 
+/// The state of memory for each slot in this pool.
+#[derive(Debug)]
+enum ImageSlot {
+    /// This slot is guaranteed to be entirely unmapped.
+    ///
+    /// This is the initial state of all slots.
+    Unmapped,
+
+    /// The state of this slot is unknown.
+    ///
+    /// This encompasses a number of situations such as:
+    ///
+    /// * The slot is currently in use.
+    /// * The slot was attempted to be in use, but allocation failed.
+    /// * The slot was used but not deallocated properly.
+    ///
+    /// All of these situations are lumped into this one variant indicating
+    /// that, at a base level, no knowledge is known about this slot. Using a
+    /// slot in this state first requires resetting all memory in this slot by
+    /// mapping anonymous memory on top of the entire slot.
+    Unknown,
+
+    /// This slot was previously used and `MemoryImageSlot` maintains the state
+    /// about what this slot was last configured as.
+    ///
+    /// Future use of this slot will use `MemoryImageSlot` to continue to
+    /// re-instantiate and reuse images and such. This state is entered after
+    /// an allocated slot is successfully deallocated.
+    PreviouslyUsed(MemoryImageSlot),
+}
+
 impl MemoryPool {
     /// Create a new `MemoryPool`.
     pub fn new(config: &PoolingInstanceAllocatorConfig, tunables: &Tunables) -> Result<Self> {
@@ -219,7 +251,7 @@ impl MemoryPool {
         );
     }
 
-    let image_slots: Vec<_> = std::iter::repeat_with(|| Mutex::new(None))
+    let image_slots: Vec<_> = std::iter::repeat_with(|| Mutex::new(ImageSlot::Unmapped))
         .take(constraints.num_slots)
         .collect();
 
@@ -360,7 +392,7 @@ impl MemoryPool {
         let base = self.get_base(allocation_index);
         let base_capacity = self.layout.max_memory_bytes;
 
-        let mut slot = self.take_memory_image_slot(allocation_index);
+        let mut slot = self.take_memory_image_slot(allocation_index)?;
         let image = match memory_index {
             Some(memory_index) => request.runtime_info.memory_image(memory_index)?,
             None => None,
@@ -459,13 +491,18 @@ impl MemoryPool {
             .allocator
             .alloc_affine_and_clear_affinity(module, memory_index)
         {
-            // Clear the image from the slot and, if successful, return it back
-            // to our state. Note that on failure here the whole slot will get
-            // paved over with an anonymous mapping.
+            // Attempt to acquire the `MemoryImageSlot` state for this
+            // slot, and then if we have that try to remove the image,
+            // and then if all that succeeds put the slot back in.
+            //
+            // If anything fails then the slot will be in an "unknown"
+            // state which means that on next use it'll be remapped with
+            // anonymous memory.
             let index = MemoryAllocationIndex(id.0);
-            let mut slot = self.take_memory_image_slot(index);
-            if slot.remove_image().is_ok() {
-                self.return_memory_image_slot(index, slot);
+            if let Ok(mut slot) = self.take_memory_image_slot(index) {
+                if slot.remove_image().is_ok() {
+                    self.return_memory_image_slot(index, slot);
+                }
             }
 
             stripe.allocator.free(id);
@@ -485,21 +522,49 @@ impl MemoryPool {
         self.mapping.offset(offset).expect("offset is in bounds")
     }
 
-    /// Take ownership of the given image slot. Must be returned via
-    /// `return_memory_image_slot` when the instance is done using it.
-    fn take_memory_image_slot(&self, allocation_index: MemoryAllocationIndex) -> MemoryImageSlot {
-        let maybe_slot = self.image_slots[allocation_index.index()]
-            .lock()
-            .unwrap()
-            .take();
-
-        maybe_slot.unwrap_or_else(|| {
+    /// Take ownership of the given image slot.
+    ///
+    /// This method is used when a `MemoryAllocationIndex` has been allocated
+    /// and the state of the slot needs to be acquired. This will lazily
+    /// allocate a `MemoryImageSlot` which describes the current (and possibly
+    /// prior) state of the slot.
+    ///
+    /// During deallocation this structure is passed back to
+    /// `return_memory_image_slot`.
+    ///
+    /// Note that this is a fallible method because using a slot might require
+    /// resetting the memory that was previously there. This reset operation
+    /// is a fallible operation that may not succeed. If it fails then this
+    /// slot cannot be used at this time.
+    fn take_memory_image_slot(
+        &self,
+        allocation_index: MemoryAllocationIndex,
+    ) -> Result<MemoryImageSlot> {
+        let (maybe_slot, needs_reset) = {
+            let mut slot = self.image_slots[allocation_index.index()].lock().unwrap();
+            match mem::replace(&mut *slot, ImageSlot::Unknown) {
+                ImageSlot::Unmapped => (None, false),
+                ImageSlot::Unknown => (None, true),
+                ImageSlot::PreviouslyUsed(state) => (Some(state), false),
+            }
+        };
+        let mut slot = maybe_slot.unwrap_or_else(|| {
             MemoryImageSlot::create(
                 self.get_base(allocation_index),
                 HostAlignedByteCount::ZERO,
                 self.layout.max_memory_bytes.byte_count(),
             )
-        })
+        });
+
+        // For `Unknown` slots it means that `slot` is brand new and isn't
+        // actually tracking the state of the previous slot, so reset it
+        // entirely with anonymous memory to wipe the slate clean and start
+        // from zero. This should only happen if allocation of the previous
+        // slot failed, for example.
+        if needs_reset {
+            slot.reset_with_anon_memory()?;
+        }
+        Ok(slot)
    }
 
    /// Return ownership of the given image slot.
@@ -509,21 +574,12 @@ impl MemoryPool {
         slot: MemoryImageSlot,
     ) {
         assert!(!slot.is_dirty());
-        *self.image_slots[allocation_index.index()].lock().unwrap() = Some(slot);
-    }
-}
 
-impl Drop for MemoryPool {
-    fn drop(&mut self) {
-        // Clear the `clear_no_drop` flag (i.e., ask to *not* clear on
-        // drop) for all slots, and then drop them here. This is
-        // valid because the one `Mmap` that covers the whole region
-        // can just do its one munmap.
-        for mut slot in std::mem::take(&mut self.image_slots) {
-            if let Some(slot) = slot.get_mut().unwrap() {
-                slot.no_clear_on_drop();
-            }
-        }
+        let prev = mem::replace(
+            &mut *self.image_slots[allocation_index.index()].lock().unwrap(),
+            ImageSlot::PreviouslyUsed(slot),
+        );
+        assert!(matches!(prev, ImageSlot::Unknown));
     }
 }
 
```
529585

crates/wasmtime/src/runtime/vm/memory.rs

Lines changed: 0 additions & 23 deletions

```diff
@@ -547,29 +547,6 @@ impl LocalMemory {
 
             let mut slot =
                 MemoryImageSlot::create(mmap_base, byte_size, alloc.byte_capacity());
-            // On drop, we will unmap our mmap'd range that this slot
-            // was mapped on top of, so there is no need for the slot to
-            // wipe it with an anonymous mapping first.
-            //
-            // Note that this code would be incorrect if clear-on-drop
-            // were enabled. That's because:
-            //
-            // * In the struct definition, `memory_image` above is listed
-            //   after `alloc`.
-            // * Rust drops fields in the order they're defined, so
-            //   `memory_image` would be dropped after `alloc`.
-            // * `alloc` can represent either owned memory (i.e. the mmap is
-            //   freed on drop) or logically borrowed memory (something else
-            //   manages the mmap).
-            // * If `alloc` is borrowed memory, then this isn't an issue.
-            // * But if `alloc` is owned memory, then it would first drop
-            //   the mmap, and then `memory_image` would try to remap
-            //   part of that same memory as part of clear-on-drop.
-            //
-            // A lot of this really suggests representing the ownership
-            // via Rust lifetimes -- that would be a major refactor,
-            // though.
-            slot.no_clear_on_drop();
             slot.instantiate(alloc.byte_size(), Some(image), ty, tunables)?;
             Some(slot)
         } else {
```
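The comment removed here relied on the fact that Rust drops struct fields in declaration order, so a field declared first is dropped first. A small self-contained demonstration (the `Memory`, `Tracked`, and `LOG` names below are illustrative, not Wasmtime's types):

```rust
use std::cell::RefCell;

thread_local! {
    // Records the order in which fields are dropped.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

struct Tracked(&'static str);

impl Drop for Tracked {
    fn drop(&mut self) {
        LOG.with(|l| l.borrow_mut().push(self.0));
    }
}

// Mirrors the field ordering discussed in the removed comment:
// `alloc` is declared before `memory_image`.
struct Memory {
    alloc: Tracked,        // declared first, dropped first
    memory_image: Tracked, // dropped second
}

fn main() {
    drop(Memory {
        alloc: Tracked("alloc"),
        memory_image: Tracked("memory_image"),
    });
    // Fields drop in declaration order: `alloc`, then `memory_image`.
    LOG.with(|l| assert_eq!(*l.borrow(), ["alloc", "memory_image"]));
}
```

This ordering is what made clear-on-drop hazardous in `LocalMemory`: an owned `alloc` would unmap its memory first, and a clearing `memory_image` destructor would then try to remap part of that already-freed region.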
