-
Notifications
You must be signed in to change notification settings - Fork 821
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix memory leak in host function envs #1865
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A function can be exported twice. How do you verify that host_env_drop_fn
is only called once?
bors try |
tryBuild failed: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the PR. In some places it simplifies code, in other, it's a little bit harder to understand. I think another refactoring round would be required in a close future, but let's keep the priority as is: fixing the memory leak.
There is a lot of TODO
in this PR. Are you going to fix them all? I believe a majority should be fixed before merging this PR.
func: Box::new(func), | ||
function_type: ty.clone(), | ||
}); | ||
let dynamic_ctx: VMDynamicFunctionContext<DynamicFunctionWithoutEnv> = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I bet this type annotation isn't required (regardless, I prefer having type annotation in this kind of code, so it's perfectly fine to keep it, just wondered whether it changes something).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not strictly no, I thought it was helpful for understanding what's happening with the unsafe block here though.
Actually all these wrapper functions can be created by a helper function, so I should probably just do that
// This field _must_ come first in this struct. | ||
env: Box<Env>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why env
is no longer required to be at first position?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason before was due to a bad hack where we'd double deref a pointer ignoring the fact that it was in a struct.
We now have each function that is called on the env
traverse things correctly, so for example we create a wrapper function where we walk through the types and find what we need. So if we need to operate on the Box<env>
itself, we can find it in each function. I think this is a simpler solution because then all the functions always just work on the env
, we had an issue where the clone and drop function needed to be called on the top level the wrapper dynamic func context thing but the initialize function needed to be called on the inner env. This fixes that by making the thing each function gets consistent.
@@ -76,7 +76,7 @@ impl ValFuncRef for Val { | |||
let export = wasmer_engine::ExportFunction { | |||
// TODO: | |||
// figure out if we ever need a value here: need testing with complicated import patterns | |||
import_init_function_ptr: None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the TODO
above still valid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm not 100% confident in this code which is why that comment is still there, I can investigate it later... I'm not aware of a good way to test this yet, I can investigate that though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should investigate. I bet is fine as None, but it will be good to verify
lib/engine/src/export.rs
Outdated
/// | ||
/// See `wasmer_vm::export::VMExportFunction::vmctx` for the version of | ||
/// this pointer that is used by the VM when creating an `Instance`. | ||
pub host_env: *mut std::ffi::c_void, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idea: Create a new
constructor method rather than having all fields pub
. I think it could be “dangerous”, it's better to hide everything here. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for all the fields in this type.
7c247bc
to
ec4d6e6
Compare
1902: Fix memory leak in `InstanceAllocator` on failure r=MarkMcCaskey a=MarkMcCaskey This code needs to be cleaned up, just hacked it out really quickly. The idea is to use a builder pattern to create a type-safe state machine for setting up things for the `InstanceAllocator`; this should allow us to remove `unsafe` in our "public" API because we can guarantee certain invariants because of the restricted nature of these types. The names for the new types are long and don't really make sense but I didn't want to get blocked on that: suggestions welcome! In combination with #1865 , this PR should fix the all the memory leaks seen by asan while running `make test-cranelift-jit` 🎉 🎉 🎉 Co-authored-by: Mark McCaskey <mark@wasmer.io>
impl<'a> WasmerEnv for &'a ::std::sync::atomic::AtomicU32 {} | ||
impl<'a> WasmerEnv for &'a ::std::sync::atomic::AtomicI64 {} | ||
impl<'a> WasmerEnv for &'a ::std::sync::atomic::AtomicUsize {} | ||
impl<'a> WasmerEnv for &'a ::std::sync::atomic::AtomicIsize {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need the <'a>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<'a>
is needed because of the &
: trait impls on things with lifetimes can't omit them. And the reason we need &
is to make these types shared somehow, otherwise the impl WasmerEnv for X
can't be used here unless we have recursive definitions.
These impl
s on std types definitely aren't critical; these actually are only work with 'static
lifetimes because of other constraints, so this only really works when putting an atomic in a static variable or otherwise doing unsafe things.
I'd be fine to remove them
@@ -76,7 +76,7 @@ impl ValFuncRef for Val { | |||
let export = wasmer_engine::ExportFunction { | |||
// TODO: | |||
// figure out if we ever need a value here: need testing with complicated import patterns | |||
import_init_function_ptr: None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should investigate. I bet is fine as None, but it will be good to verify
pub enum ImportFunctionEnv { | ||
/// The `vmctx` pointer does not refer to a host env, there is no | ||
/// metadata about it. | ||
NoEnv, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think is better to name this: VMEnv
.
And have the HostEnv
to be an enum with all the different options:
- NoEnv
- Env
Also, the VM should not be aware of dynamic functions or not. So I believe DynamicHostEnvWithNoInnerEnv
should be done via HostEnv
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thought on the VM
prefix, I'm not sure it makes sense to use it on types that aren't related to the VM.
Yeah, the extra case here is just to make the multiple cases more explicit. Otherwise there's data that's sometimes there and sometimes not and it's not exactly clear why. This enum is more descriptive than prescriptive: it's describing the state of the world as it is, not what it should be. Perhaps we should move this type out of vm
, there's no reason it has to be defined in VM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the VM, there is not such thing as DynamicHostEnv. This is something that we do in the engine level, not VM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meaning: the vm
only handles the functions with their end ABI. Is the engine the one that creates the trampolines and so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: there is some "dangling" code in the VM that is aware of dynamic functions. But thanks to the PR where we move most of the function stuff to the engine, it should be easy to move the rest as well.
bors r+ |
1865: Fix memory leak in host function envs r=MarkMcCaskey a=MarkMcCaskey TODO: link to issue This PR contains a number of changes: 1. Make `WasmerEnv: Clone` 2. Store a pointer to the `clone` function when creating a host function (Notably this is a feature that wouldn't work even if we _could_ use a proper trait object because you can't have a `Sized` trait object and `Clone: Sized`). 3. Store a pointer to the `drop` function when creating a host function. 4. Clone the env via pointer every time an `Instance` is made. Therefore each `Instance` gets its own, unique `Env` per host function with `Env`. 5. Add reference counting and drop logic to a sub-field of `wasmer_export::ExportFunction` which frees the original version of the `Env` (the thing that gets cloned each time an `Instance` is made) with the `drop` function pointer. 6. Change some logic in `vm::Instance` from SoA (struct of arrays) to AoS (array of structs): this uses more memory but is a bit less error prone and can be easily changed later. 7. Add logic on this new struct (`vm::ImportEnv`) that contains the function pointers for each import in `Instance` to drop (with the `drop` fn pointer) when the `vm::Instance` is being dropped. This fixes the original memory leak. 8. Add wrapper functions inside the host function creation functions which makes the layout of the user supplied env-pointer the responsibility of each function. Thus, rather than `drop` being `Env::drop`, it's a function which frees all wrapper types, traverses indirections and frees the internal `Env` with `Env::drop`. This simplifies code at the cost of making the `host_env` pointer (`vmctx`) not consistent in terms of what it actually points to. This change fixes another memory leak related to the creation of host functions. tl;dr: we're leaning into manually doing virtual method dispatch on `WasmerEnv`s and it actually works great! The biggest issue I have with the PR as-is is that the code isn't as clean/readable/robust as I'd ideally like it to be. Edit (by @Hywan): This PR fixes #1584, #1714, #1865, #1667. # Review - [ ] Add a short description of the the change to the CHANGELOG.md file Co-authored-by: Mark McCaskey <mark@wasmer.io> Co-authored-by: Mark McCaskey <5770194+MarkMcCaskey@users.noreply.github.com>
Build failed: |
counter: Arc<RefCell<i32>>, | ||
counter: Arc<Mutex<i32>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to change the examples? The previous examples should work as is. Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the previous examples were incorrect, we just didn't enforce it. WasmerEnv
needs to be Send
and Sync
because Instance
is Send
and Sync
and we don't synchronize access to functions. So it's possible in safe Rust to get a NativeFunc
on multiple threads that points to the same underlying function. In order for the WasmerEnv
to be able to handle being used this way, it must be both Send
and Sync
. This shouldn't affect much, if any, real code. For example, WASI was not affected at all by this change. Emscripten was only affected because it was doing some weird stuff with pointers.
bors r+ |
TODO: link to issue
This PR contains a number of changes:
WasmerEnv: Clone
clone
function when creating a host function (Notably this is a feature that wouldn't work even if we could use a proper trait object because you can't have aSized
trait object andClone: Sized
).drop
function when creating a host function.Instance
is made. Therefore eachInstance
gets its own, uniqueEnv
per host function withEnv
.wasmer_export::ExportFunction
which frees the original version of theEnv
(the thing that gets cloned each time anInstance
is made) with thedrop
function pointer.vm::Instance
from SoA (struct of arrays) to AoS (array of structs): this uses more memory but is a bit less error prone and can be easily changed later.vm::ImportEnv
) that contains the function pointers for each import inInstance
to drop (with thedrop
fn pointer) when thevm::Instance
is being dropped. This fixes the original memory leak.drop
beingEnv::drop
, it's a function which frees all wrapper types, traverses indirections and frees the internalEnv
withEnv::drop
. This simplifies code at the cost of making thehost_env
pointer (vmctx
) not consistent in terms of what it actually points to. This change fixes another memory leak related to the creation of host functions.tl;dr: we're leaning into manually doing virtual method dispatch on
WasmerEnv
s and it actually works great! The biggest issue I have with the PR as-is is that the code isn't as clean/readable/robust as I'd ideally like it to be.Edit (by @Hywan): This PR fixes #1584, #1714, #1865, #1667.
Review