Protect access to exec env #1991
Conversation
Force-pushed from 0f31f28 to 9df7a5b
Force-pushed from 9df7a5b to de86504
Force-pushed from d44f851 to 5550b56
ret = wasm_cluster_create_thread(exec_env, new_module_inst, false,
                                 thread_start, thread_start_arg);
if (ret != 0) {
    LOG_ERROR("Failed to spawn a new thread");
    goto thread_spawn_fail;
}
os_mutex_unlock(&exec_env->wait_lock);
Is there a data race here?
Better not to remove the lock; instead, add a cond_wait like pthread_create_wrapper in lib_pthread_wrapper.c:
wasm-micro-runtime/core/iwasm/libraries/lib-pthread/lib_pthread_wrapper.c
Lines 665 to 668 in 38c67b3

/* Wait for the thread routine to assign the exec_env to
   thread_info_node, otherwise the exec_env in the thread
   info node may be NULL in the next pthread API call */
os_cond_wait(&exec_env->wait_cond, &exec_env->wait_lock);
And in thread_start, add a cond_signal like pthread_start_routine does:

os_cond_signal(&parent_exec_env->wait_cond);
The reason why I removed it is that, after taking the exec_env lock there, the cluster lock gets acquired:

os_mutex_lock(&cluster->lock);

but then an attempt to take the exec_env lock is made again at https://github.com/eloparco/wasm-micro-runtime/blob/5550b5647037a7cc9674dda03413a95e3c06cba3/core/iwasm/libraries/thread-mgr/thread_manager.c#L946, generating a warning in the sanitizer for a potential deadlock.
Let me see how it can be fixed.
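To see why the sanitizer complains, here is a minimal standalone sketch (hypothetical code, not the actual WAMR sources) of the lock-order inversion that ThreadSanitizer reports as a potential deadlock: one thread takes an exec_env-style lock and then a cluster-style lock, the other takes them in the opposite order.

/* lock_order.c - build with: cc -fsanitize=thread -pthread lock_order.c
   Hypothetical reduction of the pattern described above. The two threads
   run one after the other, so the program never actually hangs, but
   ThreadSanitizer still flags the inverted lock order as a potential
   deadlock. */
#include <pthread.h>

static pthread_mutex_t exec_env_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t cluster_lock = PTHREAD_MUTEX_INITIALIZER;

static void *
spawner(void *arg)
{
    /* Order: exec_env lock, then cluster lock */
    pthread_mutex_lock(&exec_env_lock);
    pthread_mutex_lock(&cluster_lock);
    pthread_mutex_unlock(&cluster_lock);
    pthread_mutex_unlock(&exec_env_lock);
    return NULL;
}

static void *
terminator(void *arg)
{
    /* Order: cluster lock, then exec_env lock (inverted) */
    pthread_mutex_lock(&cluster_lock);
    pthread_mutex_lock(&exec_env_lock);
    pthread_mutex_unlock(&exec_env_lock);
    pthread_mutex_unlock(&cluster_lock);
    return NULL;
}

int
main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, spawner, NULL);
    pthread_join(t1, NULL);
    pthread_create(&t2, NULL, terminator, NULL);
    pthread_join(t2, NULL);
    return 0;
}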
It seems that the thread hasn't actually started yet, but it has already been added to the cluster's exec_env list, and another thread (seemingly the main thread) tries to terminate it. Maybe after adding logic similar to pthread_create_wrapper, the issue can be resolved, since the main thread will wait until the child thread has actually started.
Here, try adding a cond_wait:

os_mutex_lock(&exec_env->wait_lock);
ret = wasm_cluster_create_thread(exec_env, new_module_inst, false,
                                 thread_start, thread_start_arg);
...
os_cond_wait(&exec_env->wait_cond, &exec_env->wait_lock);
os_mutex_unlock(&exec_env->wait_lock);

And in thread_start:

os_mutex_lock(&parent_exec_env->wait_lock);
...
os_cond_signal(&parent_exec_env->wait_cond);
os_mutex_unlock(&parent_exec_env->wait_lock);
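For reference, outside the WAMR os_* wrappers the same handshake can be expressed with plain POSIX threads. This is a minimal sketch; the child_started flag is an illustrative addition, and it also guards against spurious wakeups of pthread_cond_wait.

/* handshake.c - build with: cc -fsanitize=thread -pthread handshake.c
   Minimal sketch of the suggested handshake: the parent blocks until the
   child signals that it has actually started running. */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t wait_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wait_cond = PTHREAD_COND_INITIALIZER;
static bool child_started = false; /* predicate for the condvar wait */

static void *
thread_start(void *arg)
{
    pthread_mutex_lock(&wait_lock);
    child_started = true;
    pthread_cond_signal(&wait_cond);
    pthread_mutex_unlock(&wait_lock);
    /* ... the actual thread body would run here ... */
    return NULL;
}

int
main(void)
{
    pthread_t tid;

    pthread_mutex_lock(&wait_lock);
    if (pthread_create(&tid, NULL, thread_start, NULL) != 0) {
        pthread_mutex_unlock(&wait_lock);
        return 1;
    }
    /* Wait until the child has really started before touching any state
       the child is responsible for initializing. */
    while (!child_started)
        pthread_cond_wait(&wait_cond, &wait_lock);
    pthread_mutex_unlock(&wait_lock);

    pthread_join(tid, NULL);
    puts("child started and joined");
    return 0;
}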
It actually only happens when it's the spawned thread that calls proc_exit.
I see that for pthreads that lock is used to wait for the update of ThreadInfoNode:
wasm-micro-runtime/core/iwasm/libraries/lib-pthread/lib_pthread_wrapper.c
Lines 665 to 668 in 38c67b3

/* Wait for the thread routine to assign the exec_env to
   thread_info_node, otherwise the exec_env in the thread
   info node may be NULL in the next pthread API call */
os_cond_wait(&exec_env->wait_cond, &exec_env->wait_lock);
In lib_wasi_threads_wrapper.c we don't use ThreadInfoNode, so I don't think the lock there is even needed.
OK, got it, thanks.
Great, once this one is merged, we can merge #1985 too
Force-pushed from 5550b56 to 15a2826
Force-pushed from 15a2826 to 3ff2609
LGTM
We have observed a significant performance degradation after merging #1991. Instead of protecting suspend flags with a mutex, we implement the flags as an atomic variable and only use a mutex when atomics are not available on a given platform.
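A minimal sketch of that pattern in C11 (the suspend_flags_t type and the helper names are illustrative, not the actual WAMR definitions):

/* suspend_flags.c - atomic flags with a mutex fallback when C11 atomics
   are unavailable. Names and layout are illustrative only. */
#include <stdint.h>

#ifndef __STDC_NO_ATOMICS__
#include <stdatomic.h>

typedef _Atomic uint32_t suspend_flags_t;

static inline void
suspend_flags_set(suspend_flags_t *flags, uint32_t bit)
{
    atomic_fetch_or(flags, bit); /* lock-free read-modify-write */
}

static inline uint32_t
suspend_flags_get(suspend_flags_t *flags)
{
    return atomic_load(flags);
}

#else /* no atomics: fall back to a mutex */
#include <pthread.h>

typedef uint32_t suspend_flags_t;
static pthread_mutex_t flags_lock = PTHREAD_MUTEX_INITIALIZER;

static inline void
suspend_flags_set(suspend_flags_t *flags, uint32_t bit)
{
    pthread_mutex_lock(&flags_lock);
    *flags |= bit;
    pthread_mutex_unlock(&flags_lock);
}

static inline uint32_t
suspend_flags_get(suspend_flags_t *flags)
{
    pthread_mutex_lock(&flags_lock);
    uint32_t v = *flags;
    pthread_mutex_unlock(&flags_lock);
    return v;
}
#endif

On the fast path this turns every flag check into a single atomic load instead of a lock/unlock pair, which is where the recovered performance would come from.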
(bytecodealliance#1991) Data race may occur when accessing exec_env's fields, e.g. suspend_flags and handle. Add lock `exec_env->wait_lock` for them to resolve the issue.
Fix concurrent access to the exec env: it is accessed both by the thread setting the suspend flags (to stop other threads) and by threads in the same cluster reading them to check for termination.
Tested by using the thread sanitizer and running the WASI tests in the proposal: https://github.com/WebAssembly/wasi-threads/tree/main/test/testsuite
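For anyone reproducing this, here is a minimal hypothetical example (not code from the PR) of the kind of unsynchronized flag access the thread sanitizer reports as a data race:

/* race.c - build with: cc -fsanitize=thread -pthread race.c
   The parent writes a plain (non-atomic) flag while the child reads it
   with no synchronization in between, which TSan reports as a data race. */
#include <pthread.h>
#include <stdio.h>

static int suspend_flag = 0; /* plain int: concurrent access is a race */

static void *
worker(void *arg)
{
    printf("flag = %d\n", suspend_flag); /* racy read */
    return NULL;
}

int
main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);
    suspend_flag = 1; /* racy write, unordered w.r.t. the child's read */
    pthread_join(t, NULL);
    return 0;
}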