Lua: Change the TLS callback function type of ThreadLocalState to Upd… #11944

Merged 10 commits on Jul 15, 2020

Member

@wbpcode wbpcode commented Jul 8, 2020

Signed-off-by: wbpcode <comems@msn.com>

Change the type of ThreadLocalState's TLS callback to UpdateCb. This avoids capturing this (the ThreadLocalState instance) in the callback, and so avoids the memory-safety problems that arise when the callback outlives the ThreadLocalState instance.
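To make the idea concrete, here is a minimal sketch of an update-style callback that receives the per-thread object as an argument instead of reaching it through a captured this. FakeSlot, State, LuaLikeThreadLocal, and registerThenDestroyOwner are hypothetical stand-ins written for illustration; they are not Envoy's classes, and the real slot API may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; not Envoy's real types.
struct ThreadLocalObject {
  virtual ~ThreadLocalObject() = default;
};
using ThreadLocalObjectSharedPtr = std::shared_ptr<ThreadLocalObject>;

struct LuaLikeThreadLocal : ThreadLocalObject {
  std::vector<std::string> globals;
};

// Update-style callback: takes the current per-thread object, returns the one to store.
using UpdateCb = std::function<ThreadLocalObjectSharedPtr(ThreadLocalObjectSharedPtr)>;

// Simplified slot for one "worker": callbacks are queued now and run later.
class FakeSlot {
public:
  explicit FakeSlot(ThreadLocalObjectSharedPtr initial) : object_(std::move(initial)) {}
  void runOnAllThreads(UpdateCb cb) { pending_.push_back(std::move(cb)); }
  void drain() {
    for (auto& cb : pending_) {
      object_ = cb(object_);
    }
    pending_.clear();
  }
  ThreadLocalObjectSharedPtr object() const { return object_; }

private:
  ThreadLocalObjectSharedPtr object_;
  std::vector<UpdateCb> pending_;
};

// Owner that may be destroyed before its queued callbacks run.
class State {
public:
  explicit State(std::shared_ptr<FakeSlot> slot) : slot_(std::move(slot)) {}
  void registerGlobal(const std::string& global) {
    // Captures only `global` by value; the callback never touches `this`.
    slot_->runOnAllThreads([global](ThreadLocalObjectSharedPtr previous) {
      auto tls = std::dynamic_pointer_cast<LuaLikeThreadLocal>(previous);
      if (tls) {
        tls->globals.push_back(global);
      }
      return previous;
    });
  }

private:
  std::shared_ptr<FakeSlot> slot_;
};

// Queues a callback, destroys the owner, then runs the callback.
// Returns how many globals the per-thread object ended up with.
std::size_t registerThenDestroyOwner() {
  auto slot = std::make_shared<FakeSlot>(std::make_shared<LuaLikeThreadLocal>());
  {
    State state(slot);
    state.registerGlobal("request_handle");
  } // `state` is destroyed here, before the callback runs.
  slot->drain();
  return std::static_pointer_cast<LuaLikeThreadLocal>(slot->object())->globals.size();
}
```

Because the callback owns everything it needs (a copy of the string plus the object handed to it), the order in which the owner is destroyed and the queue is drained no longer matters in this sketch.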

This modification is at least harmless.

…ateCb

Signed-off-by: wbpcode <comems@msn.com>
@wbpcode
Member Author

wbpcode commented Jul 8, 2020

A related issue is #10241, but I can't confirm that this PR definitely solves it, because I cannot reproduce the issue directly in my environment.

The only way I found to reproduce it is to first disable the duplicate-listener check and then add two listeners with the same name in the first xDS response, which triggers the crash. In that scenario, this PR does fix the crash.

@dio
Member

dio commented Jul 8, 2020

Looks good. Could you provide a test for this case, please? Thanks!

@wbpcode
Member Author

wbpcode commented Jul 9, 2020

> Looks good. Could you provide a test for this case, please? Thanks!

It may be difficult to design a good test case for this, because the crash is hard to reproduce. Anyway, I will try.

Signed-off-by: wbpcode <comems@msn.com>
@wbpcode
Member Author

wbpcode commented Jul 9, 2020

Added a new test for ThreadLocalState. @dio

@wbpcode
Member Author

wbpcode commented Jul 9, 2020

Without the callback type change, running this test results in a crash (backtrace below). After changing the callback type to UpdateCb, the test passes.

[ RUN      ] ThreadSafeTest.StateDestructedBeforeWorkerRun
[2020-07-09 10:23:51.292][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:104] Caught Segmentation fault, suspect faulting address 0x3d055e0
[2020-07-09 10:23:51.292][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
[2020-07-09 10:23:51.292][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:92] Envoy version: 0/1.16.0-dev/redacted/DEBUG/BoringSSL
[2020-07-09 10:23:51.305][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #0: Envoy::SignalAction::sigHandler() [0xad007c]
[2020-07-09 10:23:51.305][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #1: __restore_rt [0x7f676f3418a0]
[2020-07-09 10:23:51.318][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #2: Envoy::Extensions::Filters::Common::Lua::ThreadLocalState::registerType<>()::{lambda()#1}::operator()() [0x441c30]
[2020-07-09 10:23:51.330][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #3: std::_Function_handler<>::_M_invoke() [0x441aed]
[2020-07-09 10:23:51.343][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #4: std::function<>::operator()() [0x47c99e]
[2020-07-09 10:23:51.355][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #5: Envoy::ThreadLocal::InstanceImpl::Bookkeeper::runOnAllThreads()::$_5::operator()() [0x4c4575]
[2020-07-09 10:23:51.368][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #6: std::_Function_handler<>::_M_invoke() [0x4c43bd]
[2020-07-09 10:23:51.368][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #7: std::function<>::operator()() [0x47c99e]
[2020-07-09 10:23:51.380][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #8: Envoy::Event::DispatcherImpl::runPostCallbacks() [0x6b1bed]
[2020-07-09 10:23:51.393][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #9: Envoy::Event::DispatcherImpl::run() [0x6b1b05]
[2020-07-09 10:23:51.406][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #10: Envoy::Extensions::Filters::Common::Lua::(anonymous namespace)::ThreadSafeTest_StateDestructedBeforeWorkerRun_Test::TestBody()::$_0::operator()() [0x445ecb]
[2020-07-09 10:23:51.418][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #11: std::_Function_handler<>::_M_invoke() [0x445d6d]
[2020-07-09 10:23:51.418][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #12: std::function<>::operator()() [0x47c99e]
[2020-07-09 10:23:51.431][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #13: Envoy::Thread::ThreadImplPosix::ThreadImplPosix()::{lambda()#1}::operator()() [0x1670b92]
[2020-07-09 10:23:51.443][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #14: Envoy::Thread::ThreadImplPosix::ThreadImplPosix()::{lambda()#1}::__invoke() [0x1670b65]
[2020-07-09 10:23:51.443][18][critical][backtrace] [bazel-out/k8-fastbuild/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #15: start_thread [0x7f676f3366db]
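The backtrace is consistent with a posted callback running after the object it refers to has been destroyed. The ordering can be observed without undefined behavior by letting a weak_ptr stand in for the captured this. Owner, PostQueue, and ownerGoneWhenCallbackRuns below are hypothetical names used only for this sketch; they do not mirror Envoy's dispatcher.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical owner; stands in for the object whose `this` was captured.
struct Owner {
  int value = 42;
};

// Minimal deferred-callback queue; stands in for a dispatcher's post queue.
struct PostQueue {
  std::vector<std::function<void()>> callbacks;
  void post(std::function<void()> cb) { callbacks.push_back(std::move(cb)); }
  void run() {
    for (auto& cb : callbacks) {
      cb();
    }
    callbacks.clear();
  }
};

// Posts a callback, destroys the owner, then runs the queue.
// Returns true if the owner was already gone when the callback ran.
bool ownerGoneWhenCallbackRuns() {
  PostQueue queue;
  bool owner_was_gone = false;
  {
    auto owner = std::make_shared<Owner>();
    std::weak_ptr<Owner> weak = owner;
    // A raw `this` capture would dangle here; weak_ptr lets us detect it safely.
    queue.post([weak, &owner_was_gone]() { owner_was_gone = weak.expired(); });
  } // The owner is destroyed before the queue runs.
  queue.run();
  return owner_was_gone;
}
```

With a raw pointer in place of the weak_ptr, the callback would read freed memory at the point where this sketch merely reports that the owner has expired.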

wbpcode added 2 commits July 9, 2020 20:13
Signed-off-by: wbpcode <comems@msn.com>
Signed-off-by: wbpcode <comems@msn.com>
Member

@dio dio left a comment

Overall looks good. Flushing some comments and questions.

@@ -71,8 +71,9 @@ int ThreadLocalState::getGlobalRef(uint64_t slot) {
 }

 uint64_t ThreadLocalState::registerGlobal(const std::string& global) {
-  tls_slot_->runOnAllThreads([this, global]() {
-    LuaThreadLocal& tls = tls_slot_->getTyped<LuaThreadLocal>();
+  tls_slot_->runOnAllThreads([global](ThreadLocal::ThreadLocalObjectSharedPtr ptr) -> auto {
Member

Will spelling out -> auto (-> ThreadLocal::ThreadLocalObjectSharedPtr) here be a problem for clang-tidy?

Member

And let's rename ptr to previous?

Member Author

Judging by the existing code, both spelling out the return type and omitting it are acceptable.

Member Author

Anyway, I will remove it to make the code more concise.
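On the language side, a lambda with no trailing return type is deduced as if it had been written with -> auto, so removing it should not change the callback's type. The sketch below checks this with a placeholder Object type; it says nothing about how clang-tidy treats either spelling.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Placeholder type for illustration; not an Envoy type.
struct Object {};
using ObjectSharedPtr = std::shared_ptr<Object>;

// Returns true if all three spellings yield the same return type and value.
bool returnTypesMatch() {
  auto spelled_auto = [](ObjectSharedPtr previous) -> auto { return previous; };
  auto omitted = [](ObjectSharedPtr previous) { return previous; };
  auto explicit_type = [](ObjectSharedPtr previous) -> ObjectSharedPtr { return previous; };

  using A = decltype(spelled_auto(ObjectSharedPtr{}));
  using B = decltype(omitted(ObjectSharedPtr{}));
  using C = decltype(explicit_type(ObjectSharedPtr{}));
  const bool same_types = std::is_same<A, B>::value && std::is_same<B, C>::value &&
                          std::is_same<A, ObjectSharedPtr>::value;

  // Each lambda hands back the very pointer it was given.
  auto p = std::make_shared<Object>();
  return same_types && spelled_auto(p) == p && omitted(p) == p && explicit_type(p) == p;
}
```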

-  tls_slot_->runOnAllThreads([this, global]() {
-    LuaThreadLocal& tls = tls_slot_->getTyped<LuaThreadLocal>();
+  tls_slot_->runOnAllThreads([global](ThreadLocal::ThreadLocalObjectSharedPtr ptr) -> auto {
+    ASSERT(std::dynamic_pointer_cast<LuaThreadLocal>(ptr));
Member

Do we need this assert here?
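For context on what such a check buys: std::dynamic_pointer_cast yields an empty pointer when the object is not of the requested type, so asserting on its result catches a slot that holds the wrong kind of object. The sketch below uses placeholder types (Base, LuaLike, Other) and a hypothetical helper castSucceeds; whether the check is worth keeping in the real code is the reviewer's question and is not settled here.

```cpp
#include <cassert>
#include <memory>

// Placeholder hierarchy for illustration; not Envoy's types.
struct Base {
  virtual ~Base() = default;
};
struct LuaLike : Base {};
struct Other : Base {};

// Returns true only if `ptr` is non-null and actually points at a LuaLike.
bool castSucceeds(const std::shared_ptr<Base>& ptr) {
  return std::dynamic_pointer_cast<LuaLike>(ptr) != nullptr;
}
```

A matching object passes the check, while a sibling type or a null pointer fails it, which is the condition the assertion in the hunk above would flag.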

// Start a new worker thread to execute the callback functions in the worker dispatcher.
Thread::ThreadPtr thread = Thread::threadFactoryForTest().createThread([this]() {
worker_dispatcher_->run(Event::Dispatcher::RunType::Block);
// Verify we have the expected dispatcher for the new thread thread.
Member

for the new worker thread?

@dio
Member

dio commented Jul 9, 2020

@snowp, when you have time, could you take a look at the validity of the test? Thanks!

@dio
Member

dio commented Jul 10, 2020

#12018 is merged. Please sync your branch with head to fix the CI (Linux-x64 asan) issue. Thanks!

@wbpcode
Member Author

wbpcode commented Jul 10, 2020

> #12018 is merged. Please sync your branch with head to fix the CI (Linux-x64 asan) issue. Thanks!

Got it.

@wbpcode
Member Author

wbpcode commented Jul 14, 2020

Could you take some time to move this PR forward? Thanks. @snowp

Contributor

@snowp snowp left a comment

Thanks, this makes sense. Just one question.

};

// Test whether ThreadLocalState can be safely released.
TEST_F(ThreadSafeTest, StateDestructedBeforeWorkerRun) {
Contributor

Does this test fail without the code change in this PR?

Member Author

Yes, it crashes.

Contributor

Thanks for verifying!

@snowp snowp self-assigned this Jul 14, 2020
Contributor

@snowp snowp left a comment

LGTM, thanks!

@snowp snowp merged commit dee8b8d into envoyproxy:master Jul 15, 2020
KBaichoo pushed a commit to KBaichoo/envoy that referenced this pull request Jul 30, 2020
scheler pushed a commit to scheler/envoy that referenced this pull request Aug 4, 2020
lambdai added a commit to istio/envoy that referenced this pull request May 21, 2021
Lua: Change the TLS callback function type of ThreadLocalState to Upd… (envoyproxy#11944)
Update LuaJIT patch - remove MAP_32BIT (envoyproxy#10867)

Signed-off-by: wbpcode <comems@msn.com>
Signed-off-by: John Murray <me@johnmurray.io>
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Change-Id: I62739dc1c3250fcff755d0a6208ec4f78cda695b