-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support inline Lua code in per filter config #12516
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks this is awesome. A few comments to get started.
/wait
PerLuaCodeSetup(const std::string& lua_code, ThreadLocal::SlotAllocator& tls); | ||
PerLuaCodeSetup(const std::string& lua_code, ThreadLocal::SlotAllocator& tls, | ||
Event::Dispatcher& dispatcher); | ||
~PerLuaCodeSetup() { dispatcher_.deferredDelete(std::move(lua_state_)); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the reason you added deferred delete here? I think this part is a little tricky unfortunately.
Basically, I think it's possible for the route table to get removed from the main thread but still be stored by an active request. Thus, the request may hold that last reference to the route table and then when it's over it will destroy the route table and the per-lua code. I think this means that this may happen cross thread. Is deferred delete safe cross thread? (It may be I would need to check.) Anyway this is the part that needs very careful thought. Can you describe what you did here and add more comments so we can discuss?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This 'dispatcher_' belongs to the main thread. And add this deferred delete will move TheadLocalState(lua_state_) and destroy it on the main thread.
In this case, no matter on which thread the route table is destroyed, we can guarantee that TheadLocalState(lua_state_) will only be destroyed in the main thread.
In this way, we can ensure that the state of TLS will not be modified on the worker thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rechecked the source code and found that deferredDelete
itself is not thread-safe, and it cannot safe cross thread.
Perhaps we can simply use post callback to let the main thread destroy TheadLocalState(lua_state_).
Just like following code:
~PerLuaCodeSetup() {
dispatcher_.post([state = std::move(lua_state_)]() {});
// dispatcher_.deferredDelete(std::move(lua_state_));
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible this will work but the threading would need to be analyzed. Do you want to try this out and we can take a look?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I will update it this weekend. My own branch also needs to verify this function in the production environment immediately.
It took more time to complete this work than expected. Especially to resolve TSAN and ASAN alarms. Now, it can accept review. An integration test is added to verify thread safety issues in the case of route table updates. @mattklein123 |
if (lua_state_.getGlobalRef(request_function_slot_) == LUA_REFNIL) { | ||
PerLuaCodeSetup::PerLuaCodeSetup(const std::string& lua_code, ThreadLocal::SlotAllocator& tls, | ||
Event::Dispatcher& dispatcher) | ||
: lua_state_(new Filters::Common::Lua::ThreadLocalState{lua_code, tls}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A question, for my own education, why do we need to have lua_state_ as a pointer here? Since this will be still owned by this class right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For thread safety, ThreadLocalState can only be destructured in the main thread. However, when the route table is dynamically updated, the old route configuration may be destroyed in a random worker thread. Its PerLuaCodeSetup will also be destructured in the corresponding worker thread. By converting lua_state_ to a pointer, we can push the destruction of lua_state_ into the main thread with a callback during PerLuaCodeSetup destructuring.
virtual ~PerLuaCodeSetup() {
if (dispatcher_.isThreadSafe()) {
return;
}
dispatcher_.post([state = lua_state_.release()] { delete state; });
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha! I have a new idea. Thank you very much for this question @dio . It reminded me that there's actually a different way to think about making the code a little cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When this is ready for review can you remove [WIP] from the title also? Thank you.
/wait
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All right. 🆗
Signed-off-by: wbpcode <comems@msn.com>
Signed-off-by: wbpcode <comems@msn.com>
It's ready for a review. Thanks. @mattklein123 @dio |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks this is fantastic. Nicely done. Just a few comments.
/wait
// lua_State, some static variables are used to assist memory allocation. Creating lua_State in | ||
// different threads at the same time may cause data race. Although 'LuaJIT' uses retries to | ||
// ensure the final safety of the memory allocation, it may still trigger the TSAN alarm. | ||
static Thread::MutexBasicLockable lua_thread_local_lock; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
potential static fiasco here. Please use CONSTRUCT_ON_FIRST_USE
|
||
~FilterConfigPerRoute() override { | ||
if (per_lua_code_setup_ptr_ && !main_thread_dispatcher_.isThreadSafe()) { | ||
main_thread_dispatcher_.post([setup = per_lua_code_setup_ptr_.release()] { delete setup; }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this is what is needed, but can you add more comments on why this is necessary? Technically I think it's possible this might leak during shutdown but it probably doesn't matter. If you want to avoid a leak no matter what, create a shared_ptr wrapper, put that in the lambda, and then move the unique_ptr inside of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I will add some short comment to explain the necessity and try to fix the possible memory leak.
Signed-off-by: wbpcode <comems@msn.com>
Filed #12896 for the CI issue. Sorry I think we have a unicode issue that needs to be fixed. /wait |
Can you merge main? CI should be fixed. Thank you @asraa! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks looks great. One small comment/question for discussion.
/wait-any
// different threads at the same time may cause data race. Although 'LuaJIT' uses retries to | ||
// ensure the final safety of the memory allocation, it may still trigger the TSAN alarm. | ||
Thread::LockGuard guard(luaThreadLocalLock()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah let's just do that. Please have a good comment on why we are disabling TSAN for the test.
/wait
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok 😄
Signed-off-by: wbpcode <comems@msn.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work!
Signed-off-by: wbpcode comems@msn.com
Commit Message:
Additional Description:
This is a supplement to PR #11235. Check #12460 for more info.
Risk Level: Low
Testing:
Docs Changes:
Release Notes:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Deprecated:]