Non blocking OlapTableSink #3143
Do you have any benchmarks for this PR?
I only have two test samples for now; our team is still building the test environment. I used broker load, and the scan node did not enforce the mem limit (my test code doesn't contain a80e9bf). If the scan node waits when the mem limit is exceeded, the load would take longer. 5 be:
Compilation failed: There is no |
Oops, my bad. Fixed.
Hi @vagetablechicken, please resolve the conflict.
I want to change yield() to sleep() to avoid keeping the CPU busy. I need to show through testing that the non-blocking sink is faster than the blocking one even with a sleep() after each loop. One more thing: add_row() may block when the mem limit is exceeded, so I'll add the blocking time to the time profile. Your comments are welcome.
@imay @morningman With large cases the sink is obviously faster, so I focused on small cases this time. The test result (interval=10ms) is
be/src/exec/tablet_sink.h
Outdated
const NodeInfo* node_info() const { return _node_info; }
std::string print_load_info() const { return _load_info; }
std::string name() const {
    return "NodeChannel[" + std::to_string(_index_id) + "-" + std::to_string(_node_id) + "]";
Prefer strings::Substitute in gutil/strings/substitute.h.
Including gutil/strings/substitute.h causes a redefinition error:
In file included from /root/incubator-doris/be/src/gutil/strings/stringpiece.h:128:0,
from /root/incubator-doris/be/src/gutil/strings/substitute.h:9,
from /root/incubator-doris/be/src/exec/tablet_sink.cpp:28:
/root/incubator-doris/be/src/gutil/hash/hash.h:251:26: error: redefinition of 'struct __gnu_cxx::hash<Type*>'
template<class T> struct hash<T*> {
^~~~~~~~
In file included from /root/incubator-doris/thirdparty/installed/include/butil/containers/flat_map.h:101:0,
from /root/incubator-doris/be/src/service/brpc.h:48,
from /root/incubator-doris/be/src/util/ref_count_closure.h:24,
from /root/incubator-doris/be/src/exec/tablet_sink.h:36,
from /root/incubator-doris/be/src/exec/tablet_sink.cpp:18:
/root/incubator-doris/thirdparty/installed/include/butil/containers/hash_tables.h:262:8: note: previous definition of 'struct __gnu_cxx::hash<Type*>'
struct hash<Type*> {
^~~~~~~~~~~
In file included from /root/incubator-doris/be/src/gutil/strings/stringpiece.h:128:0,
from /root/incubator-doris/be/src/gutil/strings/substitute.h:9,
from /root/incubator-doris/be/src/exec/tablet_sink.cpp:28:
/root/incubator-doris/be/src/gutil/hash/hash.h:309:8: error: redefinition of 'struct __gnu_cxx::hash<std::pair<_T1, _T2> >'
struct hash<pair<First, Second> > {
^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /root/incubator-doris/thirdparty/installed/include/butil/containers/flat_map.h:101:0,
from /root/incubator-doris/be/src/service/brpc.h:48,
from /root/incubator-doris/be/src/util/ref_count_closure.h:24,
from /root/incubator-doris/be/src/exec/tablet_sink.h:36,
from /root/incubator-doris/be/src/exec/tablet_sink.cpp:18:
/root/incubator-doris/thirdparty/installed/include/butil/containers/hash_tables.h:256:8: note: previous definition of 'struct __gnu_cxx::hash<std::pair<_T1, _T2> >'
struct hash<std::pair<Type1, Type2> > {
I will initialize the name string in init() to avoid building the string on every name() call.
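A minimal sketch of that idea, assuming a simplified stand-in class (NodeChannelSketch and its _name member are hypothetical names for illustration, not the code in this PR): the name is built once in init() with plain std::string concatenation, so no gutil header is needed and name() only returns a cached reference.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for NodeChannel; only the naming logic is shown.
class NodeChannelSketch {
public:
    NodeChannelSketch(int64_t index_id, int64_t node_id)
            : _index_id(index_id), _node_id(node_id) {}

    // Build the display name once instead of on every name() call.
    void init() {
        _name = "NodeChannel[" + std::to_string(_index_id) + "-" +
                std::to_string(_node_id) + "]";
    }

    // Returns the cached string; no allocation happens here.
    const std::string& name() const {
        assert(!_name.empty() && "init() must be called before name()");
        return _name;
    }

private:
    int64_t _index_id;
    int64_t _node_id;
    std::string _name;  // assumption: filled in by init()
};
```

Returning a const reference also avoids copying the string at call sites that only log it.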
d130528 to c9c2741
This reverts commit a74eeca.
Co-Authored-By: Zhao Chun <buaa.zhaoc@gmail.com>
c9c2741 to 2f7eac8
imay left a comment
+1, LGTM.
If there are no other comments, I will merge it tomorrow.
This reverts commit 94539e7.
Implementation Notes
NodeChannel
_cur_batch -> _pending_batches: when _cur_batch is filled up, it is moved to _pending_batches.
add_row() only produces batches.
try_send_and_fetch_status() tries to consume one pending batch. If a packet is already in flight, sending is skipped in this round.
So we can add one sender thread that is in charge of try_send for all node channels.
IndexChannel
init() and open() stay the same.
Use for_each_node_channel() to expose the detailed changes of NodeChannel. (It's easier to read and modify.)
Sender thread
See func OlapTableSink::_send_batch_process()
Why use polling?
If we use wait/notify, a notification is sent when a new batch is generated. We can't skip sending this batch, because the same batch won't be notified again. So wait/notify can't easily avoid blocking.
So I chose polling.
Continuously calling try_send() is wasteful, but it's difficult to set a suitable polling interval. Thus, I added std::this_thread::yield() to give up the time slice and give priority to other processes/threads (if there are any waiting in the queue).
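The NodeChannel producer/consumer split described in these notes can be sketched roughly as follows. This is a simplified model under stated assumptions, not the PR's code: BatchQueueSketch, on_send_done(), and the counters are hypothetical names, a "row" is an int, a "batch" is a vector, and the RPC is simulated by moving the batch into a local list.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical, simplified model of the batch hand-off in a node channel.
class BatchQueueSketch {
public:
    explicit BatchQueueSketch(std::size_t batch_size) : _batch_size(batch_size) {
        assert(batch_size > 0);
    }

    // Producer side: only fills batches, never sends.
    void add_row(int row) {
        _cur_batch.push_back(row);
        if (_cur_batch.size() >= _batch_size) {
            std::lock_guard<std::mutex> l(_lock);
            _pending_batches.push_back(std::move(_cur_batch));
            _cur_batch.clear();  // reset the moved-from vector to a known state
        }
    }

    // Consumer side: hands over at most one pending batch per call and
    // skips the round if a previous send is still in flight.
    bool try_send() {
        std::lock_guard<std::mutex> l(_lock);
        if (_in_flight || _pending_batches.empty()) return false;
        _sent.push_back(std::move(_pending_batches.front()));
        _pending_batches.pop_front();
        _in_flight = true;
        return true;
    }

    // Simulates the RPC completion callback (assumption for this sketch).
    void on_send_done() {
        std::lock_guard<std::mutex> l(_lock);
        _in_flight = false;
    }

    std::size_t pending_count() const {
        std::lock_guard<std::mutex> l(_lock);
        return _pending_batches.size();
    }

    std::size_t sent_count() const {
        std::lock_guard<std::mutex> l(_lock);
        return _sent.size();
    }

private:
    const std::size_t _batch_size;
    std::vector<int> _cur_batch;  // touched only by the producer
    mutable std::mutex _lock;     // guards the members below
    std::deque<std::vector<int>> _pending_batches;
    std::vector<std::vector<int>> _sent;
    bool _in_flight = false;
};
```

Because try_send() returns immediately when a send is in flight, one caller can cycle through many such channels without waiting on any single one.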
Ref #2780 (comment)
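The polling loop from the implementation notes might look like the sketch below. It is an illustration under assumptions, not the body of OlapTableSink::_send_batch_process(): poll_until_stopped, TrySend, and the stop flag are hypothetical names, and each channel is reduced to a callback that attempts one non-blocking send.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-channel hook: attempts one non-blocking send.
using TrySend = std::function<void()>;

// Polls every channel once per round until `stop` becomes true.
// Returns the number of completed rounds.
inline std::size_t poll_until_stopped(const std::vector<TrySend>& channels,
                                      const std::atomic<bool>& stop) {
    assert(!channels.empty());
    std::size_t rounds = 0;
    while (!stop.load(std::memory_order_acquire)) {
        for (const auto& try_send : channels) {
            try_send();
        }
        ++rounds;
        // Give up the rest of the time slice so that other runnable
        // threads or processes can go first.
        std::this_thread::yield();
    }
    return rounds;
}
```

In a real sink this function would run on its own thread, with the stop flag set when the sink closes. Replacing the yield() call with std::this_thread::sleep_for() gives the fixed-interval variant discussed earlier in the conversation.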