-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve encapsulation of expression evaluation classes #7
Conversation
@mbasmanova has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
|
||
void setDictionaryWrap(BufferPtr wrap, BufferPtr wrapNulls) { | ||
wrapEncoding_ = VectorEncoding::Simple::DICTIONARY; | ||
wrap_ = std::move(wrap); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Since wrap BufferPtr and wrapNulls BufferPtr can be set as null, would it make sense to wrap them in a folly::Optional ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- wrap cannot be null; wrapNulls can be
- we should not be using folly::Optional, but std::optional; however, since optional is not used for similar cases elsewhere in the codebase, I'd prefer not to introduce it here
} | ||
|
||
void setConstantWrap(vector_size_t wrapIndex) { | ||
wrapEncoding_ = VectorEncoding::Simple::CONSTANT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: assert wrapindex atleast > 0 since its a constant encoding.
5aa09fa
to
a596e8e
Compare
@mbasmanova has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@mbasmanova merged this pull request in c672df8. |
Co-authored-by: Ferdinand Xu <cheng.a.xu@intel.com>
Co-authored-by: Ferdinand Xu <cheng.a.xu@intel.com>
…ookincubator#7) * Pass project Tests for round-trip trans when batchSize=1 * clean some debug info * change code style and using log instead of cout * Pass project Tests for round-trip plan transform * use full names for more readable in tests * Pass FilterNode Tests for round-trip plan transform * [POAE7-1448] Add AggregateNode, nullValue APIs and pass six tests about transform from velox to substrait * address the comments * [POAE7-1448] pass the round-trip test of aggregatesNode * address comments and update the url of substrait submodule * [POAE7-1497] The header and CMake files after refine * [POAE7-1497] The header and CMake files after refine * address comments * [POAE7-1497]the TypeConvertor of Substrait utils code refactor * [POAE7-1497]the GlobalCommonVariable of Substrait utils code refactor * remove header file except TypeConvertor * make global variables into a singleton class
Summary: There was a race condition that caused ASAN heap-use-after-free in Driver::runInternal. It is possible for the Task to terminate (due to an error) while Driver is off-thread sitting in the executor's queue waiting to be scheduled to run on a thread. If this case, Task may have been terminated and operators have been destroyed just before Driver::runInternal executes. In this case `operators_[curOpIndex_]->stats().addRuntimeStat ("queuedWallNanos", ...);` call would try to access freed memory. The fix is to (1) streamline Driver by removing task_ member variable and use DriverCtx::task instead; (2) move logic for updating stats after the call to task->enter(). If task has terminated, enter() will return StopReason::kTerminate and no Driver will abort any further processing. ```AddressSanitizer: heap-use-after-free #6 0x7f01c29ae9b9 in facebook::velox::exec::OperatorStats::addRuntimeStat(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) velox/exec/Operator.h:144 #7 0x7f01c29afaf7 in facebook::velox::exec::Driver::runInternal(std::shared_ptr<facebook::velox::exec::Driver>&, std::shared_ptr<facebook::velox::exec::BlockingState>*) velox/exec/Driver.cpp:249 #8 0x7f01c29b4174 in facebook::velox::exec::Driver::run(std::shared_ptr<facebook::velox::exec::Driver>) velox/exec/Driver.cpp:415 #9 0x7f01c29cce5f in facebook::velox::exec::Driver::enqueue(std::shared_ptr<facebook::velox::exec::Driver>)::$_0::operator()() const velox/exec/Driver.cpp:161 ``` Pull Request resolved: #1219 Test Plan: $ buck test mode/dev-asan //presto_cpp/main/tests:presto_main_test -- --exact 'presto_cpp/main/tests:presto_main_test - TaskManagerTest.outOfQueryUserMemory' --run-disabled --stress-runs 1000 ``` Summary Pass: 1000 ListingSuccess: 1 Finished test run: https://www.internalfb.com/intern/testinfra/testrun/1125900137408748 ``` Imported from GitHub, without a `Test Plan:` line. Reviewed By: spershin, oerling Differential Revision: D34904220 Pulled By: mbasmanova fbshipit-source-id: d4b88cf9e3ec2884c77371c37c571297a59728b6
Summary: There was a race condition that caused ASAN heap-use-after-free in Driver::runInternal. It is possible for the Task to terminate (due to an error) while Driver is off-thread sitting in the executor's queue waiting to be scheduled to run on a thread. If this case, Task may have been terminated and operators have been destroyed just before Driver::runInternal executes. In this case `operators_[curOpIndex_]->stats().addRuntimeStat ("queuedWallNanos", ...);` call would try to access freed memory. The fix is to (1) streamline Driver by removing task_ member variable and use DriverCtx::task instead; (2) move logic for updating stats after the call to task->enter(). If task has terminated, enter() will return StopReason::kTerminate and no Driver will abort any further processing. ```AddressSanitizer: heap-use-after-free facebookincubator#6 0x7f01c29ae9b9 in facebook::velox::exec::OperatorStats::addRuntimeStat(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) velox/exec/Operator.h:144 facebookincubator#7 0x7f01c29afaf7 in facebook::velox::exec::Driver::runInternal(std::shared_ptr<facebook::velox::exec::Driver>&, std::shared_ptr<facebook::velox::exec::BlockingState>*) velox/exec/Driver.cpp:249 facebookincubator#8 0x7f01c29b4174 in facebook::velox::exec::Driver::run(std::shared_ptr<facebook::velox::exec::Driver>) velox/exec/Driver.cpp:415 facebookincubator#9 0x7f01c29cce5f in facebook::velox::exec::Driver::enqueue(std::shared_ptr<facebook::velox::exec::Driver>)::$_0::operator()() const velox/exec/Driver.cpp:161 ``` Pull Request resolved: facebookincubator#1219 Test Plan: $ buck test mode/dev-asan //presto_cpp/main/tests:presto_main_test -- --exact 'presto_cpp/main/tests:presto_main_test - TaskManagerTest.outOfQueryUserMemory' --run-disabled --stress-runs 1000 ``` Summary Pass: 1000 ListingSuccess: 1 Finished test run: https://www.internalfb.com/intern/testinfra/testrun/1125900137408748 ``` Imported from GitHub, without a `Test Plan:` line. Reviewed By: spershin, oerling Differential Revision: D34904220 Pulled By: mbasmanova fbshipit-source-id: d4b88cf9e3ec2884c77371c37c571297a59728b6
Summary: Enhance printExprWithStats to identify common-sub expressions. For example, `c0 + c1` is a common sub-expression in `"(c0 + c1) % 5", " (c0 + c1) % 3"` expression set. It is evaluated only once and there is a single Expr object that represents it. That object appears in the expression tree twice. printExprWithStats does not show the runtime stats for second instance of that expression and instead annotates it with `[CSE https://github.com/facebookincubator/velox/issues/2]`, where CSE stands for common sub-expression and 2 refers to the first instance of the expression. ``` mod [cpu time: 50.49us, rows: 1024] -> BIGINT [#1] cast(plus as BIGINT) [cpu time: 68.15us, rows: 1024] -> BIGINT [#2] plus [cpu time: 51.84us, rows: 1024] -> INTEGER [#3] c0 [cpu time: 0ns, rows: 0] -> INTEGER [#4] c1 [cpu time: 0ns, rows: 0] -> INTEGER [#5] 5:BIGINT [cpu time: 0ns, rows: 0] -> BIGINT [#6] mod [cpu time: 49.29us, rows: 1024] -> BIGINT [#7] cast((plus(c0, c1)) as BIGINT) -> BIGINT [CSE #2] 3:BIGINT [cpu time: 0ns, rows: 0] -> BIGINT [#8] ``` Pull Request resolved: #1500 Reviewed By: Yuhta Differential Revision: D35994836 Pulled By: mbasmanova fbshipit-source-id: 6bacbbe61b68dad97ce2fd5f99610c4ad55897be
…tor#1500) Summary: Enhance printExprWithStats to identify common-sub expressions. For example, `c0 + c1` is a common sub-expression in `"(c0 + c1) % 5", " (c0 + c1) % 3"` expression set. It is evaluated only once and there is a single Expr object that represents it. That object appears in the expression tree twice. printExprWithStats does not show the runtime stats for second instance of that expression and instead annotates it with `[CSE https://github.com/facebookincubator/velox/issues/2]`, where CSE stands for common sub-expression and 2 refers to the first instance of the expression. ``` mod [cpu time: 50.49us, rows: 1024] -> BIGINT [#1] cast(plus as BIGINT) [cpu time: 68.15us, rows: 1024] -> BIGINT [facebookincubator#2] plus [cpu time: 51.84us, rows: 1024] -> INTEGER [facebookincubator#3] c0 [cpu time: 0ns, rows: 0] -> INTEGER [facebookincubator#4] c1 [cpu time: 0ns, rows: 0] -> INTEGER [facebookincubator#5] 5:BIGINT [cpu time: 0ns, rows: 0] -> BIGINT [facebookincubator#6] mod [cpu time: 49.29us, rows: 1024] -> BIGINT [facebookincubator#7] cast((plus(c0, c1)) as BIGINT) -> BIGINT [CSE facebookincubator#2] 3:BIGINT [cpu time: 0ns, rows: 0] -> BIGINT [facebookincubator#8] ``` Pull Request resolved: facebookincubator#1500 Reviewed By: Yuhta Differential Revision: D35994836 Pulled By: mbasmanova fbshipit-source-id: 6bacbbe61b68dad97ce2fd5f99610c4ad55897be
Summary: There was a race condition that caused ASAN heap-use-after-free in Driver::runInternal. It is possible for the Task to terminate (due to an error) while Driver is off-thread sitting in the executor's queue waiting to be scheduled to run on a thread. If this case, Task may have been terminated and operators have been destroyed just before Driver::runInternal executes. In this case `operators_[curOpIndex_]->stats().addRuntimeStat ("queuedWallNanos", ...);` call would try to access freed memory. The fix is to (1) streamline Driver by removing task_ member variable and use DriverCtx::task instead; (2) move logic for updating stats after the call to task->enter(). If task has terminated, enter() will return StopReason::kTerminate and no Driver will abort any further processing. ```AddressSanitizer: heap-use-after-free facebookincubator#6 0x7f01c29ae9b9 in facebook::velox::exec::OperatorStats::addRuntimeStat(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) velox/exec/Operator.h:144 facebookincubator#7 0x7f01c29afaf7 in facebook::velox::exec::Driver::runInternal(std::shared_ptr<facebook::velox::exec::Driver>&, std::shared_ptr<facebook::velox::exec::BlockingState>*) velox/exec/Driver.cpp:249 facebookincubator#8 0x7f01c29b4174 in facebook::velox::exec::Driver::run(std::shared_ptr<facebook::velox::exec::Driver>) velox/exec/Driver.cpp:415 facebookincubator#9 0x7f01c29cce5f in facebook::velox::exec::Driver::enqueue(std::shared_ptr<facebook::velox::exec::Driver>)::$_0::operator()() const velox/exec/Driver.cpp:161 ``` Pull Request resolved: facebookincubator#1219 Test Plan: $ buck test mode/dev-asan //presto_cpp/main/tests:presto_main_test -- --exact 'presto_cpp/main/tests:presto_main_test - TaskManagerTest.outOfQueryUserMemory' --run-disabled --stress-runs 1000 ``` Summary Pass: 1000 ListingSuccess: 1 Finished test run: https://www.internalfb.com/intern/testinfra/testrun/1125900137408748 ``` Imported from GitHub, without a `Test Plan:` line. Reviewed By: spershin, oerling Differential Revision: D34904220 Pulled By: mbasmanova fbshipit-source-id: d4b88cf9e3ec2884c77371c37c571297a59728b6
…tor#1500) Summary: Enhance printExprWithStats to identify common-sub expressions. For example, `c0 + c1` is a common sub-expression in `"(c0 + c1) % 5", " (c0 + c1) % 3"` expression set. It is evaluated only once and there is a single Expr object that represents it. That object appears in the expression tree twice. printExprWithStats does not show the runtime stats for second instance of that expression and instead annotates it with `[CSE https://github.com/facebookincubator/velox/issues/2]`, where CSE stands for common sub-expression and 2 refers to the first instance of the expression. ``` mod [cpu time: 50.49us, rows: 1024] -> BIGINT [facebookincubator#1] cast(plus as BIGINT) [cpu time: 68.15us, rows: 1024] -> BIGINT [facebookincubator#2] plus [cpu time: 51.84us, rows: 1024] -> INTEGER [facebookincubator#3] c0 [cpu time: 0ns, rows: 0] -> INTEGER [facebookincubator#4] c1 [cpu time: 0ns, rows: 0] -> INTEGER [facebookincubator#5] 5:BIGINT [cpu time: 0ns, rows: 0] -> BIGINT [facebookincubator#6] mod [cpu time: 49.29us, rows: 1024] -> BIGINT [facebookincubator#7] cast((plus(c0, c1)) as BIGINT) -> BIGINT [CSE facebookincubator#2] 3:BIGINT [cpu time: 0ns, rows: 0] -> BIGINT [facebookincubator#8] ``` Pull Request resolved: facebookincubator#1500 Reviewed By: Yuhta Differential Revision: D35994836 Pulled By: mbasmanova fbshipit-source-id: 6bacbbe61b68dad97ce2fd5f99610c4ad55897be
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
…cebookincubator#7) Always compile Substrait (facebookincubator#8)
Summary: The default algorithm used is MD5. However, MD5 is not supported with fips and can cause a SIGSEGV. Set CRC32 instead which is a standard for checksum computation and is not restricted by fips. crc32 is also faster than md5. Internally at IBM, we hit the following SIGSEGV ``` 0x0000000000000000 in ?? () Missing separate debuginfos, use: dnf debuginfo-install openssl-fips-provider-3.0.7-2.el9.x86_64 xz-libs-5.2.5-8.el9_0.x86_64 (gdb) bt #0 0x0000000000000000 in ?? () #1 0x0000000004e5f89b in Aws::Utils::Crypto::MD5OpenSSLImpl::Calculate(std::istream&) () #2 0x0000000004efd298 in Aws::Utils::Crypto::MD5::Calculate(std::istream&) () #3 0x0000000004ef71b9 in Aws::Utils::HashingUtils::CalculateMD5(std::iostream&) () #4 0x0000000004e8ebe8 in Aws::Client::AWSClient::AddChecksumToRequest(std::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&) const () #5 0x0000000004e8ed15 in Aws::Client::AWSClient::BuildHttpRequest(Aws::AmazonWebServiceRequest const&, std::shared_ptr<Aws::Http::HttpRequest> const&) const () #6 0x0000000004e977f9 in Aws::Client::AWSClient::AttemptOneRequest(std::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&, char const*, char const*, char const*) const () #7 0x0000000004e9e1c0 in Aws::Client::AWSClient::AttemptExhaustively(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const () #8 0x0000000004ea15e8 in Aws::Client::AWSXMLClient::MakeRequest(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const () #9 0x0000000004ea1f70 in Aws::Client::AWSXMLClient::MakeRequest(Aws::AmazonWebServiceRequest const&, Aws::Endpoint::AWSEndpoint const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const () #10 0x0000000004de0933 in Aws::S3::S3Client::UploadPart(Aws::S3::Model::UploadPartRequest const&) const::{lambda()https://github.com/facebookincubator/velox/issues/1}::operator()() const () #11 0x0000000004de0b8c in std::_Function_handler<Aws::Utils::Outcome<Aws::S3::Model::UploadPartResult, Aws::S3::S3Error> (), Aws::S3::S3Client::UploadPart(Aws::S3::Model::UploadPartRequest const&) const::{lambda()https://github.com/facebookincubator/velox/issues/1}>::_M_invoke(std::_Any_data const&) () #12 0x0000000004e19317 in Aws::Utils::Outcome<Aws::S3::Model::UploadPartResult, Aws::S3::S3Error> smithy::components::tracing::TracingUtils::MakeCallWithTiming<Aws::Utils::Outcome<Aws::S3::Model::UploadPartResult, Aws::S3::S3Error> >(std::function<Aws::Utils::Outcome<Aws::S3::Model::UploadPartResult, Aws::S3::S3Error> ()>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, smithy::components::tracing::Meter const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () #13 0x0000000004d7cdcf in Aws::S3::S3Client::UploadPart(Aws::S3::Model::UploadPartRequest const&) const () #14 0x0000000004ca4aa6 in facebook::velox::filesystems::S3WriteFile::Impl::uploadPart (this=0x7fffec2f09a0, part=..., isLast=true) at /root/velox/velox/connectors/hive/storage_adapters/s3fs/S3FileSystem.cpp:380 ``` Pull Request resolved: #10801 Reviewed By: amitkdutta Differential Revision: D61671574 Pulled By: kgpai fbshipit-source-id: 34c7b777b3fde0659ef74c4fbfd93740fdfa3f7c
No description provided.