Remove a quip link in comments #2
Closed
Differential Revision: D30185776 fbshipit-source-id: b1eb615ac560a8ba1ca6e454d138821f90aa8017
facebook-github-bot
added
CLA Signed
fb-exported
labels
Aug 9, 2021
This pull request was exported from Phabricator. Differential Revision: D30185776
mbasmanova
approved these changes
Aug 9, 2021
This pull request has been merged in bcff77a.
winningsix
pushed a commit
to winningsix/velox-1
that referenced
this pull request
Sep 27, 2021
* Init part for operator framework
* Enabled Operator test
* Code style fix
ZJie1
added a commit
to ZJie1/velox
that referenced
this pull request
Mar 15, 2022
…bator#2)
* Pass project Tests for round-trip trans when batchSize=1
* clean some debug info
* change code style and using log instead of cout
* Pass project Tests for round-trip plan transform
* use full names for more readable in tests
* Pass FilterNode Tests for round-trip plan transform
* [POAE7-1448] Add AggregateNode, nullValue APIs and pass six tests about transform from velox to substrait
* address the comments
* [POAE7-1448] pass the round-trip test of aggregatesNode
* address comments and update the url of substrait submodule
facebook-github-bot
pushed a commit
that referenced
this pull request
Apr 28, 2022
Summary: Enhance printExprWithStats to identify common sub-expressions.

For example, `c0 + c1` is a common sub-expression in the `"(c0 + c1) % 5", "(c0 + c1) % 3"` expression set. It is evaluated only once, and a single Expr object represents it. That object appears in the expression tree twice. printExprWithStats does not show the runtime stats for the second instance of that expression and instead annotates it with `[CSE #2]`, where CSE stands for common sub-expression and 2 refers to the first instance of the expression.

```
mod [cpu time: 50.49us, rows: 1024] -> BIGINT [#1]
   cast(plus as BIGINT) [cpu time: 68.15us, rows: 1024] -> BIGINT [#2]
      plus [cpu time: 51.84us, rows: 1024] -> INTEGER [#3]
         c0 [cpu time: 0ns, rows: 0] -> INTEGER [#4]
         c1 [cpu time: 0ns, rows: 0] -> INTEGER [#5]
   5:BIGINT [cpu time: 0ns, rows: 0] -> BIGINT [#6]

mod [cpu time: 49.29us, rows: 1024] -> BIGINT [#7]
   cast((plus(c0, c1)) as BIGINT) -> BIGINT [CSE #2]
   3:BIGINT [cpu time: 0ns, rows: 0] -> BIGINT [#8]
```

Pull Request resolved: #1500

Reviewed By: Yuhta

Differential Revision: D35994836

Pulled By: mbasmanova

fbshipit-source-id: 6bacbbe61b68dad97ce2fd5f99610c4ad55897be
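The annotation scheme described in this commit message can be illustrated with a small standalone sketch. It is not Velox code: `Node`, `makeNode`, `printTree`, and `printSharedExample` are hypothetical names, the cast node and runtime stats are left out, and the only point is that a shared node gets an id on its first visit and a `[CSE #N]` marker on later visits.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified expression node; not Velox's Expr class.
struct Node {
  std::string name;
  std::vector<std::shared_ptr<Node>> inputs;
};

std::shared_ptr<Node> makeNode(
    std::string name,
    std::vector<std::shared_ptr<Node>> inputs = {}) {
  auto node = std::make_shared<Node>();
  node->name = std::move(name);
  node->inputs = std::move(inputs);
  return node;
}

// Prints one node per line. The first visit of a node assigns the next id
// and prints "[#N]"; a later visit of the same object prints "[CSE #N]"
// and does not descend into its inputs again.
void printTree(
    const std::shared_ptr<Node>& node,
    int depth,
    std::unordered_map<const Node*, int>& ids,
    std::ostringstream& out) {
  out << std::string(depth * 3, ' ') << node->name;
  auto it = ids.find(node.get());
  if (it != ids.end()) {
    out << " [CSE #" << it->second << "]\n";
    return;
  }
  const int id = static_cast<int>(ids.size()) + 1;
  ids.emplace(node.get(), id);
  out << " [#" << id << "]\n";
  for (const auto& input : node->inputs) {
    printTree(input, depth + 1, ids, out);
  }
}

// Builds "(c0 + c1) % 5" and "(c0 + c1) % 3" with one shared "plus" node
// and returns the annotated printout of both trees.
std::string printSharedExample() {
  auto plus = makeNode("plus", {makeNode("c0"), makeNode("c1")});
  auto mod5 = makeNode("mod", {plus, makeNode("5")});
  auto mod3 = makeNode("mod", {plus, makeNode("3")});

  std::unordered_map<const Node*, int> ids;
  std::ostringstream out;
  printTree(mod5, 0, ids, out);
  printTree(mod3, 0, ids, out);
  return out.str();
}
```

Calling `printSharedExample()` prints `plus [#2]` under the first `mod` and `plus [CSE #2]` under the second, mirroring the shape of the output quoted in the commit message.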
zhouyuan
pushed a commit
to zhouyuan/velox
that referenced
this pull request
May 11, 2022
Arty-Maly
pushed a commit
to Arty-Maly/velox
that referenced
this pull request
May 13, 2022
shiyu-bytedance
pushed a commit
to shiyu-bytedance/velox-1
that referenced
this pull request
Aug 18, 2022
mbasmanova
added a commit
to mbasmanova/velox-1
that referenced
this pull request
Sep 7, 2022
Summary: Fix data race exposed by TSAN:

```
WARNING: ThreadSanitizer: data race (pid=3333333)
  Write of size 8 at 0x7ffedeec8260 by main thread:
    #2 DriverTest_yield_Test::TestBody() velox/exec/tests/DriverTest.cpp:478 (velox_exec_test+0x5815ab)

  Previous read of size 8 at 0x7ffedeec8260 by thread T32:
    #2 DriverTest_yield_Test::TestBody()::$_3::operator()() const velox/exec/tests/DriverTest.cpp:480 (velox_exec_test+0x599aae)
```

Differential Revision: D39311571

fbshipit-source-id: 4c1f9477a07f8dd2f1e41c52a4ac341e34cf95bb
mbasmanova
added a commit
to mbasmanova/velox-1
that referenced
this pull request
Sep 7, 2022
Summary: Fix data race exposed by TSAN:

```
WARNING: ThreadSanitizer: data race (pid=3539135)
  Write of size 8 at 0x7ffd6053de08 by main thread:
    #2 DriverTest_pauserNode_Test::TestBody() velox/exec/tests/DriverTest.cpp:677 (velox_exec_test+0x58205e)

  Previous read of size 8 at 0x7ffd6053de08 by thread T32:
    #2 DriverTest_pauserNode_Test::TestBody()::$_6::operator()() const velox/exec/tests/DriverTest.cpp:680 (velox_exec_test+0x5a4441)
```

Differential Revision: D39311986

fbshipit-source-id: f4260d82f18aa434e7d82e5a0f53fe2fca4a0841
facebook-github-bot
pushed a commit
that referenced
this pull request
Sep 7, 2022
Summary: Pull Request resolved: #2465

Fix data race exposed by TSAN:

```
WARNING: ThreadSanitizer: data race (pid=3333333)
  Write of size 8 at 0x7ffedeec8260 by main thread:
    #2 DriverTest_yield_Test::TestBody() velox/exec/tests/DriverTest.cpp:478 (velox_exec_test+0x5815ab)

  Previous read of size 8 at 0x7ffedeec8260 by thread T32:
    #2 DriverTest_yield_Test::TestBody()::$_3::operator()() const velox/exec/tests/DriverTest.cpp:480 (velox_exec_test+0x599aae)
```

Reviewed By: xiaoxmeng

Differential Revision: D39311571

fbshipit-source-id: 1062e97c874d59333d0534ce23660f2fbbd1ae18
facebook-github-bot
pushed a commit
that referenced
this pull request
Sep 7, 2022
Summary: Pull Request resolved: #2466

Fix data race exposed by TSAN:

```
WARNING: ThreadSanitizer: data race (pid=3539135)
  Write of size 8 at 0x7ffd6053de08 by main thread:
    #2 DriverTest_pauserNode_Test::TestBody() velox/exec/tests/DriverTest.cpp:677 (velox_exec_test+0x58205e)

  Previous read of size 8 at 0x7ffd6053de08 by thread T32:
    #2 DriverTest_pauserNode_Test::TestBody()::$_6::operator()() const velox/exec/tests/DriverTest.cpp:680 (velox_exec_test+0x5a4441)
```

Reviewed By: xiaoxmeng

Differential Revision: D39311986

fbshipit-source-id: c7f36bac377f243b879eb7c11bd8159e36b5fbee
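The two TSAN reports above describe the same pattern: the test body on the main thread writes an 8-byte variable that a lambda running on another thread reads. The commit messages do not show the actual fix, so the following is only a sketch of one common remedy, under the assumption that the shared variable can be made atomic. The name `readAfterPublish` and the value 42 are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical reproduction of the reported pattern with the race removed:
// the main thread publishes a value and a second thread waits for it.
// With a plain int64_t this would be an unsynchronized write/read pair;
// std::atomic makes both accesses well defined.
int64_t readAfterPublish() {
  std::atomic<int64_t> shared{0};
  int64_t seen = 0;

  std::thread reader([&]() {
    // Spin until the main thread has stored a non-zero value.
    while ((seen = shared.load()) == 0) {
      std::this_thread::yield();
    }
  });

  shared.store(42); // Write by the main thread.
  reader.join(); // join() orders the reader's write to 'seen' before this read.
  return seen;
}
```

Other fixes are equally plausible for the real tests, for example guarding the variable with a mutex or initializing it before the thread starts.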
pedroerp
added a commit
to pedroerp/velox-1
that referenced
this pull request
Oct 11, 2022
pedroerp
added a commit
to pedroerp/velox-1
that referenced
this pull request
Oct 12, 2022
pedroerp
added a commit
to pedroerp/velox-1
that referenced
this pull request
Oct 12, 2022
pedroerp
added a commit
to pedroerp/velox-1
that referenced
this pull request
Oct 12, 2022
mbasmanova
added a commit
to mbasmanova/velox-1
that referenced
this pull request
Sep 15, 2023
Summary: Pull Request resolved: facebookincubator#6566

array_constructor is very slow: facebookincubator#5958 (comment)

array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types:

```
FlatVector.h

void copyRanges(
    const BaseVector* source,
    const folly::Range<const BaseVector::CopyRange*>& ranges) override {
  for (auto& range : ranges) {
    copy(source, range.targetIndex, range.sourceIndex, range.count);
  }
}
```

FlatVector<T>::copy(source, rows, toSourceRow) is faster.

Switching from copyRanges to copy speeds up array_constructor significantly for primitive types and structs. However, this change makes arrays and maps slower. The slowness is due to ArrayVector and MapVector not having an implementation of copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes a performance regression.

Hence, we use copy for primitive types and structs of these, and copyRanges for everything else. We also optimize FlatVector::copyRanges (which is used by Array/MapVector::copyRanges).
```
Before:

array_constructor_ARRAY_nullfree##1  16.80ms  59.53
array_constructor_ARRAY_nullfree##2  27.02ms  37.01
array_constructor_ARRAY_nullfree##3  38.03ms  26.30
array_constructor_ARRAY_nullfree##2_null  52.86ms  18.92
array_constructor_ARRAY_nullfree##2_const  54.97ms  18.19
array_constructor_ARRAY_nulls##1  30.61ms  32.66
array_constructor_ARRAY_nulls##2  55.01ms  18.18
array_constructor_ARRAY_nulls##3  80.69ms  12.39
array_constructor_ARRAY_nulls##2_null  69.10ms  14.47
array_constructor_ARRAY_nulls##2_const  103.85ms  9.63

After:

array_constructor_ARRAY_nullfree##1  15.43ms  64.80
array_constructor_ARRAY_nullfree##2  24.50ms  40.81
array_constructor_ARRAY_nullfree##3  35.12ms  28.47
array_constructor_ARRAY_nullfree##2_null  54.52ms  18.34
array_constructor_ARRAY_nullfree##2_const  43.28ms  23.10
array_constructor_ARRAY_nulls##1  28.60ms  34.96
array_constructor_ARRAY_nulls##2  50.82ms  19.68
array_constructor_ARRAY_nulls##3  70.31ms  14.22
array_constructor_ARRAY_nulls##2_null  64.43ms  15.52
array_constructor_ARRAY_nulls##2_const  80.71ms  12.39

Before:

array_constructor_INTEGER_nullfree##1  19.72ms  50.71
array_constructor_INTEGER_nullfree##2  34.51ms  28.97
array_constructor_INTEGER_nullfree##3  47.95ms  20.86
array_constructor_INTEGER_nullfree##2_null  58.68ms  17.04
array_constructor_INTEGER_nullfree##2_const  45.15ms  22.15
array_constructor_INTEGER_nulls##1  29.99ms  33.34
array_constructor_INTEGER_nulls##2  55.32ms  18.08
array_constructor_INTEGER_nulls##3  78.53ms  12.73
array_constructor_INTEGER_nulls##2_null  72.24ms  13.84
array_constructor_INTEGER_nulls##2_const  71.13ms  14.06

After:

array_constructor_INTEGER_nullfree##1  3.49ms  286.59
array_constructor_INTEGER_nullfree##2  7.91ms  126.46
array_constructor_INTEGER_nullfree##3  11.99ms  83.41
array_constructor_INTEGER_nullfree##2_null  12.57ms  79.55
array_constructor_INTEGER_nullfree##2_const  11.03ms  90.67
array_constructor_INTEGER_nulls##1  4.37ms  228.97
array_constructor_INTEGER_nulls##2  9.99ms  100.14
array_constructor_INTEGER_nulls##3  14.79ms  67.60
array_constructor_INTEGER_nulls##2_null  12.21ms  81.92
array_constructor_INTEGER_nulls##2_const  12.64ms  79.12

Before:

array_constructor_MAP_nullfree##1  17.34ms  57.65
array_constructor_MAP_nullfree##2  29.84ms  33.51
array_constructor_MAP_nullfree##3  41.51ms  24.09
array_constructor_MAP_nullfree##2_null  56.57ms  17.68
array_constructor_MAP_nullfree##2_const  71.68ms  13.95
array_constructor_MAP_nulls##1  36.22ms  27.61
array_constructor_MAP_nulls##2  68.18ms  14.67
array_constructor_MAP_nulls##3  95.12ms  10.51
array_constructor_MAP_nulls##2_null  86.42ms  11.57
array_constructor_MAP_nulls##2_const  120.10ms  8.33

After:

array_constructor_MAP_nullfree##1  17.38ms  57.53
array_constructor_MAP_nullfree##2  29.41ms  34.00
array_constructor_MAP_nullfree##3  38.30ms  26.11
array_constructor_MAP_nullfree##2_null  58.52ms  17.09
array_constructor_MAP_nullfree##2_const  48.62ms  20.57
array_constructor_MAP_nulls##1  30.60ms  32.68
array_constructor_MAP_nulls##2  53.94ms  18.54
array_constructor_MAP_nulls##3  86.48ms  11.56
array_constructor_MAP_nulls##2_null  69.53ms  14.38
array_constructor_MAP_nulls##2_const  87.56ms  11.42

Before:

array_constructor_ROW_nullfree##1  33.88ms  29.52
array_constructor_ROW_nullfree##2  62.00ms  16.13
array_constructor_ROW_nullfree##3  89.54ms  11.17
array_constructor_ROW_nullfree##2_null  78.46ms  12.75
array_constructor_ROW_nullfree##2_const  95.53ms  10.47
array_constructor_ROW_nulls##1  44.11ms  22.67
array_constructor_ROW_nulls##2  115.43ms  8.66
array_constructor_ROW_nulls##3  173.61ms  5.76
array_constructor_ROW_nulls##2_null  130.40ms  7.67
array_constructor_ROW_nulls##2_const  169.97ms  5.88

After:

array_constructor_ROW_nullfree##1  5.64ms  177.44
array_constructor_ROW_nullfree##2  14.40ms  69.44
array_constructor_ROW_nullfree##3  21.46ms  46.59
array_constructor_ROW_nullfree##2_null  19.14ms  52.26
array_constructor_ROW_nullfree##2_const  18.60ms  53.77
array_constructor_ROW_nulls##1  10.97ms  91.18
array_constructor_ROW_nulls##2  18.29ms  54.67
array_constructor_ROW_nulls##3  28.57ms  35.01
array_constructor_ROW_nulls##2_null  25.10ms  39.84
array_constructor_ROW_nulls##2_const  24.55ms  40.74
```

Differential Revision: D49269500

fbshipit-source-id: 1f5ff279be27a55f72930457991cefec0d39fcf4
mbasmanova
added a commit
to mbasmanova/velox-1
that referenced
this pull request
Sep 15, 2023
Summary: Pull Request resolved: facebookincubator#6566

FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating the slowness of array_constructor, which created 1-row ranges: facebookincubator#5958 (comment)

```
FlatVector.h

void copyRanges(
    const BaseVector* source,
    const folly::Range<const BaseVector::CopyRange*>& ranges) override {
  for (auto& range : ranges) {
    copy(source, range.targetIndex, range.sourceIndex, range.count);
  }
}
```

This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count), which copies one range. It also replaces the latter with a call to copyRanges, so the overall amount of code is about the same.

An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although these do not create 1-row ranges. Still, the array_constructor benchmark shows some wins for arrays and maps.

```
Before:

array_constructor_ARRAY_nullfree##1  15.80ms  63.30
array_constructor_ARRAY_nullfree##2  25.59ms  39.08
array_constructor_ARRAY_nullfree##3  34.49ms  28.99
array_constructor_ARRAY_nullfree##2_null  58.96ms  16.96
array_constructor_ARRAY_nullfree##2_const  54.31ms  18.41
array_constructor_ARRAY_nulls##1  33.50ms  29.85
array_constructor_ARRAY_nulls##2  59.05ms  16.93
array_constructor_ARRAY_nulls##3  88.36ms  11.32
array_constructor_ARRAY_nulls##2_null  74.53ms  13.42
array_constructor_ARRAY_nulls##2_const  102.54ms  9.75

After:

array_constructor_ARRAY_nullfree##1  15.53ms  64.39
array_constructor_ARRAY_nullfree##2  25.75ms  38.84
array_constructor_ARRAY_nullfree##3  34.37ms  29.10
array_constructor_ARRAY_nullfree##2_null  59.63ms  16.77
array_constructor_ARRAY_nullfree##2_const  41.36ms  24.18
array_constructor_ARRAY_nulls##1  30.51ms  32.78
array_constructor_ARRAY_nulls##2  55.13ms  18.14
array_constructor_ARRAY_nulls##3  77.93ms  12.83
array_constructor_ARRAY_nulls##2_null  68.84ms  14.53
array_constructor_ARRAY_nulls##2_const  83.91ms  11.92

Before:

array_constructor_MAP_nullfree##1  16.89ms  59.20
array_constructor_MAP_nullfree##2  29.24ms  34.20
array_constructor_MAP_nullfree##3  41.11ms  24.33
array_constructor_MAP_nullfree##2_null  61.98ms  16.14
array_constructor_MAP_nullfree##2_const  67.44ms  14.83
array_constructor_MAP_nulls##1  37.00ms  27.03
array_constructor_MAP_nulls##2  67.76ms  14.76
array_constructor_MAP_nulls##3  100.88ms  9.91
array_constructor_MAP_nulls##2_null  84.22ms  11.87
array_constructor_MAP_nulls##2_const  122.55ms  8.16

After:

array_constructor_MAP_nullfree##1  17.07ms  58.57
array_constructor_MAP_nullfree##2  28.39ms  35.23
array_constructor_MAP_nullfree##3  42.34ms  23.62
array_constructor_MAP_nullfree##2_null  65.24ms  15.33
array_constructor_MAP_nullfree##2_const  49.94ms  20.03
array_constructor_MAP_nulls##1  34.34ms  29.12
array_constructor_MAP_nulls##2  55.23ms  18.11
array_constructor_MAP_nulls##3  82.64ms  12.10
array_constructor_MAP_nulls##2_null  70.74ms  14.14
array_constructor_MAP_nulls##2_const  88.13ms  11.35
```

Differential Revision: D49269500

fbshipit-source-id: 5b58c3942224962691e9b7f114f5134513328b5d
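The restructuring this commit message describes (move the single-range copy logic into copyRanges, then have the single-range copy delegate to it) can be sketched on a toy vector. This is a simplified illustration, not the Velox implementation: `SimpleFlatVector` and its `CopyRange` are hypothetical stand-ins that hold plain `int64_t` values and ignore nulls, type checks, and resizing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical range descriptor; field names follow the snippet quoted in
// the commit message (sourceIndex, targetIndex, count).
struct CopyRange {
  size_t sourceIndex;
  size_t targetIndex;
  size_t count;
};

// Toy stand-in for a flat vector of int64_t values without nulls.
struct SimpleFlatVector {
  std::vector<int64_t> values;

  // Handles all ranges in one loop with the copy logic inlined, so any
  // per-call setup is done once instead of once per range.
  void copyRanges(
      const SimpleFlatVector& source,
      const std::vector<CopyRange>& ranges) {
    const int64_t* src = source.values.data();
    int64_t* dst = values.data();
    for (const auto& range : ranges) {
      if (range.count == 1) {
        // Fast path for the 1-row ranges mentioned in the commit message.
        dst[range.targetIndex] = src[range.sourceIndex];
      } else {
        std::memmove(
            dst + range.targetIndex,
            src + range.sourceIndex,
            range.count * sizeof(int64_t));
      }
    }
  }

  // Single-range copy expressed as a one-element call to copyRanges.
  void copy(
      const SimpleFlatVector& source,
      size_t targetIndex,
      size_t sourceIndex,
      size_t count) {
    copyRanges(source, {CopyRange{sourceIndex, targetIndex, count}});
  }
};
```

For example, with a source holding `{10, 20, 30, 40}` and a zero-filled target of the same size, `copyRanges(src, {{0, 3, 1}, {2, 0, 2}})` leaves the target as `{30, 40, 0, 10}`. The caller is responsible for keeping every range within bounds.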
facebook-github-bot
pushed a commit
that referenced
this pull request
Sep 15, 2023
Summary: Pull Request resolved: #6568

array_constructor is very slow: #5958 (comment)

array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types:

```
FlatVector.h

void copyRanges(
    const BaseVector* source,
    const folly::Range<const BaseVector::CopyRange*>& ranges) override {
  for (auto& range : ranges) {
    copy(source, range.targetIndex, range.sourceIndex, range.count);
  }
}
```

FlatVector<T>::copy(source, rows, toSourceRow) is faster.

Switching from copyRanges to copy speeds up array_constructor significantly for primitive types and structs. However, this change makes arrays and maps slower. The slowness is due to ArrayVector and MapVector not having an implementation of copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes a performance regression.

Hence, we use copy for primitive types and structs of these, and copyRanges for everything else.

```
Before:

array_constructor_ARRAY_nullfree##1  16.80ms  59.53
array_constructor_ARRAY_nullfree##2  27.02ms  37.01
array_constructor_ARRAY_nullfree##3  38.03ms  26.30
array_constructor_ARRAY_nullfree##2_null  52.86ms  18.92
array_constructor_ARRAY_nullfree##2_const  54.97ms  18.19
array_constructor_ARRAY_nulls##1  30.61ms  32.66
array_constructor_ARRAY_nulls##2  55.01ms  18.18
array_constructor_ARRAY_nulls##3  80.69ms  12.39
array_constructor_ARRAY_nulls##2_null  69.10ms  14.47
array_constructor_ARRAY_nulls##2_const  103.85ms  9.63

After:

array_constructor_ARRAY_nullfree##1  15.25ms  65.58
array_constructor_ARRAY_nullfree##2  25.11ms  39.82
array_constructor_ARRAY_nullfree##3  34.59ms  28.91
array_constructor_ARRAY_nullfree##2_null  53.61ms  18.65
array_constructor_ARRAY_nullfree##2_const  51.48ms  19.42
array_constructor_ARRAY_nulls##1  29.99ms  33.34
array_constructor_ARRAY_nulls##2  55.91ms  17.89
array_constructor_ARRAY_nulls##3  81.73ms  12.24
array_constructor_ARRAY_nulls##2_null  66.97ms  14.93
array_constructor_ARRAY_nulls##2_const  92.96ms  10.76

Before:

array_constructor_INTEGER_nullfree##1  19.72ms  50.71
array_constructor_INTEGER_nullfree##2  34.51ms  28.97
array_constructor_INTEGER_nullfree##3  47.95ms  20.86
array_constructor_INTEGER_nullfree##2_null  58.68ms  17.04
array_constructor_INTEGER_nullfree##2_const  45.15ms  22.15
array_constructor_INTEGER_nulls##1  29.99ms  33.34
array_constructor_INTEGER_nulls##2  55.32ms  18.08
array_constructor_INTEGER_nulls##3  78.53ms  12.73
array_constructor_INTEGER_nulls##2_null  72.24ms  13.84
array_constructor_INTEGER_nulls##2_const  71.13ms  14.06

After:

array_constructor_INTEGER_nullfree##1  3.39ms  294.89
array_constructor_INTEGER_nullfree##2  7.35ms  136.10
array_constructor_INTEGER_nullfree##3  10.78ms  92.74
array_constructor_INTEGER_nullfree##2_null  11.29ms  88.57
array_constructor_INTEGER_nullfree##2_const  10.14ms  98.65
array_constructor_INTEGER_nulls##1  4.49ms  222.53
array_constructor_INTEGER_nulls##2  9.78ms  102.29
array_constructor_INTEGER_nulls##3  14.69ms  68.08
array_constructor_INTEGER_nulls##2_null  12.14ms  82.36
array_constructor_INTEGER_nulls##2_const  12.27ms  81.53

Before:

array_constructor_MAP_nullfree##1  17.34ms  57.65
array_constructor_MAP_nullfree##2  29.84ms  33.51
array_constructor_MAP_nullfree##3  41.51ms  24.09
array_constructor_MAP_nullfree##2_null  56.57ms  17.68
array_constructor_MAP_nullfree##2_const  71.68ms  13.95
array_constructor_MAP_nulls##1  36.22ms  27.61
array_constructor_MAP_nulls##2  68.18ms  14.67
array_constructor_MAP_nulls##3  95.12ms  10.51
array_constructor_MAP_nulls##2_null  86.42ms  11.57
array_constructor_MAP_nulls##2_const  120.10ms  8.33

After:

array_constructor_MAP_nullfree##1  17.05ms  58.66
array_constructor_MAP_nullfree##2  28.42ms  35.18
array_constructor_MAP_nullfree##3  36.96ms  27.06
array_constructor_MAP_nullfree##2_null  55.64ms  17.97
array_constructor_MAP_nullfree##2_const  67.53ms  14.81
array_constructor_MAP_nulls##1  32.91ms  30.39
array_constructor_MAP_nulls##2  64.50ms  15.50
array_constructor_MAP_nulls##3  95.71ms  10.45
array_constructor_MAP_nulls##2_null  77.22ms  12.95
array_constructor_MAP_nulls##2_const  114.91ms  8.70

Before:

array_constructor_ROW_nullfree##1  33.88ms  29.52
array_constructor_ROW_nullfree##2  62.00ms  16.13
array_constructor_ROW_nullfree##3  89.54ms  11.17
array_constructor_ROW_nullfree##2_null  78.46ms  12.75
array_constructor_ROW_nullfree##2_const  95.53ms  10.47
array_constructor_ROW_nulls##1  44.11ms  22.67
array_constructor_ROW_nulls##2  115.43ms  8.66
array_constructor_ROW_nulls##3  173.61ms  5.76
array_constructor_ROW_nulls##2_null  130.40ms  7.67
array_constructor_ROW_nulls##2_const  169.97ms  5.88

After:

array_constructor_ROW_nullfree##1  5.55ms  180.15
array_constructor_ROW_nullfree##2  12.83ms  77.94
array_constructor_ROW_nullfree##3  18.89ms  52.95
array_constructor_ROW_nullfree##2_null  18.74ms  53.36
array_constructor_ROW_nullfree##2_const  18.16ms  55.07
array_constructor_ROW_nulls##1  11.29ms  88.61
array_constructor_ROW_nulls##2  18.57ms  53.86
array_constructor_ROW_nulls##3  34.20ms  29.24
array_constructor_ROW_nulls##2_null  25.05ms  39.92
array_constructor_ROW_nulls##2_const  25.15ms  39.77
```

Reviewed By: laithsakka

Differential Revision: D49272797

fbshipit-source-id: 55d83de7b69c7ae4b72b5a5ae62a7868f36b0e19
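The rule stated in this commit message (copy for primitive types and structs of these, copyRanges for everything else) amounts to a small type-based dispatch. The sketch below shows one way such a check could look. It does not use Velox's type system: `Kind`, `Type`, `isPrimitive`, and `shouldUseCopy` are hypothetical, and treating only ROWs whose direct children are all primitive as "structs of these" is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, much reduced type model for illustration only.
enum class Kind { kInteger, kVarchar, kRow, kArray, kMap };

struct Type {
  Kind kind;
  std::vector<Type> children;
};

bool isPrimitive(const Type& type) {
  return type.kind == Kind::kInteger || type.kind == Kind::kVarchar;
}

// Returns true when the row-wise copy path should be used, false when
// the range-based copyRanges path should be used.
bool shouldUseCopy(const Type& type) {
  if (isPrimitive(type)) {
    return true;
  }
  if (type.kind == Kind::kRow) {
    // Assumption: a struct qualifies only if every field is primitive.
    return std::all_of(
        type.children.begin(), type.children.end(), isPrimitive);
  }
  // Arrays, maps, and structs containing them keep using copyRanges.
  return false;
}
```

Under these assumptions an INTEGER element type and a ROW of two INTEGER fields take the copy path, while ARRAY and MAP element types stay on copyRanges, which matches the groups where the benchmark above shows large versus small changes.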
mbasmanova
added a commit
to mbasmanova/velox-1
that referenced
this pull request
Sep 16, 2023
Summary: FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating the slowness of array_constructor, which created 1-row ranges: facebookincubator#5958 (comment)

```
FlatVector.h

void copyRanges(
    const BaseVector* source,
    const folly::Range<const BaseVector::CopyRange*>& ranges) override {
  for (auto& range : ranges) {
    copy(source, range.targetIndex, range.sourceIndex, range.count);
  }
}
```

This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count), which copies one range. It also replaces the latter with a call to copyRanges, so the overall amount of code is about the same.

An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although these do not create 1-row ranges. Still, the array_constructor benchmark shows some wins for arrays and maps.

```
Before:

array_constructor_ARRAY_nullfree##2_const  54.31ms  18.41
array_constructor_ARRAY_nulls##1  33.50ms  29.85
array_constructor_ARRAY_nulls##2  59.05ms  16.93
array_constructor_ARRAY_nulls##3  88.36ms  11.32
array_constructor_ARRAY_nulls##2_null  74.53ms  13.42
array_constructor_ARRAY_nulls##2_const  102.54ms  9.75

After:

array_constructor_ARRAY_nullfree##2_const  41.36ms  24.18
array_constructor_ARRAY_nulls##1  30.51ms  32.78
array_constructor_ARRAY_nulls##2  55.13ms  18.14
array_constructor_ARRAY_nulls##3  77.93ms  12.83
array_constructor_ARRAY_nulls##2_null  68.84ms  14.53
array_constructor_ARRAY_nulls##2_const  83.91ms  11.92

Before:

array_constructor_MAP_nullfree##2_const  67.44ms  14.83
array_constructor_MAP_nulls##1  37.00ms  27.03
array_constructor_MAP_nulls##2  67.76ms  14.76
array_constructor_MAP_nulls##3  100.88ms  9.91
array_constructor_MAP_nulls##2_null  84.22ms  11.87
array_constructor_MAP_nulls##2_const  122.55ms  8.16

After:

array_constructor_MAP_nullfree##2_const  49.94ms  20.03
array_constructor_MAP_nulls##1  34.34ms  29.12
array_constructor_MAP_nulls##2  55.23ms  18.11
array_constructor_MAP_nulls##3  82.64ms  12.10
array_constructor_MAP_nulls##2_null  70.74ms  14.14
array_constructor_MAP_nulls##2_const  88.13ms  11.35
```

Differential Revision: D49269500
mbasmanova
added a commit
to mbasmanova/velox-1
that referenced
this pull request
Sep 16, 2023
mbasmanova
added a commit
to mbasmanova/velox-1
that referenced
this pull request
Sep 18, 2023
mbasmanova
added a commit
to mbasmanova/velox-1
that referenced
this pull request
Sep 18, 2023
mbasmanova
added a commit
to mbasmanova/velox-1
that referenced
this pull request
Sep 18, 2023
Summary: Pull Request resolved: facebookincubator#6566
Differential Revision: D49269500
fbshipit-source-id: 8fc304cce31f782aef818e6d8ce052360af8e832
mbasmanova
added a commit
to mbasmanova/velox-1
that referenced
this pull request
Sep 18, 2023
Summary: Pull Request resolved: facebookincubator#6566
Differential Revision: D49269500
fbshipit-source-id: 054dcec94cc3768443440ffb57d58c96f9b1f006
mbasmanova
added a commit
to mbasmanova/velox-1
that referenced
this pull request
Sep 18, 2023
Summary: Pull Request resolved: facebookincubator#6566
Reviewed By: laithsakka
Differential Revision: D49269500
fbshipit-source-id: 2fdf6f2a717a3bc4bf54fb3d9bd970157a8c415b
mbasmanova
added a commit
to mbasmanova/velox-1
that referenced
this pull request
Sep 18, 2023
Summary: Pull Request resolved: facebookincubator#6566
Reviewed By: laithsakka
Differential Revision: D49269500
fbshipit-source-id: b7271a981d33885b2d4bc3219bc15f069bcd06d3
facebook-github-bot
pushed a commit
that referenced
this pull request
Sep 19, 2023
Summary: Pull Request resolved: #6566
Reviewed By: laithsakka
Differential Revision: D49269500
fbshipit-source-id: 7b702921202f4bb8d10a252eb0ab20f0e5792ae6
codyschierbeck
pushed a commit
to codyschierbeck/velox
that referenced
this pull request
Sep 27, 2023
Summary: Pull Request resolved: facebookincubator#6568

array_constructor is very slow: facebookincubator#5958 (comment)

array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types:

```
FlatVector.h

void copyRanges(
    const BaseVector* source,
    const folly::Range<const BaseVector::CopyRange*>& ranges) override {
  for (auto& range : ranges) {
    copy(source, range.targetIndex, range.sourceIndex, range.count);
  }
}
```

FlatVector<T>::copy(source, rows, toSourceRow) is faster. Switching from copyRanges to copy speeds up array_constructor significantly for primitive types and structs. However, the switch makes arrays and maps slower, because ArrayVector and MapVector do not implement copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges, and this extra processing causes a performance regression. Hence, we use copy for primitive types and structs of these, and copyRanges for everything else.

```
Before:

array_constructor_ARRAY_nullfree##1            16.80ms    59.53
array_constructor_ARRAY_nullfree##2            27.02ms    37.01
array_constructor_ARRAY_nullfree##3            38.03ms    26.30
array_constructor_ARRAY_nullfree##2_null       52.86ms    18.92
array_constructor_ARRAY_nullfree##2_const      54.97ms    18.19
array_constructor_ARRAY_nulls##1               30.61ms    32.66
array_constructor_ARRAY_nulls##2               55.01ms    18.18
array_constructor_ARRAY_nulls##3               80.69ms    12.39
array_constructor_ARRAY_nulls##2_null          69.10ms    14.47
array_constructor_ARRAY_nulls##2_const        103.85ms     9.63

After:

array_constructor_ARRAY_nullfree##1            15.25ms    65.58
array_constructor_ARRAY_nullfree##2            25.11ms    39.82
array_constructor_ARRAY_nullfree##3            34.59ms    28.91
array_constructor_ARRAY_nullfree##2_null       53.61ms    18.65
array_constructor_ARRAY_nullfree##2_const      51.48ms    19.42
array_constructor_ARRAY_nulls##1               29.99ms    33.34
array_constructor_ARRAY_nulls##2               55.91ms    17.89
array_constructor_ARRAY_nulls##3               81.73ms    12.24
array_constructor_ARRAY_nulls##2_null          66.97ms    14.93
array_constructor_ARRAY_nulls##2_const         92.96ms    10.76

Before:

array_constructor_INTEGER_nullfree##1          19.72ms    50.71
array_constructor_INTEGER_nullfree##2          34.51ms    28.97
array_constructor_INTEGER_nullfree##3          47.95ms    20.86
array_constructor_INTEGER_nullfree##2_null     58.68ms    17.04
array_constructor_INTEGER_nullfree##2_const    45.15ms    22.15
array_constructor_INTEGER_nulls##1             29.99ms    33.34
array_constructor_INTEGER_nulls##2             55.32ms    18.08
array_constructor_INTEGER_nulls##3             78.53ms    12.73
array_constructor_INTEGER_nulls##2_null        72.24ms    13.84
array_constructor_INTEGER_nulls##2_const       71.13ms    14.06

After:

array_constructor_INTEGER_nullfree##1           3.39ms   294.89
array_constructor_INTEGER_nullfree##2           7.35ms   136.10
array_constructor_INTEGER_nullfree##3          10.78ms    92.74
array_constructor_INTEGER_nullfree##2_null     11.29ms    88.57
array_constructor_INTEGER_nullfree##2_const    10.14ms    98.65
array_constructor_INTEGER_nulls##1              4.49ms   222.53
array_constructor_INTEGER_nulls##2              9.78ms   102.29
array_constructor_INTEGER_nulls##3             14.69ms    68.08
array_constructor_INTEGER_nulls##2_null        12.14ms    82.36
array_constructor_INTEGER_nulls##2_const       12.27ms    81.53

Before:

array_constructor_MAP_nullfree##1              17.34ms    57.65
array_constructor_MAP_nullfree##2              29.84ms    33.51
array_constructor_MAP_nullfree##3              41.51ms    24.09
array_constructor_MAP_nullfree##2_null         56.57ms    17.68
array_constructor_MAP_nullfree##2_const        71.68ms    13.95
array_constructor_MAP_nulls##1                 36.22ms    27.61
array_constructor_MAP_nulls##2                 68.18ms    14.67
array_constructor_MAP_nulls##3                 95.12ms    10.51
array_constructor_MAP_nulls##2_null            86.42ms    11.57
array_constructor_MAP_nulls##2_const          120.10ms     8.33

After:

array_constructor_MAP_nullfree##1              17.05ms    58.66
array_constructor_MAP_nullfree##2              28.42ms    35.18
array_constructor_MAP_nullfree##3              36.96ms    27.06
array_constructor_MAP_nullfree##2_null         55.64ms    17.97
array_constructor_MAP_nullfree##2_const        67.53ms    14.81
array_constructor_MAP_nulls##1                 32.91ms    30.39
array_constructor_MAP_nulls##2                 64.50ms    15.50
array_constructor_MAP_nulls##3                 95.71ms    10.45
array_constructor_MAP_nulls##2_null            77.22ms    12.95
array_constructor_MAP_nulls##2_const          114.91ms     8.70

Before:

array_constructor_ROW_nullfree##1              33.88ms    29.52
array_constructor_ROW_nullfree##2              62.00ms    16.13
array_constructor_ROW_nullfree##3              89.54ms    11.17
array_constructor_ROW_nullfree##2_null         78.46ms    12.75
array_constructor_ROW_nullfree##2_const        95.53ms    10.47
array_constructor_ROW_nulls##1                 44.11ms    22.67
array_constructor_ROW_nulls##2                115.43ms     8.66
array_constructor_ROW_nulls##3                173.61ms     5.76
array_constructor_ROW_nulls##2_null           130.40ms     7.67
array_constructor_ROW_nulls##2_const          169.97ms     5.88

After:

array_constructor_ROW_nullfree##1               5.55ms   180.15
array_constructor_ROW_nullfree##2              12.83ms    77.94
array_constructor_ROW_nullfree##3              18.89ms    52.95
array_constructor_ROW_nullfree##2_null         18.74ms    53.36
array_constructor_ROW_nullfree##2_const        18.16ms    55.07
array_constructor_ROW_nulls##1                 11.29ms    88.61
array_constructor_ROW_nulls##2                 18.57ms    53.86
array_constructor_ROW_nulls##3                 34.20ms    29.24
array_constructor_ROW_nulls##2_null            25.05ms    39.92
array_constructor_ROW_nulls##2_const           25.15ms    39.77
```

Reviewed By: laithsakka
Differential Revision: D49272797
fbshipit-source-id: 55d83de7b69c7ae4b72b5a5ae62a7868f36b0e19
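The selection rule in the summary above (row-wise copy for primitive types and structs of these, copyRanges for everything else) can be sketched as a small type-based dispatch. This is only an illustration under stated assumptions, not the code from the commit: `Kind`, `ToyType`, `CopyStrategy`, and `chooseStrategy` are hypothetical names, and the sketch assumes that "structs of these" means a row whose direct children are all primitive.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, much-reduced set of type kinds.
enum class Kind { kInteger, kVarchar, kArray, kMap, kRow };

// Hypothetical type descriptor: a kind plus child types for nested kinds.
struct ToyType {
  Kind kind;
  std::vector<ToyType> children;
};

enum class CopyStrategy {
  kCopyRows,  // stands for copy(source, rows, toSourceRow)
  kCopyRanges // stands for copyRanges(source, ranges)
};

bool isPrimitive(const ToyType& type) {
  return type.kind == Kind::kInteger || type.kind == Kind::kVarchar;
}

// Assumption: a struct qualifies only if every direct child is primitive.
bool isRowOfPrimitives(const ToyType& type) {
  if (type.kind != Kind::kRow) {
    return false;
  }
  for (const auto& child : type.children) {
    if (!isPrimitive(child)) {
      return false;
    }
  }
  return true;
}

// Row-wise copy for primitives and structs of primitives; ranges otherwise,
// since arrays and maps would only translate rows back into ranges.
CopyStrategy chooseStrategy(const ToyType& elementType) {
  if (isPrimitive(elementType) || isRowOfPrimitives(elementType)) {
    return CopyStrategy::kCopyRows;
  }
  return CopyStrategy::kCopyRanges;
}
```

For example, with these definitions an INTEGER element type or a ROW(INTEGER, VARCHAR) element type selects `kCopyRows`, while ARRAY(INTEGER) or a ROW containing an ARRAY selects `kCopyRanges`.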
codyschierbeck
pushed a commit
to codyschierbeck/velox
that referenced
this pull request
Sep 27, 2023
Summary: Pull Request resolved: facebookincubator#6566
Reviewed By: laithsakka
Differential Revision: D49269500
fbshipit-source-id: 7b702921202f4bb8d10a252eb0ab20f0e5792ae6
codyschierbeck
pushed a commit
to codyschierbeck/velox
that referenced
this pull request
Sep 27, 2023
Summary: Pull Request resolved: facebookincubator#6568

array_constructor is very slow: facebookincubator#5958 (comment)

array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types:

```
FlatVector.h

void copyRanges(
    const BaseVector* source,
    const folly::Range<const BaseVector::CopyRange*>& ranges) override {
  for (auto& range : ranges) {
    copy(source, range.targetIndex, range.sourceIndex, range.count);
  }
}
```

FlatVector<T>::copy(source, rows, toSourceRow) is faster. Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower. The slowness is due to ArrayVector and MapVector not having implementation for copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes perf regression. Hence, we use copy for primitive types and structs of these and copyRanges for everything else.

```
Before:
array_constructor_ARRAY_nullfree##1          16.80ms   59.53
array_constructor_ARRAY_nullfree##2          27.02ms   37.01
array_constructor_ARRAY_nullfree##3          38.03ms   26.30
array_constructor_ARRAY_nullfree##2_null     52.86ms   18.92
array_constructor_ARRAY_nullfree##2_const    54.97ms   18.19
array_constructor_ARRAY_nulls##1             30.61ms   32.66
array_constructor_ARRAY_nulls##2             55.01ms   18.18
array_constructor_ARRAY_nulls##3             80.69ms   12.39
array_constructor_ARRAY_nulls##2_null        69.10ms   14.47
array_constructor_ARRAY_nulls##2_const      103.85ms    9.63

After:
array_constructor_ARRAY_nullfree##1          15.25ms   65.58
array_constructor_ARRAY_nullfree##2          25.11ms   39.82
array_constructor_ARRAY_nullfree##3          34.59ms   28.91
array_constructor_ARRAY_nullfree##2_null     53.61ms   18.65
array_constructor_ARRAY_nullfree##2_const    51.48ms   19.42
array_constructor_ARRAY_nulls##1             29.99ms   33.34
array_constructor_ARRAY_nulls##2             55.91ms   17.89
array_constructor_ARRAY_nulls##3             81.73ms   12.24
array_constructor_ARRAY_nulls##2_null        66.97ms   14.93
array_constructor_ARRAY_nulls##2_const       92.96ms   10.76

Before:
array_constructor_INTEGER_nullfree##1        19.72ms   50.71
array_constructor_INTEGER_nullfree##2        34.51ms   28.97
array_constructor_INTEGER_nullfree##3        47.95ms   20.86
array_constructor_INTEGER_nullfree##2_null   58.68ms   17.04
array_constructor_INTEGER_nullfree##2_const  45.15ms   22.15
array_constructor_INTEGER_nulls##1           29.99ms   33.34
array_constructor_INTEGER_nulls##2           55.32ms   18.08
array_constructor_INTEGER_nulls##3           78.53ms   12.73
array_constructor_INTEGER_nulls##2_null      72.24ms   13.84
array_constructor_INTEGER_nulls##2_const     71.13ms   14.06

After:
array_constructor_INTEGER_nullfree##1         3.39ms  294.89
array_constructor_INTEGER_nullfree##2         7.35ms  136.10
array_constructor_INTEGER_nullfree##3        10.78ms   92.74
array_constructor_INTEGER_nullfree##2_null   11.29ms   88.57
array_constructor_INTEGER_nullfree##2_const  10.14ms   98.65
array_constructor_INTEGER_nulls##1            4.49ms  222.53
array_constructor_INTEGER_nulls##2            9.78ms  102.29
array_constructor_INTEGER_nulls##3           14.69ms   68.08
array_constructor_INTEGER_nulls##2_null      12.14ms   82.36
array_constructor_INTEGER_nulls##2_const     12.27ms   81.53

Before:
array_constructor_MAP_nullfree##1            17.34ms   57.65
array_constructor_MAP_nullfree##2            29.84ms   33.51
array_constructor_MAP_nullfree##3            41.51ms   24.09
array_constructor_MAP_nullfree##2_null       56.57ms   17.68
array_constructor_MAP_nullfree##2_const      71.68ms   13.95
array_constructor_MAP_nulls##1               36.22ms   27.61
array_constructor_MAP_nulls##2               68.18ms   14.67
array_constructor_MAP_nulls##3               95.12ms   10.51
array_constructor_MAP_nulls##2_null          86.42ms   11.57
array_constructor_MAP_nulls##2_const        120.10ms    8.33

After:
array_constructor_MAP_nullfree##1            17.05ms   58.66
array_constructor_MAP_nullfree##2            28.42ms   35.18
array_constructor_MAP_nullfree##3            36.96ms   27.06
array_constructor_MAP_nullfree##2_null       55.64ms   17.97
array_constructor_MAP_nullfree##2_const      67.53ms   14.81
array_constructor_MAP_nulls##1               32.91ms   30.39
array_constructor_MAP_nulls##2               64.50ms   15.50
array_constructor_MAP_nulls##3               95.71ms   10.45
array_constructor_MAP_nulls##2_null          77.22ms   12.95
array_constructor_MAP_nulls##2_const        114.91ms    8.70

Before:
array_constructor_ROW_nullfree##1            33.88ms   29.52
array_constructor_ROW_nullfree##2            62.00ms   16.13
array_constructor_ROW_nullfree##3            89.54ms   11.17
array_constructor_ROW_nullfree##2_null       78.46ms   12.75
array_constructor_ROW_nullfree##2_const      95.53ms   10.47
array_constructor_ROW_nulls##1               44.11ms   22.67
array_constructor_ROW_nulls##2              115.43ms    8.66
array_constructor_ROW_nulls##3              173.61ms    5.76
array_constructor_ROW_nulls##2_null         130.40ms    7.67
array_constructor_ROW_nulls##2_const        169.97ms    5.88

After:
array_constructor_ROW_nullfree##1             5.55ms  180.15
array_constructor_ROW_nullfree##2            12.83ms   77.94
array_constructor_ROW_nullfree##3            18.89ms   52.95
array_constructor_ROW_nullfree##2_null       18.74ms   53.36
array_constructor_ROW_nullfree##2_const      18.16ms   55.07
array_constructor_ROW_nulls##1               11.29ms   88.61
array_constructor_ROW_nulls##2               18.57ms   53.86
array_constructor_ROW_nulls##3               34.20ms   29.24
array_constructor_ROW_nulls##2_null          25.05ms   39.92
array_constructor_ROW_nulls##2_const         25.15ms   39.77
```

Reviewed By: laithsakka
Differential Revision: D49272797
fbshipit-source-id: 55d83de7b69c7ae4b72b5a5ae62a7868f36b0e19
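The choice described in this commit message (row-mapped copy for primitive types and structs of these, copyRanges for everything else) can be sketched as a small type check. This is only an illustration under stated assumptions: `ToyType`, `Kind`, and `useRowMappedCopy` are hypothetical names rather than Velox APIs, and treating nested structs recursively is an assumption, since the message does not say how deep "structs of these" goes.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified type tree; this is not Velox's Type class.
enum class Kind { kInteger, kVarchar, kArray, kMap, kRow };

struct ToyType {
  Kind kind;
  std::vector<ToyType> children;
};

// Returns true when the row-mapped copy(source, rows, toSourceRow) path
// would be chosen, and false when copyRanges would be chosen.
// Assumption: a struct qualifies only if every child qualifies.
bool useRowMappedCopy(const ToyType& type) {
  switch (type.kind) {
    case Kind::kArray:
    case Kind::kMap:
      // Arrays and maps stay on copyRanges, per the commit message.
      return false;
    case Kind::kRow:
      return std::all_of(
          type.children.begin(),
          type.children.end(),
          [](const ToyType& child) { return useRowMappedCopy(child); });
    default:
      // Primitive types use the row-mapped copy.
      return true;
  }
}
```

Under this sketch, an INTEGER or a ROW of INTEGERs takes the row-mapped path, while an ARRAY, a MAP, or a ROW containing either one falls back to copyRanges.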
codyschierbeck
pushed a commit
to codyschierbeck/velox
that referenced
this pull request
Sep 27, 2023
Summary: Pull Request resolved: facebookincubator#6566

FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor, which created 1-row ranges: facebookincubator#5958 (comment)

```
FlatVector.h

void copyRanges(
    const BaseVector* source,
    const folly::Range<const BaseVector::CopyRange*>& ranges) override {
  for (auto& range : ranges) {
    copy(source, range.targetIndex, range.sourceIndex, range.count);
  }
}
```

This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count), which copies one range. This change also replaces the latter call with a call to copyRanges; hence, the overall amount of code is about the same.

An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although these do not create 1-row ranges. Still, the array_constructor benchmark shows some wins for arrays and maps.

```
Before:
array_constructor_ARRAY_nullfree##2_const    54.31ms   18.41
array_constructor_ARRAY_nulls##1             33.50ms   29.85
array_constructor_ARRAY_nulls##2             59.05ms   16.93
array_constructor_ARRAY_nulls##3             88.36ms   11.32
array_constructor_ARRAY_nulls##2_null        74.53ms   13.42
array_constructor_ARRAY_nulls##2_const      102.54ms    9.75

After:
array_constructor_ARRAY_nullfree##2_const    41.36ms   24.18
array_constructor_ARRAY_nulls##1             30.51ms   32.78
array_constructor_ARRAY_nulls##2             55.13ms   18.14
array_constructor_ARRAY_nulls##3             77.93ms   12.83
array_constructor_ARRAY_nulls##2_null        68.84ms   14.53
array_constructor_ARRAY_nulls##2_const       83.91ms   11.92

Before:
array_constructor_MAP_nullfree##2_const      67.44ms   14.83
array_constructor_MAP_nulls##1               37.00ms   27.03
array_constructor_MAP_nulls##2               67.76ms   14.76
array_constructor_MAP_nulls##3              100.88ms    9.91
array_constructor_MAP_nulls##2_null          84.22ms   11.87
array_constructor_MAP_nulls##2_const        122.55ms    8.16

After:
array_constructor_MAP_nullfree##2_const      49.94ms   20.03
array_constructor_MAP_nulls##1               34.34ms   29.12
array_constructor_MAP_nulls##2               55.23ms   18.11
array_constructor_MAP_nulls##3               82.64ms   12.10
array_constructor_MAP_nulls##2_null          70.74ms   14.14
array_constructor_MAP_nulls##2_const         88.13ms   11.35
```

Reviewed By: laithsakka
Differential Revision: D49269500
fbshipit-source-id: 7b702921202f4bb8d10a252eb0ab20f0e5792ae6
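The shape of this refactoring (copyRanges does the work in one pass, and the single-range copy becomes a thin wrapper over it) can be sketched with a toy vector. Everything here is hypothetical: `ToyFlatVector`, this `CopyRange` struct, and `interleave` are stand-ins rather than Velox code, nulls are ignored, and the only per-call setup shown is resolving raw pointers. The commit message does not spell out the actual per-call costs in Velox.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for BaseVector::CopyRange.
struct CopyRange {
  std::size_t sourceIndex;
  std::size_t targetIndex;
  std::size_t count;
};

// Toy flat vector of 32-bit integers without null tracking.
struct ToyFlatVector {
  std::vector<std::int32_t> values;

  // Batched entry point: per-call setup happens once, then each range
  // is copied in a tight loop.
  void copyRanges(
      const ToyFlatVector& source,
      const std::vector<CopyRange>& ranges) {
    const std::int32_t* src = source.values.data();
    std::int32_t* dst = values.data();
    for (const auto& range : ranges) {
      std::copy_n(src + range.sourceIndex, range.count, dst + range.targetIndex);
    }
  }

  // The single-range copy delegates to copyRanges instead of the reverse.
  void copy(
      const ToyFlatVector& source,
      std::size_t targetIndex,
      std::size_t sourceIndex,
      std::size_t count) {
    copyRanges(source, {CopyRange{sourceIndex, targetIndex, count}});
  }
};

// Builds the flattened elements of a two-argument array constructor,
// where row i becomes [a[i], b[i]], using one 1-row range per element.
// Assumes a and b have the same number of rows.
ToyFlatVector interleave(const ToyFlatVector& a, const ToyFlatVector& b) {
  const std::size_t numRows = a.values.size();
  ToyFlatVector out;
  out.values.resize(2 * numRows);
  std::vector<CopyRange> fromA;
  std::vector<CopyRange> fromB;
  for (std::size_t i = 0; i < numRows; ++i) {
    fromA.push_back({i, 2 * i, 1});
    fromB.push_back({i, 2 * i + 1, 1});
  }
  out.copyRanges(a, fromA);
  out.copyRanges(b, fromB);
  return out;
}
```

With many 1-row ranges, as in `interleave`, the batched form pays any setup cost once per call rather than once per range, which is the pattern the commit message describes.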
ericyuliu
pushed a commit
to ericyuliu/velox
that referenced
this pull request
Oct 12, 2023
laithsakka
added a commit
to laithsakka/velox
that referenced
this pull request
Oct 12, 2023
Summary:

```
============================================================================
[...]hmarks/ExpressionBenchmarkBuilder.cpp     relative  time/iter   iters/s
============================================================================
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>##1                 71.48ms     13.99
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>##2                 76.58ms     13.06
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>##3                 85.31ms     11.72
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>##4                121.56ms      8.23
map_subscript_MAP<INTEGER,INTEGER>##1                        27.19ms     36.78
map_subscript_MAP<INTEGER,INTEGER>##2                        33.10ms     30.21
map_subscript_MAP<INTEGER,INTEGER>##3                        33.47ms     29.88
map_subscript_MAP<INTEGER,INTEGER>##4                        31.70ms     31.55
map_subscript_MAP<VARCHAR,INTEGER>##1                        26.92ms     37.14
map_subscript_MAP<VARCHAR,INTEGER>##2                        36.62ms     27.31
map_subscript_MAP<VARCHAR,INTEGER>##3                        34.19ms     29.24
map_subscript_MAP<VARCHAR,INTEGER>##4                        33.76ms     29.62
```

Differential Revision: D50237919
TatianaJin
pushed a commit
to TatianaJin/velox
that referenced
this pull request
Nov 13, 2023
facebook-github-bot
pushed a commit
that referenced
this pull request
Aug 23, 2024
Summary: The default algorithm used is MD5. However, MD5 is not supported with FIPS and can cause a SIGSEGV. Set CRC32 instead, which is a standard for checksum computation and is not restricted by FIPS. CRC32 is also faster than MD5.

Internally at IBM, we hit the following SIGSEGV:

```
0x0000000000000000 in ?? ()
Missing separate debuginfos, use: dnf debuginfo-install openssl-fips-provider-3.0.7-2.el9.x86_64 xz-libs-5.2.5-8.el9_0.x86_64
(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x0000000004e5f89b in Aws::Utils::Crypto::MD5OpenSSLImpl::Calculate(std::istream&) ()
#2 0x0000000004efd298 in Aws::Utils::Crypto::MD5::Calculate(std::istream&) ()
#3 0x0000000004ef71b9 in Aws::Utils::HashingUtils::CalculateMD5(std::iostream&) ()
#4 0x0000000004e8ebe8 in Aws::Client::AWSClient::AddChecksumToRequest(std::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&) const ()
#5 0x0000000004e8ed15 in Aws::Client::AWSClient::BuildHttpRequest(Aws::AmazonWebServiceRequest const&, std::shared_ptr<Aws::Http::HttpRequest> const&) const ()
#6 0x0000000004e977f9 in Aws::Client::AWSClient::AttemptOneRequest(std::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&, char const*, char const*, char const*) const ()
#7 0x0000000004e9e1c0 in Aws::Client::AWSClient::AttemptExhaustively(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const ()
#8 0x0000000004ea15e8 in Aws::Client::AWSXMLClient::MakeRequest(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const ()
#9 0x0000000004ea1f70 in Aws::Client::AWSXMLClient::MakeRequest(Aws::AmazonWebServiceRequest const&, Aws::Endpoint::AWSEndpoint const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const ()
#10 0x0000000004de0933 in Aws::S3::S3Client::UploadPart(Aws::S3::Model::UploadPartRequest const&) const::{lambda()#1}::operator()() const ()
#11 0x0000000004de0b8c in std::_Function_handler<Aws::Utils::Outcome<Aws::S3::Model::UploadPartResult, Aws::S3::S3Error> (), Aws::S3::S3Client::UploadPart(Aws::S3::Model::UploadPartRequest const&) const::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#12 0x0000000004e19317 in Aws::Utils::Outcome<Aws::S3::Model::UploadPartResult, Aws::S3::S3Error> smithy::components::tracing::TracingUtils::MakeCallWithTiming<Aws::Utils::Outcome<Aws::S3::Model::UploadPartResult, Aws::S3::S3Error> >(std::function<Aws::Utils::Outcome<Aws::S3::Model::UploadPartResult, Aws::S3::S3Error> ()>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, smithy::components::tracing::Meter const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#13 0x0000000004d7cdcf in Aws::S3::S3Client::UploadPart(Aws::S3::Model::UploadPartRequest const&) const ()
#14 0x0000000004ca4aa6 in facebook::velox::filesystems::S3WriteFile::Impl::uploadPart (this=0x7fffec2f09a0, part=..., isLast=true) at /root/velox/velox/connectors/hive/storage_adapters/s3fs/S3FileSystem.cpp:380
```

Pull Request resolved: #10801
Reviewed By: amitkdutta
Differential Revision: D61671574
Pulled By: kgpai
fbshipit-source-id: 34c7b777b3fde0659ef74c4fbfd93740fdfa3f7c
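For readers unfamiliar with the checksum named in this commit message, here is a minimal bitwise sketch of the common CRC-32 variant (reflected polynomial 0xEDB88320, as used by zlib and Ethernet). It only illustrates what a CRC32 checksum computes. It is not the AWS SDK's implementation and does not show how the S3 client is configured, and the function name `crc32Checksum` is a hypothetical choice for this example.

```cpp
#include <cstdint>
#include <string>

// Computes the CRC-32 of `data` one bit at a time. Production code would
// normally use a table-driven or hardware-accelerated implementation.
std::uint32_t crc32Checksum(const std::string& data) {
  std::uint32_t crc = 0xFFFFFFFFu;
  for (unsigned char byte : data) {
    crc ^= byte;
    for (int bit = 0; bit < 8; ++bit) {
      // Shift right by one; if the bit shifted out was set, fold in the
      // reflected polynomial. The mask is all ones exactly in that case.
      const std::uint32_t mask = 0u - (crc & 1u);
      crc = (crc >> 1) ^ (0xEDB88320u & mask);
    }
  }
  return crc ^ 0xFFFFFFFFu;
}
```

The standard check value for this variant is `crc32Checksum("123456789") == 0xCBF43926`.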
Labels
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
fb-exported
Merged