Optimize array_constructor #6568

Closed

mbasmanova
Contributor

Summary:
array_constructor is very slow: #5958 (comment)

array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types:

FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }

FlatVector<T>::copy(source, rows, toSourceRow) is faster.
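
To make the contrast concrete, here is a minimal sketch of the two copy styles on a plain `std::vector<int32_t>`. It is not Velox code: `CopyRange`, `copyRanges` and `copyRows` are simplified stand-ins, and the claim that array_constructor produces many very short ranges (one per output element) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for BaseVector::CopyRange.
struct CopyRange {
  std::size_t sourceIndex;
  std::size_t targetIndex;
  std::size_t count;
};

// Range-at-a-time copy: one inner copy per range, mirroring the loop in
// the copyRanges snippet above. With many single-element ranges the
// per-range overhead dominates.
void copyRanges(
    const std::vector<int32_t>& source,
    std::vector<int32_t>& target,
    const std::vector<CopyRange>& ranges) {
  for (const auto& range : ranges) {
    assert(range.sourceIndex + range.count <= source.size());
    assert(range.targetIndex + range.count <= target.size());
    for (std::size_t i = 0; i < range.count; ++i) {
      target[range.targetIndex + i] = source[range.sourceIndex + i];
    }
  }
}

// Row-mapped copy: a single pass over the selected target rows, where
// toSourceRow[row] names the source row to read for that target row.
void copyRows(
    const std::vector<int32_t>& source,
    std::vector<int32_t>& target,
    const std::vector<std::size_t>& rows,
    const std::vector<std::size_t>& toSourceRow) {
  for (auto row : rows) {
    assert(row < target.size() && row < toSourceRow.size());
    assert(toSourceRow[row] < source.size());
    target[row] = source[toSourceRow[row]];
  }
}
```

Both functions produce the same target contents when the ranges and the row mapping describe the same assignment; the difference is only in how much bookkeeping happens per copied element.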

Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower.

The slowness is due to ArrayVector and MapVector not having an implementation of copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow into ranges. This extra processing causes a perf regression.
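
The translation step mentioned above can be pictured as follows. This is a rough sketch under stated assumptions, not the BaseVector::copy implementation: `CopyRange` and `toRanges` are hypothetical names, rows are assumed to be sorted in ascending order, and merging adjacent rows into one range is an illustrative choice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for BaseVector::CopyRange.
struct CopyRange {
  std::size_t sourceIndex;
  std::size_t targetIndex;
  std::size_t count;
};

// Converts (rows, toSourceRow) into copy ranges. 'rows' lists target rows
// in ascending order; toSourceRow[row] is the source row for that target.
// Neighbouring rows are merged into one range when both the target and the
// source positions are contiguous.
std::vector<CopyRange> toRanges(
    const std::vector<std::size_t>& rows,
    const std::vector<std::size_t>& toSourceRow) {
  std::vector<CopyRange> ranges;
  for (auto row : rows) {
    assert(row < toSourceRow.size());
    const std::size_t sourceRow = toSourceRow[row];
    if (!ranges.empty()) {
      auto& last = ranges.back();
      if (last.targetIndex + last.count == row &&
          last.sourceIndex + last.count == sourceRow) {
        ++last.count; // Extend the current range by one row.
        continue;
      }
    }
    ranges.push_back({sourceRow, row, 1});
  }
  return ranges;
}
```

The extra pass over every selected row is work that a direct copy(source, rows, toSourceRow) implementation would not need, which is consistent with the regression described for arrays and maps.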

Hence, we use copy for primitive types and structs of these and copyRanges for everything else.

Before:

array_constructor_ARRAY_nullfree##1                        16.80ms     59.53
array_constructor_ARRAY_nullfree##2                        27.02ms     37.01
array_constructor_ARRAY_nullfree##3                        38.03ms     26.30
array_constructor_ARRAY_nullfree##2_null                   52.86ms     18.92
array_constructor_ARRAY_nullfree##2_const                  54.97ms     18.19
array_constructor_ARRAY_nulls##1                           30.61ms     32.66
array_constructor_ARRAY_nulls##2                           55.01ms     18.18
array_constructor_ARRAY_nulls##3                           80.69ms     12.39
array_constructor_ARRAY_nulls##2_null                      69.10ms     14.47
array_constructor_ARRAY_nulls##2_const                    103.85ms      9.63


After:

array_constructor_ARRAY_nullfree##1                        15.25ms     65.58
array_constructor_ARRAY_nullfree##2                        25.11ms     39.82
array_constructor_ARRAY_nullfree##3                        34.59ms     28.91
array_constructor_ARRAY_nullfree##2_null                   53.61ms     18.65
array_constructor_ARRAY_nullfree##2_const                  51.48ms     19.42
array_constructor_ARRAY_nulls##1                           29.99ms     33.34
array_constructor_ARRAY_nulls##2                           55.91ms     17.89
array_constructor_ARRAY_nulls##3                           81.73ms     12.24
array_constructor_ARRAY_nulls##2_null                      66.97ms     14.93
array_constructor_ARRAY_nulls##2_const                     92.96ms     10.76


Before:

array_constructor_INTEGER_nullfree##1                      19.72ms     50.71
array_constructor_INTEGER_nullfree##2                      34.51ms     28.97
array_constructor_INTEGER_nullfree##3                      47.95ms     20.86
array_constructor_INTEGER_nullfree##2_null                 58.68ms     17.04
array_constructor_INTEGER_nullfree##2_const                45.15ms     22.15
array_constructor_INTEGER_nulls##1                         29.99ms     33.34
array_constructor_INTEGER_nulls##2                         55.32ms     18.08
array_constructor_INTEGER_nulls##3                         78.53ms     12.73
array_constructor_INTEGER_nulls##2_null                    72.24ms     13.84
array_constructor_INTEGER_nulls##2_const                   71.13ms     14.06


After:

array_constructor_INTEGER_nullfree##1                       3.39ms    294.89
array_constructor_INTEGER_nullfree##2                       7.35ms    136.10
array_constructor_INTEGER_nullfree##3                      10.78ms     92.74
array_constructor_INTEGER_nullfree##2_null                 11.29ms     88.57
array_constructor_INTEGER_nullfree##2_const                10.14ms     98.65
array_constructor_INTEGER_nulls##1                          4.49ms    222.53
array_constructor_INTEGER_nulls##2                          9.78ms    102.29
array_constructor_INTEGER_nulls##3                         14.69ms     68.08
array_constructor_INTEGER_nulls##2_null                    12.14ms     82.36
array_constructor_INTEGER_nulls##2_const                   12.27ms     81.53

Before:

array_constructor_MAP_nullfree##1                          17.34ms     57.65
array_constructor_MAP_nullfree##2                          29.84ms     33.51
array_constructor_MAP_nullfree##3                          41.51ms     24.09
array_constructor_MAP_nullfree##2_null                     56.57ms     17.68
array_constructor_MAP_nullfree##2_const                    71.68ms     13.95
array_constructor_MAP_nulls##1                             36.22ms     27.61
array_constructor_MAP_nulls##2                             68.18ms     14.67
array_constructor_MAP_nulls##3                             95.12ms     10.51
array_constructor_MAP_nulls##2_null                        86.42ms     11.57
array_constructor_MAP_nulls##2_const                      120.10ms      8.33


After:

array_constructor_MAP_nullfree##1                          17.05ms     58.66
array_constructor_MAP_nullfree##2                          28.42ms     35.18
array_constructor_MAP_nullfree##3                          36.96ms     27.06
array_constructor_MAP_nullfree##2_null                     55.64ms     17.97
array_constructor_MAP_nullfree##2_const                    67.53ms     14.81
array_constructor_MAP_nulls##1                             32.91ms     30.39
array_constructor_MAP_nulls##2                             64.50ms     15.50
array_constructor_MAP_nulls##3                             95.71ms     10.45
array_constructor_MAP_nulls##2_null                        77.22ms     12.95
array_constructor_MAP_nulls##2_const                      114.91ms      8.70

Before:

array_constructor_ROW_nullfree##1                          33.88ms     29.52
array_constructor_ROW_nullfree##2                          62.00ms     16.13
array_constructor_ROW_nullfree##3                          89.54ms     11.17
array_constructor_ROW_nullfree##2_null                     78.46ms     12.75
array_constructor_ROW_nullfree##2_const                    95.53ms     10.47
array_constructor_ROW_nulls##1                             44.11ms     22.67
array_constructor_ROW_nulls##2                            115.43ms      8.66
array_constructor_ROW_nulls##3                            173.61ms      5.76
array_constructor_ROW_nulls##2_null                       130.40ms      7.67
array_constructor_ROW_nulls##2_const                      169.97ms      5.88

After:

array_constructor_ROW_nullfree##1                           5.55ms    180.15
array_constructor_ROW_nullfree##2                          12.83ms     77.94
array_constructor_ROW_nullfree##3                          18.89ms     52.95
array_constructor_ROW_nullfree##2_null                     18.74ms     53.36
array_constructor_ROW_nullfree##2_const                    18.16ms     55.07
array_constructor_ROW_nulls##1                             11.29ms     88.61
array_constructor_ROW_nulls##2                             18.57ms     53.86
array_constructor_ROW_nulls##3                             34.20ms     29.24
array_constructor_ROW_nulls##2_null                        25.05ms     39.92
array_constructor_ROW_nulls##2_const                       25.15ms     39.77
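
For a quick read of these tables, the speedup per benchmark is the "before" time divided by the "after" time. The small helper below is only a sketch for that arithmetic; `speedup`, `BenchmarkRow` and `kNullFreeOne` are illustrative names, and the values are the `nullfree##1` rows copied from the tables above (first numeric column, in milliseconds).

```cpp
#include <cassert>

// One benchmark row: time before and after the change, in milliseconds.
struct BenchmarkRow {
  const char* name;
  double beforeMs;
  double afterMs;
};

// Speedup factor: how many times faster 'after' is compared to 'before'.
double speedup(double beforeMs, double afterMs) {
  assert(beforeMs > 0 && afterMs > 0);
  return beforeMs / afterMs;
}

// The nullfree##1 rows from the Before/After tables.
const BenchmarkRow kNullFreeOne[] = {
    {"ARRAY", 16.80, 15.25},
    {"INTEGER", 19.72, 3.39},
    {"MAP", 17.34, 17.05},
    {"ROW", 33.88, 5.55},
};
```

With these numbers the ratios come out at roughly 1.1x for ARRAY, 5.8x for INTEGER, 1.0x for MAP and 6.1x for ROW, which matches the summary: primitive types and structs get much faster, while arrays and maps stay about the same.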

Differential Revision: D49272797

@facebook-github-bot added the CLA Signed label Sep 14, 2023
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D49272797

return false;
}

if (type->isRow()) {
Contributor

@laithsakka laithsakka Sep 15, 2023

nit: early exit and comment for row logic

if (!type->isRow()) {
  return true;
}
// Handle row. If any child is a Map or an Array, we use copyRanges.
const auto& rowType = type->asRow();
for (const auto& child : rowType.children()) {
  if (shouldCopyRanges(child)) {
    return true;
  }
}
return false;
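
The same decision can be tried out on a toy type tree. The sketch below is self-contained and does not use Velox types: `Kind`, `Type`, `TypePtr` and `makeType` are hypothetical stand-ins, and the rule is the one stated in the summary (copy for primitives and rows of primitives, copyRanges for everything else).

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy type model; a stand-in for the real type hierarchy.
enum class Kind { kPrimitive, kArray, kMap, kRow };

struct Type;
using TypePtr = std::shared_ptr<const Type>;

struct Type {
  Kind kind;
  std::vector<TypePtr> children;
};

// Illustrative helper for building type trees.
TypePtr makeType(Kind kind, std::vector<TypePtr> children = {}) {
  return std::make_shared<const Type>(Type{kind, std::move(children)});
}

// Returns true if values of 'type' should be copied with copyRanges,
// false if the row-mapped copy is expected to be faster.
bool shouldCopyRanges(const TypePtr& type) {
  assert(type != nullptr);
  if (type->kind == Kind::kPrimitive) {
    return false;
  }
  if (type->kind != Kind::kRow) {
    // Arrays and maps.
    return true;
  }
  // Handle row. If any child needs copyRanges, so does the row.
  for (const auto& child : type->children) {
    if (shouldCopyRanges(child)) {
      return true;
    }
  }
  return false;
}
```

Under this rule a row of two primitives takes the fast path, while a row that contains an array anywhere in its subtree falls back to copyRanges.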

Contributor Author

@mbasmanova mbasmanova left a comment

@laithsakka Thank you for the review.

> Do we need the target rows set by the previous copy to still be valid?
> I saw this comment in BaseVector::copy:
>
>   // Check if there are rows that do not exist in 'source'. Remove these from
>   // 'ranges'.
>   // TODO Update the callers and remove this logic.
>
> Would it be expensive to reset the target rows?

I believe you are referring to BaseVector::copy allowing invalid values in toSourceRow: #6591

If that's the case, then I do not believe this code creates such rows. Am I missing something?

mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Sep 15, 2023

mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Sep 15, 2023

mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Sep 15, 2023
Summary:

array_constructor is very slow: facebookincubator#5958 (comment)

array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types:

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

FlatVector<T>::copy(source, rows, toSourceRow) is faster.

Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower.

The slowness is due to ArrayVector and MapVector not having implementation for copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes perf regression.

Hence, we use copy for primitive types and structs of these and copyRanges for everything else.

```
Before:

array_constructor_ARRAY_nullfree##1                        16.80ms     59.53
array_constructor_ARRAY_nullfree##2                        27.02ms     37.01
array_constructor_ARRAY_nullfree##3                        38.03ms     26.30
array_constructor_ARRAY_nullfree##2_null                   52.86ms     18.92
array_constructor_ARRAY_nullfree##2_const                  54.97ms     18.19
array_constructor_ARRAY_nulls##1                           30.61ms     32.66
array_constructor_ARRAY_nulls##2                           55.01ms     18.18
array_constructor_ARRAY_nulls##3                           80.69ms     12.39
array_constructor_ARRAY_nulls##2_null                      69.10ms     14.47
array_constructor_ARRAY_nulls##2_const                    103.85ms      9.63


After:

array_constructor_ARRAY_nullfree##1                        15.25ms     65.58
array_constructor_ARRAY_nullfree##2                        25.11ms     39.82
array_constructor_ARRAY_nullfree##3                        34.59ms     28.91
array_constructor_ARRAY_nullfree##2_null                   53.61ms     18.65
array_constructor_ARRAY_nullfree##2_const                  51.48ms     19.42
array_constructor_ARRAY_nulls##1                           29.99ms     33.34
array_constructor_ARRAY_nulls##2                           55.91ms     17.89
array_constructor_ARRAY_nulls##3                           81.73ms     12.24
array_constructor_ARRAY_nulls##2_null                      66.97ms     14.93
array_constructor_ARRAY_nulls##2_const                     92.96ms     10.76


Before:

array_constructor_INTEGER_nullfree##1                      19.72ms     50.71
array_constructor_INTEGER_nullfree##2                      34.51ms     28.97
array_constructor_INTEGER_nullfree##3                      47.95ms     20.86
array_constructor_INTEGER_nullfree##2_null                 58.68ms     17.04
array_constructor_INTEGER_nullfree##2_const                45.15ms     22.15
array_constructor_INTEGER_nulls##1                         29.99ms     33.34
array_constructor_INTEGER_nulls##2                         55.32ms     18.08
array_constructor_INTEGER_nulls##3                         78.53ms     12.73
array_constructor_INTEGER_nulls##2_null                    72.24ms     13.84
array_constructor_INTEGER_nulls##2_const                   71.13ms     14.06


After:

array_constructor_INTEGER_nullfree##1                       3.39ms    294.89
array_constructor_INTEGER_nullfree##2                       7.35ms    136.10
array_constructor_INTEGER_nullfree##3                      10.78ms     92.74
array_constructor_INTEGER_nullfree##2_null                 11.29ms     88.57
array_constructor_INTEGER_nullfree##2_const                10.14ms     98.65
array_constructor_INTEGER_nulls##1                          4.49ms    222.53
array_constructor_INTEGER_nulls##2                          9.78ms    102.29
array_constructor_INTEGER_nulls##3                         14.69ms     68.08
array_constructor_INTEGER_nulls##2_null                    12.14ms     82.36
array_constructor_INTEGER_nulls##2_const                   12.27ms     81.53

Before:

array_constructor_MAP_nullfree##1                          17.34ms     57.65
array_constructor_MAP_nullfree##2                          29.84ms     33.51
array_constructor_MAP_nullfree##3                          41.51ms     24.09
array_constructor_MAP_nullfree##2_null                     56.57ms     17.68
array_constructor_MAP_nullfree##2_const                    71.68ms     13.95
array_constructor_MAP_nulls##1                             36.22ms     27.61
array_constructor_MAP_nulls##2                             68.18ms     14.67
array_constructor_MAP_nulls##3                             95.12ms     10.51
array_constructor_MAP_nulls##2_null                        86.42ms     11.57
array_constructor_MAP_nulls##2_const                      120.10ms      8.33


After:

array_constructor_MAP_nullfree##1                          17.05ms     58.66
array_constructor_MAP_nullfree##2                          28.42ms     35.18
array_constructor_MAP_nullfree##3                          36.96ms     27.06
array_constructor_MAP_nullfree##2_null                     55.64ms     17.97
array_constructor_MAP_nullfree##2_const                    67.53ms     14.81
array_constructor_MAP_nulls##1                             32.91ms     30.39
array_constructor_MAP_nulls##2                             64.50ms     15.50
array_constructor_MAP_nulls##3                             95.71ms     10.45
array_constructor_MAP_nulls##2_null                        77.22ms     12.95
array_constructor_MAP_nulls##2_const                      114.91ms      8.70

Before:

array_constructor_ROW_nullfree#facebookincubator#1                          33.88ms     29.52
array_constructor_ROW_nullfree#facebookincubator#2                          62.00ms     16.13
array_constructor_ROW_nullfree#facebookincubator#3                          89.54ms     11.17
array_constructor_ROW_nullfree##2_null                     78.46ms     12.75
array_constructor_ROW_nullfree##2_const                    95.53ms     10.47
array_constructor_ROW_nulls#facebookincubator#1                             44.11ms     22.67
array_constructor_ROW_nulls#facebookincubator#2                            115.43ms      8.66
array_constructor_ROW_nulls#facebookincubator#3                            173.61ms      5.76
array_constructor_ROW_nulls##2_null                       130.40ms      7.67
array_constructor_ROW_nulls##2_const                      169.97ms      5.88

After:

array_constructor_ROW_nullfree#facebookincubator#1                           5.55ms    180.15
array_constructor_ROW_nullfree#facebookincubator#2                          12.83ms     77.94
array_constructor_ROW_nullfree#facebookincubator#3                          18.89ms     52.95
array_constructor_ROW_nullfree##2_null                     18.74ms     53.36
array_constructor_ROW_nullfree##2_const                    18.16ms     55.07
array_constructor_ROW_nulls#facebookincubator#1                             11.29ms     88.61
array_constructor_ROW_nulls#facebookincubator#2                             18.57ms     53.86
array_constructor_ROW_nulls#facebookincubator#3                             34.20ms     29.24
array_constructor_ROW_nulls##2_null                        25.05ms     39.92
array_constructor_ROW_nulls##2_const                       25.15ms     39.77
```
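
To read the tables above as speedups, one can divide each "before" time by the matching "after" time. A minimal sketch follows; the helper name `speedup` is made up for illustration and is not part of the benchmark code.

```cpp
#include <cassert>
#include <cmath>

// Ratio of the "before" time to the "after" time, both in milliseconds.
// Values above 1 mean the "after" run is faster.
double speedup(double beforeMs, double afterMs) {
  return beforeMs / afterMs;
}
```

For example, ROW_nullfree##1 goes from 33.88ms to 5.55ms, roughly 6.1x, while MAP_nulls##2_const goes from 120.10ms to 114.91ms, roughly 1.05x, which matches the expectation that maps stay on the copyRanges path.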

Differential Revision: D49272797
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D49272797

mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Sep 15, 2023

ExpressionBenchmarkBuilder benchmarkBuilder;

auto* pool = benchmarkBuilder.pool();
auto& vm = benchmarkBuilder.vectorMaker();

nit: vm (use the full word)

mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Sep 15, 2023
Reviewed By: laithsakka

Differential Revision: D49272797

@facebook-github-bot
Contributor

This pull request has been merged in 812280c.

@conbench-facebook

Conbench analyzed the 1 benchmark run on commit 812280ca.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6568

array_constructor is very slow: facebookincubator#5958 (comment)

array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types:

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

FlatVector<T>::copy(source, rows, toSourceRow) is faster.
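
One way to picture the difference is a toy model of the two copy styles over plain integer buffers. Everything below is a simplified stand-in written for illustration: `CopyRange`, `copyByRanges`, `copyByRows` and the two `interleave*` helpers are hypothetical and are not the Velox implementations, and the assumption that `toSourceRow` is indexed by target row is part of the sketch. It shows that interleaving arguments into an elements buffer produces one single-element range per value on the range path, whereas the row-mapped path handles each argument in one pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Toy range descriptor; not the Velox definition.
struct CopyRange {
  int32_t sourceIndex;
  int32_t targetIndex;
  int32_t count;
};

// Range-based copy: one bulk copy call per range.
void copyByRanges(
    const std::vector<int32_t>& source,
    const std::vector<CopyRange>& ranges,
    std::vector<int32_t>& target) {
  for (const auto& range : ranges) {
    std::copy_n(
        source.begin() + range.sourceIndex,
        range.count,
        target.begin() + range.targetIndex);
  }
}

// Row-mapped copy: toSourceRow[targetRow] names the source row to read.
void copyByRows(
    const std::vector<int32_t>& source,
    const std::vector<int32_t>& targetRows,
    const std::vector<int32_t>& toSourceRow,
    std::vector<int32_t>& target) {
  for (int32_t row : targetRows) {
    target[row] = source[toSourceRow[row]];
  }
}

// Interleaves equally sized arguments using single-element ranges.
std::vector<int32_t> interleaveWithRanges(
    const std::vector<std::vector<int32_t>>& args) {
  const auto numArgs = static_cast<int32_t>(args.size());
  const auto numRows =
      args.empty() ? 0 : static_cast<int32_t>(args[0].size());
  std::vector<int32_t> elements(numArgs * numRows);
  for (int32_t arg = 0; arg < numArgs; ++arg) {
    std::vector<CopyRange> ranges;
    for (int32_t row = 0; row < numRows; ++row) {
      ranges.push_back({row, row * numArgs + arg, 1});
    }
    copyByRanges(args[arg], ranges, elements);
  }
  return elements;
}

// Interleaves equally sized arguments using a target-to-source row mapping.
std::vector<int32_t> interleaveWithRows(
    const std::vector<std::vector<int32_t>>& args) {
  const auto numArgs = static_cast<int32_t>(args.size());
  const auto numRows =
      args.empty() ? 0 : static_cast<int32_t>(args[0].size());
  std::vector<int32_t> elements(numArgs * numRows);
  std::vector<int32_t> toSourceRow(numArgs * numRows);
  for (int32_t arg = 0; arg < numArgs; ++arg) {
    std::vector<int32_t> targetRows;
    for (int32_t row = 0; row < numRows; ++row) {
      const int32_t targetRow = row * numArgs + arg;
      targetRows.push_back(targetRow);
      toSourceRow[targetRow] = row;
    }
    copyByRows(args[arg], targetRows, toSourceRow, elements);
  }
  return elements;
}
```

Both helpers produce the same elements buffer; they differ only in how much per-value bookkeeping the copy incurs.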

Switching from copyRanges to copy speeds up array_constructor significantly for primitive types and structs. However, the same change makes arrays and maps slower.

The slowdown arises because ArrayVector and MapVector do not implement copy(source, rows, toSourceRow). They fall back to BaseVector::copy, which translates rows + toSourceRow into ranges, and this extra processing causes a performance regression.

Hence, we use copy for primitive types and structs of primitive types, and copyRanges for everything else.
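
The selection rule can be sketched as a small decision function. This is an illustration only: `Kind`, `TypeInfo`, `CopyStrategy` and `chooseStrategy` are hypothetical names rather than Velox classes, and treating only structs whose direct fields are all primitive as eligible is an assumption, since the summary does not say how nested structs are handled.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical type model for illustration; these are not Velox classes.
enum class Kind { kPrimitive, kRow, kArray, kMap };

struct TypeInfo {
  Kind kind;
  std::vector<TypeInfo> children; // Field types when kind == kRow.
};

enum class CopyStrategy { kCopyRows, kCopyRanges };

// Picks the row-mapped copy for primitives and for structs whose direct
// fields are all primitive (an assumption); everything else uses ranges.
CopyStrategy chooseStrategy(const TypeInfo& elementType) {
  if (elementType.kind == Kind::kPrimitive) {
    return CopyStrategy::kCopyRows;
  }
  if (elementType.kind == Kind::kRow) {
    const bool allPrimitive = std::all_of(
        elementType.children.begin(),
        elementType.children.end(),
        [](const TypeInfo& child) { return child.kind == Kind::kPrimitive; });
    if (allPrimitive) {
      return CopyStrategy::kCopyRows;
    }
  }
  return CopyStrategy::kCopyRanges;
}
```

With this model, an integer or a struct of two integers selects the row-mapped copy, while an array, a map, or a struct containing an array selects copyRanges.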

```
Before:

array_constructor_ARRAY_nullfree##1                        16.80ms     59.53
array_constructor_ARRAY_nullfree##2                        27.02ms     37.01
array_constructor_ARRAY_nullfree##3                        38.03ms     26.30
array_constructor_ARRAY_nullfree##2_null                   52.86ms     18.92
array_constructor_ARRAY_nullfree##2_const                  54.97ms     18.19
array_constructor_ARRAY_nulls##1                           30.61ms     32.66
array_constructor_ARRAY_nulls##2                           55.01ms     18.18
array_constructor_ARRAY_nulls##3                           80.69ms     12.39
array_constructor_ARRAY_nulls##2_null                      69.10ms     14.47
array_constructor_ARRAY_nulls##2_const                    103.85ms      9.63

After:

array_constructor_ARRAY_nullfree##1                        15.25ms     65.58
array_constructor_ARRAY_nullfree##2                        25.11ms     39.82
array_constructor_ARRAY_nullfree##3                        34.59ms     28.91
array_constructor_ARRAY_nullfree##2_null                   53.61ms     18.65
array_constructor_ARRAY_nullfree##2_const                  51.48ms     19.42
array_constructor_ARRAY_nulls##1                           29.99ms     33.34
array_constructor_ARRAY_nulls##2                           55.91ms     17.89
array_constructor_ARRAY_nulls##3                           81.73ms     12.24
array_constructor_ARRAY_nulls##2_null                      66.97ms     14.93
array_constructor_ARRAY_nulls##2_const                     92.96ms     10.76

Before:

array_constructor_INTEGER_nullfree##1                      19.72ms     50.71
array_constructor_INTEGER_nullfree##2                      34.51ms     28.97
array_constructor_INTEGER_nullfree##3                      47.95ms     20.86
array_constructor_INTEGER_nullfree##2_null                 58.68ms     17.04
array_constructor_INTEGER_nullfree##2_const                45.15ms     22.15
array_constructor_INTEGER_nulls##1                         29.99ms     33.34
array_constructor_INTEGER_nulls##2                         55.32ms     18.08
array_constructor_INTEGER_nulls##3                         78.53ms     12.73
array_constructor_INTEGER_nulls##2_null                    72.24ms     13.84
array_constructor_INTEGER_nulls##2_const                   71.13ms     14.06

After:

array_constructor_INTEGER_nullfree##1                       3.39ms    294.89
array_constructor_INTEGER_nullfree##2                       7.35ms    136.10
array_constructor_INTEGER_nullfree##3                      10.78ms     92.74
array_constructor_INTEGER_nullfree##2_null                 11.29ms     88.57
array_constructor_INTEGER_nullfree##2_const                10.14ms     98.65
array_constructor_INTEGER_nulls##1                          4.49ms    222.53
array_constructor_INTEGER_nulls##2                          9.78ms    102.29
array_constructor_INTEGER_nulls##3                         14.69ms     68.08
array_constructor_INTEGER_nulls##2_null                    12.14ms     82.36
array_constructor_INTEGER_nulls##2_const                   12.27ms     81.53

Before:

array_constructor_MAP_nullfree##1                          17.34ms     57.65
array_constructor_MAP_nullfree##2                          29.84ms     33.51
array_constructor_MAP_nullfree##3                          41.51ms     24.09
array_constructor_MAP_nullfree##2_null                     56.57ms     17.68
array_constructor_MAP_nullfree##2_const                    71.68ms     13.95
array_constructor_MAP_nulls##1                             36.22ms     27.61
array_constructor_MAP_nulls##2                             68.18ms     14.67
array_constructor_MAP_nulls##3                             95.12ms     10.51
array_constructor_MAP_nulls##2_null                        86.42ms     11.57
array_constructor_MAP_nulls##2_const                      120.10ms      8.33

After:

array_constructor_MAP_nullfree##1                          17.05ms     58.66
array_constructor_MAP_nullfree##2                          28.42ms     35.18
array_constructor_MAP_nullfree##3                          36.96ms     27.06
array_constructor_MAP_nullfree##2_null                     55.64ms     17.97
array_constructor_MAP_nullfree##2_const                    67.53ms     14.81
array_constructor_MAP_nulls##1                             32.91ms     30.39
array_constructor_MAP_nulls#facebookincubator#2                             64.50ms     15.50
array_constructor_MAP_nulls#facebookincubator#3                             95.71ms     10.45
array_constructor_MAP_nulls##2_null                        77.22ms     12.95
array_constructor_MAP_nulls##2_const                      114.91ms      8.70

Before:

array_constructor_ROW_nullfree#facebookincubator#1                          33.88ms     29.52
array_constructor_ROW_nullfree#facebookincubator#2                          62.00ms     16.13
array_constructor_ROW_nullfree#facebookincubator#3                          89.54ms     11.17
array_constructor_ROW_nullfree##2_null                     78.46ms     12.75
array_constructor_ROW_nullfree##2_const                    95.53ms     10.47
array_constructor_ROW_nulls#facebookincubator#1                             44.11ms     22.67
array_constructor_ROW_nulls#facebookincubator#2                            115.43ms      8.66
array_constructor_ROW_nulls#facebookincubator#3                            173.61ms      5.76
array_constructor_ROW_nulls##2_null                       130.40ms      7.67
array_constructor_ROW_nulls##2_const                      169.97ms      5.88

After:

array_constructor_ROW_nullfree#facebookincubator#1                           5.55ms    180.15
array_constructor_ROW_nullfree#facebookincubator#2                          12.83ms     77.94
array_constructor_ROW_nullfree#facebookincubator#3                          18.89ms     52.95
array_constructor_ROW_nullfree##2_null                     18.74ms     53.36
array_constructor_ROW_nullfree##2_const                    18.16ms     55.07
array_constructor_ROW_nulls#facebookincubator#1                             11.29ms     88.61
array_constructor_ROW_nulls#facebookincubator#2                             18.57ms     53.86
array_constructor_ROW_nulls#facebookincubator#3                             34.20ms     29.24
array_constructor_ROW_nulls##2_null                        25.05ms     39.92
array_constructor_ROW_nulls##2_const                       25.15ms     39.77
```

Reviewed By: laithsakka

Differential Revision: D49272797

fbshipit-source-id: 55d83de7b69c7ae4b72b5a5ae62a7868f36b0e19
codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6568

array_constructor is very slow: facebookincubator#5958 (comment)

array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types:

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

FlatVector<T>::copy(source, rows, toSourceRow) is faster.

Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower.

The slowdown occurs because ArrayVector and MapVector do not implement copy(source, rows, toSourceRow). They fall back to BaseVector::copy, which translates rows + toSourceRow into ranges. This extra processing causes a performance regression.
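To make the "rows + toSourceRow to ranges" translation concrete, here is a minimal standalone sketch. It is not the Velox implementation: `CopyRange` below is a hypothetical stand-in whose field names mirror the ones used in the snippet above, `toRanges` is an invented helper, and coalescing adjacent rows is an assumption made for illustration. It also assumes `rows` is sorted in ascending order.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a copy range (not the Velox type).
struct CopyRange {
  int32_t sourceIndex;
  int32_t targetIndex;
  int32_t count;
};

// Converts selected target rows plus a target-to-source row mapping into
// ranges, merging rows that are contiguous in both target and source.
// Assumes `rows` is sorted ascending and every row indexes into toSourceRow.
std::vector<CopyRange> toRanges(
    const std::vector<int32_t>& rows,
    const std::vector<int32_t>& toSourceRow) {
  std::vector<CopyRange> ranges;
  for (int32_t row : rows) {
    const int32_t sourceRow = toSourceRow[row];
    if (!ranges.empty()) {
      CopyRange& last = ranges.back();
      if (last.targetIndex + last.count == row &&
          last.sourceIndex + last.count == sourceRow) {
        ++last.count; // Extend the current range.
        continue;
      }
    }
    ranges.push_back({sourceRow, row, 1}); // Start a new range.
  }
  return ranges;
}
```

Even in this simplified form, the translation walks every selected row and builds a temporary vector before any data is copied, which is the kind of extra work the paragraph above refers to.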

Hence, we use copy for primitive types and structs of primitive types, and copyRanges for everything else.
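The selection rule can be sketched as a small recursive check. This is an illustration only: `Kind`, `Type`, and `usesRowWiseCopy` are hypothetical names invented for this sketch, not Velox APIs, and treating nested structs recursively is an assumption about what "structs of these" means. The sketch relies on C++17, which allows `std::vector` of an incomplete type.

```cpp
#include <vector>

// Hypothetical, simplified type model (not the Velox type system).
enum class Kind { kInteger, kVarchar, kRow, kArray, kMap };

struct Type {
  Kind kind;
  std::vector<Type> children; // Field types for kRow; unused for primitives.
};

// Returns true when the row-wise copy(source, rows, toSourceRow) path would
// be chosen: primitive types, and structs whose fields all qualify.
// Returns false for arrays, maps, and structs containing them, which would
// stay on the copyRanges path.
bool usesRowWiseCopy(const Type& type) {
  switch (type.kind) {
    case Kind::kArray:
    case Kind::kMap:
      return false;
    case Kind::kRow:
      for (const Type& child : type.children) {
        if (!usesRowWiseCopy(child)) {
          return false;
        }
      }
      return true;
    default:
      return true; // Primitive types.
  }
}
```

Under this sketch a struct of two integers takes the copy path, while a struct that contains a map falls back to copyRanges.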

```
Before:

array_constructor_ARRAY_nullfree##1                        16.80ms     59.53
array_constructor_ARRAY_nullfree##2                        27.02ms     37.01
array_constructor_ARRAY_nullfree##3                        38.03ms     26.30
array_constructor_ARRAY_nullfree##2_null                   52.86ms     18.92
array_constructor_ARRAY_nullfree##2_const                  54.97ms     18.19
array_constructor_ARRAY_nulls##1                           30.61ms     32.66
array_constructor_ARRAY_nulls##2                           55.01ms     18.18
array_constructor_ARRAY_nulls##3                           80.69ms     12.39
array_constructor_ARRAY_nulls##2_null                      69.10ms     14.47
array_constructor_ARRAY_nulls##2_const                    103.85ms      9.63

After:

array_constructor_ARRAY_nullfree##1                        15.25ms     65.58
array_constructor_ARRAY_nullfree##2                        25.11ms     39.82
array_constructor_ARRAY_nullfree##3                        34.59ms     28.91
array_constructor_ARRAY_nullfree##2_null                   53.61ms     18.65
array_constructor_ARRAY_nullfree##2_const                  51.48ms     19.42
array_constructor_ARRAY_nulls##1                           29.99ms     33.34
array_constructor_ARRAY_nulls##2                           55.91ms     17.89
array_constructor_ARRAY_nulls##3                           81.73ms     12.24
array_constructor_ARRAY_nulls##2_null                      66.97ms     14.93
array_constructor_ARRAY_nulls##2_const                     92.96ms     10.76

Before:

array_constructor_INTEGER_nullfree##1                      19.72ms     50.71
array_constructor_INTEGER_nullfree##2                      34.51ms     28.97
array_constructor_INTEGER_nullfree##3                      47.95ms     20.86
array_constructor_INTEGER_nullfree##2_null                 58.68ms     17.04
array_constructor_INTEGER_nullfree##2_const                45.15ms     22.15
array_constructor_INTEGER_nulls##1                         29.99ms     33.34
array_constructor_INTEGER_nulls##2                         55.32ms     18.08
array_constructor_INTEGER_nulls##3                         78.53ms     12.73
array_constructor_INTEGER_nulls##2_null                    72.24ms     13.84
array_constructor_INTEGER_nulls##2_const                   71.13ms     14.06

After:

array_constructor_INTEGER_nullfree##1                       3.39ms    294.89
array_constructor_INTEGER_nullfree##2                       7.35ms    136.10
array_constructor_INTEGER_nullfree##3                      10.78ms     92.74
array_constructor_INTEGER_nullfree##2_null                 11.29ms     88.57
array_constructor_INTEGER_nullfree##2_const                10.14ms     98.65
array_constructor_INTEGER_nulls##1                          4.49ms    222.53
array_constructor_INTEGER_nulls##2                          9.78ms    102.29
array_constructor_INTEGER_nulls##3                         14.69ms     68.08
array_constructor_INTEGER_nulls##2_null                    12.14ms     82.36
array_constructor_INTEGER_nulls##2_const                   12.27ms     81.53

Before:

array_constructor_MAP_nullfree##1                          17.34ms     57.65
array_constructor_MAP_nullfree##2                          29.84ms     33.51
array_constructor_MAP_nullfree##3                          41.51ms     24.09
array_constructor_MAP_nullfree##2_null                     56.57ms     17.68
array_constructor_MAP_nullfree##2_const                    71.68ms     13.95
array_constructor_MAP_nulls##1                             36.22ms     27.61
array_constructor_MAP_nulls##2                             68.18ms     14.67
array_constructor_MAP_nulls##3                             95.12ms     10.51
array_constructor_MAP_nulls##2_null                        86.42ms     11.57
array_constructor_MAP_nulls##2_const                      120.10ms      8.33

After:

array_constructor_MAP_nullfree##1                          17.05ms     58.66
array_constructor_MAP_nullfree##2                          28.42ms     35.18
array_constructor_MAP_nullfree##3                          36.96ms     27.06
array_constructor_MAP_nullfree##2_null                     55.64ms     17.97
array_constructor_MAP_nullfree##2_const                    67.53ms     14.81
array_constructor_MAP_nulls##1                             32.91ms     30.39
array_constructor_MAP_nulls##2                             64.50ms     15.50
array_constructor_MAP_nulls##3                             95.71ms     10.45
array_constructor_MAP_nulls##2_null                        77.22ms     12.95
array_constructor_MAP_nulls##2_const                      114.91ms      8.70

Before:

array_constructor_ROW_nullfree##1                          33.88ms     29.52
array_constructor_ROW_nullfree##2                          62.00ms     16.13
array_constructor_ROW_nullfree##3                          89.54ms     11.17
array_constructor_ROW_nullfree##2_null                     78.46ms     12.75
array_constructor_ROW_nullfree##2_const                    95.53ms     10.47
array_constructor_ROW_nulls##1                             44.11ms     22.67
array_constructor_ROW_nulls##2                            115.43ms      8.66
array_constructor_ROW_nulls##3                            173.61ms      5.76
array_constructor_ROW_nulls##2_null                       130.40ms      7.67
array_constructor_ROW_nulls##2_const                      169.97ms      5.88

After:

array_constructor_ROW_nullfree##1                           5.55ms    180.15
array_constructor_ROW_nullfree##2                          12.83ms     77.94
array_constructor_ROW_nullfree##3                          18.89ms     52.95
array_constructor_ROW_nullfree##2_null                     18.74ms     53.36
array_constructor_ROW_nullfree##2_const                    18.16ms     55.07
array_constructor_ROW_nulls##1                             11.29ms     88.61
array_constructor_ROW_nulls##2                             18.57ms     53.86
array_constructor_ROW_nulls##3                             34.20ms     29.24
array_constructor_ROW_nulls##2_null                        25.05ms     39.92
array_constructor_ROW_nulls##2_const                       25.15ms     39.77
```
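As a rough reading aid for the timings above, the speedup per benchmark is the "before" time divided by the "after" time. The helper below is a hypothetical illustration, not part of the change; the sample values in the comments are taken from the single-argument null-free rows listed above.

```cpp
// Ratio of elapsed time before the change to elapsed time after it.
// Values above 1 mean the benchmark got faster.
double speedup(double beforeMs, double afterMs) {
  return beforeMs / afterMs;
}

// Sample inputs from the tables above (single-argument, null-free rows):
//   INTEGER: speedup(19.72, 3.39)
//   ROW:     speedup(33.88, 5.55)
//   MAP:     speedup(17.34, 17.05)
//   ARRAY:   speedup(16.80, 15.25)
```

On these rows the INTEGER and ROW cases come out at roughly a sixfold improvement, while MAP and ARRAY stay close to 1, which matches the intent of keeping copyRanges for those types.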

Reviewed By: laithsakka

Differential Revision: D49272797

fbshipit-source-id: 55d83de7b69c7ae4b72b5a5ae62a7868f36b0e19
ericyuliu pushed a commit to ericyuliu/velox that referenced this pull request Oct 12, 2023
Labels: CLA Signed, fb-exported, Merged