refactor: convert `like_dyn`, `like_utf8_scalar_dyn` and `*like_dict` functions to macros #3411

askoa · 2022-12-29T22:35:36Z

Which issue does this PR close?

Part of #3296

What changes are included in this PR?

Three declarative macro added to generate 12 functions below.

*like_dyn functions: like_dyn, nlike_dyn, ilike_dyn and nilike_dyn.
*like_utf8_scalar_dyn functions: like_utf8_scalar_dyn, nlike_utf8_scalar_dyn, ilike_utf8_scalar_dyn and nilike_utf8_scalar_dyn.
*like_dict functions: like_dict, nlike_dict, ilike_dict and nilike_dict.

Are there any user-facing changes?

No.

arrow-string/src/like.rs

askoa · 2022-12-29T22:46:42Z

Edit: The benchmarks below are for trait based commit. They do not apply to the current macro based commit.

No significant changes in benchmarks

ilike_utf8 scalar equals
                        time:   [243.40 µs 244.81 µs 246.26 µs]
                        change: [-3.1756% -2.1104% -1.0141%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

Benchmarking ilike_utf8 scalar contains: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.5s, enable flat sampling, or reduce sample count to 50.
ilike_utf8 scalar contains
                        time:   [1.8345 ms 1.8541 ms 1.8767 ms]
                        change: [-2.4958% -0.9651% +0.5636%] (p = 0.22 > 0.05)
                        No change in performance detected.

ilike_utf8 scalar ends with
                        time:   [245.18 µs 247.24 µs 249.71 µs]
                        change: [+1.6417% +3.0126% +4.4299%] (p = 0.00 < 0.05)
                        Performance has regressed.

ilike_utf8 scalar starts with
                        time:   [270.07 µs 272.10 µs 274.49 µs]
                        change: [-0.7759% +0.0457% +0.8078%] (p = 0.91 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) high mild
  8 (8.00%) high severe

Benchmarking ilike_utf8 scalar complex: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 10.0s, enable flat sampling, or reduce sample count to 40.
ilike_utf8 scalar complex
                        time:   [1.8412 ms 1.8614 ms 1.8841 ms]
                        change: [-3.4178% -2.0494% -0.6609%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 17 outliers among 100 measurements (17.00%)
  3 (3.00%) high mild
  14 (14.00%) high severe

nilike_utf8 scalar equals
                        time:   [248.11 µs 249.37 µs 250.82 µs]
                        change: [-3.1327% -2.3248% -1.5422%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

Benchmarking nilike_utf8 scalar contains: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.6s, enable flat sampling, or reduce sample count to 50.
nilike_utf8 scalar contains
                        time:   [1.8652 ms 1.8865 ms 1.9081 ms]
                        change: [-1.0305% +0.3844% +1.8825%] (p = 0.61 > 0.05)
                        No change in performance detected.

nilike_utf8 scalar ends with
                        time:   [293.38 µs 297.60 µs 302.06 µs]
                        change: [+1.2162% +2.5529% +3.8868%] (p = 0.00 < 0.05)
                        Performance has regressed.

nilike_utf8 scalar starts with
                        time:   [277.38 µs 280.41 µs 283.75 µs]
                        change: [-2.1562% -1.4000% -0.5886%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

Benchmarking nilike_utf8 scalar complex: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.6s, enable flat sampling, or reduce sample count to 50.
nilike_utf8 scalar complex
                        time:   [1.8655 ms 1.8859 ms 1.9085 ms]
                        change: [-1.0484% +0.1324% +1.3700%] (p = 0.83 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

ilike_utf8_scalar_dyn dictionary[10] string[4])
                        time:   [83.185 µs 83.797 µs 84.536 µs]
                        change: [-1.7924% -1.1183% -0.4851%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

tustvold

I wonder if this would be simpler as a declarative macro? Whilst I'm generally not a massive fan of them, I think they make sense in this case?

askoa · 2022-12-30T15:17:48Z

@tustvold not written a declarative macro so far. Let me look it up.

Edit: I cannot instinctively say if declarative macro is better than current approach. I can update this PR with declarative macro and see which one is better.

tustvold

Just a minor documentation nit, looks good thank you 😄

tustvold · 2022-12-31T16:41:11Z

arrow-string/src/like.rs

+/// [`LargeStringArray`], or [`DictionaryArray`] with values
+/// [`StringArray`]/[`LargeStringArray`].
+///
+/// See the documentation on [`like_utf8`] for more details.


I think this should be ilike_utf8 for the ilike kernels. How about documenting the functions outside the macros?

e.g.

/// See the documentation on [`like_utf8`] for more details. dyn_function!(like_dyn, like_utf8, like_dict);

The *like_utf8 name is already captured in the macro. So I just added a document line. I just don't see the benefits of comment outside the macro vs inside it. So I left the comments inside the macro.

I changed the documentation back to like_utf8 as I realized it is on purpose. The current code points to like_utf8 in all functions as its the only function with an example.

ursabot · 2023-01-02T10:32:34Z

Benchmark runs are scheduled for baseline = 2408bb2 and contender = b371f41. b371f41 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

github-actions bot added the arrow Changes to the arrow crate label Dec 29, 2022

askoa commented Dec 29, 2022

View reviewed changes

arrow-string/src/like.rs Outdated Show resolved Hide resolved

tustvold reviewed Dec 30, 2022

View reviewed changes

askoa marked this pull request as draft December 30, 2022 15:27

askoa changed the title ~~refactor: *like_dyn refactor~~ refactor: *like_dyn refactor Dec 30, 2022

askoa marked this pull request as ready for review December 30, 2022 21:11

tustvold approved these changes Dec 31, 2022

View reviewed changes

viirya approved these changes Dec 31, 2022

View reviewed changes

askoa added 5 commits December 31, 2022 19:27

like_dyn refactor

9438c87

name changes to better match existing names

c212a33

Use declarative macro for *like_dyn functions

96624e1

fix PR comment

1e22f15

empty commit

f8af9ff

askoa marked this pull request as draft January 1, 2023 17:47

askoa changed the title ~~refactor: *like_dyn refactor~~ refactor: convert *like_dyn, *like_utf8_scalar_dyn and *like_dict functions to macros Jan 1, 2023

macros for *like_utf8_scalar_dyn and *like_dict functions

ba32d14

askoa marked this pull request as ready for review January 1, 2023 18:13

askoa requested review from tustvold and viirya and removed request for tustvold January 1, 2023 18:27

tustvold merged commit b371f41 into apache:master Jan 2, 2023

askoa deleted the like-refactor branch January 12, 2023 23:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: convert `like_dyn`, `like_utf8_scalar_dyn` and `*like_dict` functions to macros #3411

refactor: convert `like_dyn`, `like_utf8_scalar_dyn` and `*like_dict` functions to macros #3411

askoa commented Dec 29, 2022 •

edited

Loading

askoa commented Dec 29, 2022 •

edited

Loading

tustvold left a comment •

edited

Loading

askoa commented Dec 30, 2022 •

edited

Loading

tustvold left a comment

tustvold Dec 31, 2022

askoa Jan 1, 2023

askoa Jan 1, 2023

ursabot commented Jan 2, 2023

refactor: convert *like_dyn, *like_utf8_scalar_dyn and *like_dict functions to macros #3411

refactor: convert *like_dyn, *like_utf8_scalar_dyn and *like_dict functions to macros #3411

Conversation

askoa commented Dec 29, 2022 • edited Loading

Which issue does this PR close?

What changes are included in this PR?

Are there any user-facing changes?

askoa commented Dec 29, 2022 • edited Loading

tustvold left a comment • edited Loading

Choose a reason for hiding this comment

askoa commented Dec 30, 2022 • edited Loading

tustvold left a comment

Choose a reason for hiding this comment

tustvold Dec 31, 2022

Choose a reason for hiding this comment

askoa Jan 1, 2023

Choose a reason for hiding this comment

askoa Jan 1, 2023

Choose a reason for hiding this comment

ursabot commented Jan 2, 2023

refactor: convert `like_dyn`, `like_utf8_scalar_dyn` and `*like_dict` functions to macros #3411

refactor: convert `like_dyn`, `like_utf8_scalar_dyn` and `*like_dict` functions to macros #3411

askoa commented Dec 29, 2022 •

edited

Loading

askoa commented Dec 29, 2022 •

edited

Loading

tustvold left a comment •

edited

Loading

askoa commented Dec 30, 2022 •

edited

Loading