Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: convert *like_dyn, *like_utf8_scalar_dyn and *like_dict functions to macros #3411

Merged
merged 6 commits into from
Jan 2, 2023
Merged

Conversation

askoa
Copy link
Contributor

@askoa askoa commented Dec 29, 2022

Which issue does this PR close?

Part of #3296

What changes are included in this PR?

Three declarative macro added to generate 12 functions below.

  • *like_dyn functions: like_dyn, nlike_dyn, ilike_dyn and nilike_dyn.
  • *like_utf8_scalar_dyn functions: like_utf8_scalar_dyn, nlike_utf8_scalar_dyn, ilike_utf8_scalar_dyn and nilike_utf8_scalar_dyn.
  • *like_dict functions: like_dict, nlike_dict, ilike_dict and nilike_dict.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Dec 29, 2022
@askoa
Copy link
Contributor Author

askoa commented Dec 29, 2022

Edit: The benchmarks below are for trait based commit. They do not apply to the current macro based commit.

No significant changes in benchmarks

ilike_utf8 scalar equals
                        time:   [243.40 µs 244.81 µs 246.26 µs]
                        change: [-3.1756% -2.1104% -1.0141%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

Benchmarking ilike_utf8 scalar contains: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.5s, enable flat sampling, or reduce sample count to 50.
ilike_utf8 scalar contains
                        time:   [1.8345 ms 1.8541 ms 1.8767 ms]
                        change: [-2.4958% -0.9651% +0.5636%] (p = 0.22 > 0.05)
                        No change in performance detected.

ilike_utf8 scalar ends with
                        time:   [245.18 µs 247.24 µs 249.71 µs]
                        change: [+1.6417% +3.0126% +4.4299%] (p = 0.00 < 0.05)
                        Performance has regressed.

ilike_utf8 scalar starts with
                        time:   [270.07 µs 272.10 µs 274.49 µs]
                        change: [-0.7759% +0.0457% +0.8078%] (p = 0.91 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) high mild
  8 (8.00%) high severe

Benchmarking ilike_utf8 scalar complex: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 10.0s, enable flat sampling, or reduce sample count to 40.
ilike_utf8 scalar complex
                        time:   [1.8412 ms 1.8614 ms 1.8841 ms]
                        change: [-3.4178% -2.0494% -0.6609%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 17 outliers among 100 measurements (17.00%)
  3 (3.00%) high mild
  14 (14.00%) high severe

nilike_utf8 scalar equals
                        time:   [248.11 µs 249.37 µs 250.82 µs]
                        change: [-3.1327% -2.3248% -1.5422%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

Benchmarking nilike_utf8 scalar contains: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.6s, enable flat sampling, or reduce sample count to 50.
nilike_utf8 scalar contains
                        time:   [1.8652 ms 1.8865 ms 1.9081 ms]
                        change: [-1.0305% +0.3844% +1.8825%] (p = 0.61 > 0.05)
                        No change in performance detected.

nilike_utf8 scalar ends with
                        time:   [293.38 µs 297.60 µs 302.06 µs]
                        change: [+1.2162% +2.5529% +3.8868%] (p = 0.00 < 0.05)
                        Performance has regressed.

nilike_utf8 scalar starts with
                        time:   [277.38 µs 280.41 µs 283.75 µs]
                        change: [-2.1562% -1.4000% -0.5886%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

Benchmarking nilike_utf8 scalar complex: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.6s, enable flat sampling, or reduce sample count to 50.
nilike_utf8 scalar complex
                        time:   [1.8655 ms 1.8859 ms 1.9085 ms]
                        change: [-1.0484% +0.1324% +1.3700%] (p = 0.83 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

ilike_utf8_scalar_dyn dictionary[10] string[4])
                        time:   [83.185 µs 83.797 µs 84.536 µs]
                        change: [-1.7924% -1.1183% -0.4851%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe


Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this would be simpler as a declarative macro? Whilst I'm generally not a massive fan of them, I think they make sense in this case?

@askoa
Copy link
Contributor Author

askoa commented Dec 30, 2022

@tustvold not written a declarative macro so far. Let me look it up.

Edit: I cannot instinctively say if declarative macro is better than current approach. I can update this PR with declarative macro and see which one is better.

@askoa askoa marked this pull request as draft December 30, 2022 15:27
@askoa askoa changed the title refactor: *like_dyn refactor refactor: *like_dyn refactor Dec 30, 2022
@askoa askoa marked this pull request as ready for review December 30, 2022 21:11
Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor documentation nit, looks good thank you 😄

/// [`LargeStringArray`], or [`DictionaryArray`] with values
/// [`StringArray`]/[`LargeStringArray`].
///
/// See the documentation on [`like_utf8`] for more details.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be ilike_utf8 for the ilike kernels. How about documenting the functions outside the macros?

e.g.

/// See the documentation on [`like_utf8`] for more details.
dyn_function!(like_dyn, like_utf8, like_dict);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The *like_utf8 name is already captured in the macro. So I just added a document line. I just don't see the benefits of comment outside the macro vs inside it. So I left the comments inside the macro.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the documentation back to like_utf8 as I realized it is on purpose. The current code points to like_utf8 in all functions as its the only function with an example.

@askoa askoa marked this pull request as draft January 1, 2023 17:47
@askoa askoa changed the title refactor: *like_dyn refactor refactor: convert *like_dyn, *like_utf8_scalar_dyn and *like_dict functions to macros Jan 1, 2023
@askoa askoa marked this pull request as ready for review January 1, 2023 18:13
@askoa askoa requested review from tustvold and viirya and removed request for tustvold January 1, 2023 18:27
@tustvold tustvold merged commit b371f41 into apache:master Jan 2, 2023
@ursabot
Copy link

ursabot commented Jan 2, 2023

Benchmark runs are scheduled for baseline = 2408bb2 and contender = b371f41. b371f41 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@askoa askoa deleted the like-refactor branch January 12, 2023 23:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants