perf: Optimize array_has() for scalar needle by neilconway · Pull Request #20374 · apache/datafusion

neilconway · 2026-02-15T21:12:36Z

Which issue does this PR close?

Closes #20377 (Optimize array_has() for scalar needle).

Rationale for this change

compare_with_eq() checks for matching array elements via a single pass across the entire flat values buffer, which is reasonably fast. The previous implementation then determined per-row results by creating a BooleanArray slice for each row and calling true_count() to check for any matches. That turns out to be quite a lot of per-row work.

Instead, we use BooleanBuffer::set_indices() to iterate over the set bits in the comparison result in a single forward pass. We walk this iterator in lockstep with the row offsets to determine whether each row contains a match, which does much less work per row.
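To illustrate the lockstep walk, here is a minimal standalone sketch. It is not the code from this PR: it uses only the standard library, stands in for BooleanBuffer::set_indices() with any iterator of strictly increasing positions, ignores null handling, and the name `rows_with_match` is hypothetical.

```rust
/// Returns one bool per row: true if any set-bit position falls inside that
/// row's [offsets[i], offsets[i + 1]) range of the flat values buffer.
///
/// Assumptions (hypothetical helper, not the DataFusion implementation):
/// `set_indices` yields strictly increasing positions, `offsets` is
/// non-decreasing, and nulls are ignored.
fn rows_with_match<I>(set_indices: I, offsets: &[usize]) -> Vec<bool>
where
    I: IntoIterator<Item = usize>,
{
    let mut matches = set_indices.into_iter().peekable();
    let mut out = Vec::with_capacity(offsets.len().saturating_sub(1));

    for window in offsets.windows(2) {
        let (start, end) = (window[0], window[1]);

        // Skip matches located before this row. This only happens when the
        // offsets do not start at 0 (e.g. a sliced list array).
        while matches.next_if(|&idx| idx < start).is_some() {}

        // The row matches if the next set bit lies inside [start, end).
        let found = matches.peek().map_or(false, |&idx| idx < end);
        out.push(found);

        // Consume the rest of this row's matches so the next row starts
        // from the first set bit at or after `end`.
        while matches.next_if(|&idx| idx < end).is_some() {}
    }
    out
}
```

Each set bit and each offset is visited once, so the per-row cost is a couple of comparisons rather than constructing a slice and counting its bits. For instance, with offsets `[0, 3, 3, 6]` (three rows, the middle one empty) and set bits at positions 0, 2, and 5, the first and third rows match and the empty row does not.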

This can be substantially faster, especially for short arrays. For example, for 10-element int64 arrays, it is 3-5x faster than the previous approach, and 10-element string arrays are 1.6-4.8x faster. The improvement is smaller but non-zero for larger arrays (e.g., ~1.2x faster for 500-element arrays).

What changes are included in this PR?

In addition to the optimization, this commit adjusts the array_has benchmark code to actually benchmark array_has evaluation (!). The previous benchmark just constructed an Expr.

Are these changes tested?

Yes. Passes existing tests. Performance validated via several benchmark runs.

Are there any user-facing changes?

No.


neilconway · 2026-02-16T00:29:56Z

Benchmarks:

  group                                       vanilla                                opt
  -----                                       ----                                   ------
  array_has_all/all_found_small_needle/10     1.00      4.6±0.23ms        ? ?/sec    1.00      4.6±0.04ms        ? ?/sec
  array_has_all/all_found_small_needle/100    1.00     11.2±0.12ms        ? ?/sec    1.01     11.4±0.09ms        ? ?/sec
  array_has_all/all_found_small_needle/500    1.01     46.2±0.58ms        ? ?/sec    1.00     45.8±1.09ms        ? ?/sec
  array_has_all/not_all_found/10              1.00      4.3±0.04ms        ? ?/sec    1.00      4.3±0.05ms        ? ?/sec
  array_has_all/not_all_found/100             1.00     10.3±0.20ms        ? ?/sec    1.02     10.5±0.06ms        ? ?/sec
  array_has_all/not_all_found/500             1.01     41.4±0.49ms        ? ?/sec    1.00     41.0±0.89ms        ? ?/sec
  array_has_all_strings/all_found/10          1.07      4.0±0.07ms        ? ?/sec    1.00      3.8±0.03ms        ? ?/sec
  array_has_all_strings/all_found/100         1.00     11.7±0.21ms        ? ?/sec    1.01     11.8±0.10ms        ? ?/sec
  array_has_all_strings/all_found/500         1.02     48.5±1.75ms        ? ?/sec    1.00     47.7±2.52ms        ? ?/sec
  array_has_all_strings/not_all_found/10      1.00      2.7±0.04ms        ? ?/sec    1.02      2.8±0.04ms        ? ?/sec
  array_has_all_strings/not_all_found/100     1.03     10.5±0.26ms        ? ?/sec    1.00     10.2±0.12ms        ? ?/sec
  array_has_all_strings/not_all_found/500     1.00     57.8±0.96ms        ? ?/sec    1.00     57.6±0.81ms        ? ?/sec
  array_has_any/no_match/10                   1.07      5.4±0.13ms        ? ?/sec    1.00      5.0±0.22ms        ? ?/sec
  array_has_any/no_match/100                  1.00     17.6±0.45ms        ? ?/sec    1.02     18.1±0.21ms        ? ?/sec
  array_has_any/no_match/500                  1.00     78.4±1.43ms        ? ?/sec    1.03     80.7±0.62ms        ? ?/sec
  array_has_any/some_match/10                 1.01      4.6±0.05ms        ? ?/sec    1.00      4.5±0.09ms        ? ?/sec
  array_has_any/some_match/100                1.00     10.9±0.10ms        ? ?/sec    1.03     11.2±0.15ms        ? ?/sec
  array_has_any/some_match/500                1.10     47.9±0.64ms        ? ?/sec    1.00     43.6±0.61ms        ? ?/sec
  array_has_any_strings/no_match/10           1.00      3.6±0.05ms        ? ?/sec    1.02      3.7±0.07ms        ? ?/sec
  array_has_any_strings/no_match/100          1.00     17.5±0.22ms        ? ?/sec    1.00     17.5±0.28ms        ? ?/sec
  array_has_any_strings/no_match/500          1.03    112.5±1.99ms        ? ?/sec    1.00    109.6±1.89ms        ? ?/sec
  array_has_any_strings/some_match/10         1.00      3.3±0.04ms        ? ?/sec    1.13      3.7±0.08ms        ? ?/sec
  array_has_any_strings/some_match/100        1.00     10.4±0.16ms        ? ?/sec    1.04     10.9±0.13ms        ? ?/sec
  array_has_any_strings/some_match/500        1.00     42.6±1.31ms        ? ?/sec    1.00     42.5±1.06ms        ? ?/sec
  array_has_i64/found/10                      3.14    516.1±8.76µs        ? ?/sec    1.00    164.1±4.76µs        ? ?/sec
  array_has_i64/found/100                     1.57  1043.2±25.75µs        ? ?/sec    1.00   666.3±15.72µs        ? ?/sec
  array_has_i64/found/500                     1.19      3.7±0.05ms        ? ?/sec    1.00      3.1±0.18ms        ? ?/sec
  array_has_i64/not_found/10                  5.27    514.7±4.70µs        ? ?/sec    1.00     97.7±3.40µs        ? ?/sec
  array_has_i64/not_found/100                 1.85  1035.2±11.34µs        ? ?/sec    1.00   559.5±17.33µs        ? ?/sec
  array_has_i64/not_found/500                 1.22      3.7±0.10ms        ? ?/sec    1.00      3.0±0.09ms        ? ?/sec
  array_has_strings/found/10                  1.61   996.1±13.42µs        ? ?/sec    1.00    618.1±6.67µs        ? ?/sec
  array_has_strings/found/100                 1.18      2.5±0.03ms        ? ?/sec    1.00      2.1±0.10ms        ? ?/sec
  array_has_strings/found/500                 1.13     10.3±0.82ms        ? ?/sec    1.00      9.1±0.80ms        ? ?/sec
  array_has_strings/not_found/10              4.82   550.1±33.51µs        ? ?/sec    1.00    114.2±3.77µs        ? ?/sec
  array_has_strings/not_found/100             1.15      5.3±0.06ms        ? ?/sec    1.00      4.6±0.13ms        ? ?/sec
  array_has_strings/not_found/500             1.05     14.1±0.22ms        ? ?/sec    1.00     13.4±0.43ms        ? ?/sec

neilconway added 2 commits February 15, 2026 15:56

Revise benchmark for array_has()

26e64ab

The previous implementation tested the cost of building an array_has() `Expr` (!), not actually evaluating the array_has() operation itself. Refactor things along the way.

Optimize array_has()

378cbef

getChan approved these changes Feb 16, 2026


claude bot mentioned this pull request Feb 18, 2026

20374: perf: Optimize array_has() for scalar needle martin-augment/datafusion#247

Open

neilconway mentioned this pull request Feb 18, 2026

perf: Optimize array_has_any() with scalar arg #20385

Open

neilconway wants to merge 2 commits into apache:main from neilconway:neilc/optimize-array-has