
perf: Optimize initcap()#20352

Open
neilconway wants to merge 3 commits into apache:main from neilconway:neilc/optimize-initcap

Conversation

@neilconway
Contributor

@neilconway neilconway commented Feb 13, 2026

Which issue does this PR close?

Rationale for this change

When all values in a Utf8/LargeUtf8 array are ASCII, we can skip using GenericStringBuilder and instead process the entire input buffer in a single pass using byte-level operations. This also avoids recomputing the offsets and nulls arrays. A similar optimization is already used for lower() and upper().
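The single-pass idea can be sketched without any Arrow types. The following is a minimal illustration, not the code in this PR: `StringColumn` is a hypothetical stand-in for a Utf8 array (one contiguous byte buffer plus offsets), and `initcap_ascii_column` is an assumed name. It checks once that the referenced bytes are ASCII, then walks the buffer string by string, resetting the word state at every offset boundary. The offsets are only rebased so that a sliced input (first offset not zero) indexes correctly into the new buffer; string lengths are never recomputed.

```rust
/// Hypothetical, simplified model of a Utf8 array: a contiguous value
/// buffer plus `n + 1` offsets. This is NOT the arrow-rs API.
struct StringColumn {
    values: Vec<u8>,
    offsets: Vec<i32>,
}

impl StringColumn {
    /// Returns the i-th string of the column.
    fn get(&self, i: usize) -> &str {
        let (start, end) = (self.offsets[i] as usize, self.offsets[i + 1] as usize);
        std::str::from_utf8(&self.values[start..end]).expect("valid UTF-8")
    }
}

/// Byte-level initcap over the whole column. Returns `None` when the
/// referenced bytes are not all ASCII, so a caller can fall back to a
/// slower per-string path. Assumes `offsets` is non-empty.
fn initcap_ascii_column(col: &StringColumn) -> Option<StringColumn> {
    let first = col.offsets[0] as usize;
    let last = col.offsets[col.offsets.len() - 1] as usize;
    let used = &col.values[first..last];
    if !used.is_ascii() {
        return None;
    }

    let mut out = Vec::with_capacity(used.len());
    for w in col.offsets.windows(2) {
        // Word state must reset at each string boundary; otherwise the
        // first letter of a string could be lowercased by its neighbor.
        let mut prev_is_alnum = false;
        for &b in &col.values[w[0] as usize..w[1] as usize] {
            out.push(if prev_is_alnum {
                b.to_ascii_lowercase()
            } else {
                b.to_ascii_uppercase()
            });
            prev_is_alnum = b.is_ascii_alphanumeric();
        }
    }

    // Rebase offsets to the new buffer (a no-op for unsliced input).
    let offsets = col.offsets.iter().map(|o| o - first as i32).collect();
    Some(StringColumn { values: out, offsets })
}
```

Because ASCII case conversion never changes a string's byte length, the output buffer has exactly the same layout as the referenced input range, which is what makes reusing the offsets possible.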

Along the way, optimize initcap_string() for ASCII-only inputs. It already had an ASCII-only fast path, but that path iterated over characters; iterating over bytes instead is faster.
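As a rough illustration of the bytes-versus-characters distinction, here is a minimal sketch of a per-string initcap with an ASCII fast path. The function name `initcap_sketch` and its exact structure are assumptions for illustration; the real initcap_string() in this PR may differ. The sketch assumes the usual initcap rule: a character that follows an alphanumeric character is lowercased, and any other character is uppercased.

```rust
/// Illustrative initcap: uppercase the first character of each
/// alphanumeric run and lowercase the rest. Hypothetical helper,
/// not the function from the PR.
fn initcap_sketch(input: &str) -> String {
    if input.is_ascii() {
        // Fast path: every byte is one character, so we can work on
        // bytes directly and skip UTF-8 decoding entirely.
        let mut out = Vec::with_capacity(input.len());
        let mut prev_is_alnum = false;
        for &b in input.as_bytes() {
            out.push(if prev_is_alnum {
                b.to_ascii_lowercase()
            } else {
                b.to_ascii_uppercase()
            });
            prev_is_alnum = b.is_ascii_alphanumeric();
        }
        // ASCII bytes are always valid UTF-8.
        String::from_utf8(out).expect("ASCII is valid UTF-8")
    } else {
        // Slow path: decode characters; case mapping may yield more
        // than one char, so extend from the mapping iterators.
        let mut out = String::with_capacity(input.len());
        let mut prev_is_alnum = false;
        for c in input.chars() {
            if prev_is_alnum {
                out.extend(c.to_lowercase());
            } else {
                out.extend(c.to_uppercase());
            }
            prev_is_alnum = c.is_alphanumeric();
        }
        out
    }
}
```

The fast path pays for one upfront `is_ascii()` scan, after which each byte needs only a branch and a simple case conversion.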

What changes are included in this PR?

  • Clean up benchmarks: the scalar benchmark was previously run for several array sizes, even though its result does not depend on array size
  • Add benchmark for different string lengths
  • Add benchmark for Unicode array input
  • Optimize for ASCII-only inputs as described above
  • Add test case for ASCII-only input that is a sliced array

Are these changes tested?

Yes, by the existing tests plus one additional test.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the functions Changes to functions implementation label Feb 13, 2026
@neilconway
Contributor Author

Benchmarks:

$ critcmp initcap-v2-vanilla initcap-v2-opt
group                                           initcap-v2-opt                         initcap-v2-vanilla
-----                                           --------------                         ------------------
initcap scalar/scalar_utf8                      1.00     99.3±0.82ns        ? ?/sec    1.53    151.7±0.87ns        ? ?/sec
initcap scalar/scalar_utf8view                  1.00     93.2±2.04ns        ? ?/sec    1.74    162.3±2.52ns        ? ?/sec
initcap size=1024 str_len=128/array_utf8        1.00     96.6±0.16µs        ? ?/sec    1.42    137.3±0.69µs        ? ?/sec
initcap size=1024 str_len=128/array_utf8view    1.00     73.2±0.05µs        ? ?/sec    1.81    132.3±0.24µs        ? ?/sec
initcap size=1024 str_len=16/array_utf8         1.00     13.6±0.02µs        ? ?/sec    1.55     21.0±0.06µs        ? ?/sec
initcap size=1024 str_len=16/array_utf8view     1.00     16.9±0.05µs        ? ?/sec    1.32     22.3±0.05µs        ? ?/sec
initcap size=4096 str_len=128/array_utf8        1.00    384.7±0.50µs        ? ?/sec    1.42    547.1±3.02µs        ? ?/sec
initcap size=4096 str_len=128/array_utf8view    1.00    291.6±0.28µs        ? ?/sec    1.82    529.4±3.35µs        ? ?/sec
initcap size=4096 str_len=16/array_utf8         1.00     52.9±0.07µs        ? ?/sec    1.56     82.5±0.14µs        ? ?/sec
initcap size=4096 str_len=16/array_utf8view     1.00     65.7±0.12µs        ? ?/sec    1.36     89.4±0.48µs        ? ?/sec
initcap size=8192 str_len=128/array_utf8        1.00    770.2±1.22µs        ? ?/sec    1.42   1096.4±2.68µs        ? ?/sec
initcap size=8192 str_len=128/array_utf8view    1.00    580.4±0.60µs        ? ?/sec    1.82   1059.1±5.74µs        ? ?/sec
initcap size=8192 str_len=16/array_utf8         1.00    105.4±0.38µs        ? ?/sec    1.58    166.3±0.41µs        ? ?/sec
initcap size=8192 str_len=16/array_utf8view     1.00    130.2±0.54µs        ? ?/sec    1.38    179.5±1.16µs        ? ?/sec

@neilconway neilconway force-pushed the neilc/optimize-initcap branch from 3c0885c to 2daa41d Compare February 15, 2026 12:25
We previously ran the `scalar` benchmarks for different array sizes,
despite the scalar benchmark code being invariant to array size.

Add a variant where we run `initcap` for different string lengths.
@neilconway neilconway force-pushed the neilc/optimize-initcap branch from 2daa41d to 5ccd135 Compare February 17, 2026 18:18
We already had an ASCII-only branch; we might as well take advantage of
it to iterate over bytes directly, which significantly increases
performance.
@neilconway neilconway force-pushed the neilc/optimize-initcap branch from 5ccd135 to 0e07b48 Compare February 17, 2026 18:31


Development

Successfully merging this pull request may close these issues.

Optimize initcap()
