
@tobixdev
Contributor

@tobixdev tobixdev commented Oct 3, 2025

Which issue does this PR close?

Rationale for this change

Fix the bug and align BooleanArray::from_iter to PrimitiveArray::from_iter

In BooleanArray::from_iter:
Collecting to a Vec and then using from_trusted_len_iter was almost twice as fast as using BooleanBufferBuilder on my machine.

What changes are included in this PR?

  • Use builders in BooleanArray::from_iter to fix the wrong behavior
  • Introduce BooleanArray::from_trusted_len_iter as a more performant version (the old version of BooleanArray::from_iter, but using the unsafe bit_util::set_bit_raw)
  • Add BooleanAdapter, inspired by NativeAdapter from PrimitiveArray. This also allows BooleanArray::from_iter([true, false].into_iter()).
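
To illustrate the adapter idea from the last bullet, here is a minimal std-only sketch. It is not the arrow-rs code: `BoolAdapter` and `collect_nullable` are hypothetical names, and a plain `Vec<Option<bool>>` stands in for the array. It only shows how a pair of `From` impls lets one generic function accept both `bool` and `Option<bool>` items.

```rust
/// Hypothetical stand-in for the adapter idea; not the arrow-rs type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BoolAdapter {
    pub value: Option<bool>,
}

impl From<bool> for BoolAdapter {
    fn from(v: bool) -> Self {
        // A plain bool is always a valid (non-null) value.
        BoolAdapter { value: Some(v) }
    }
}

impl From<Option<bool>> for BoolAdapter {
    fn from(v: Option<bool>) -> Self {
        // None is carried through as a null slot.
        BoolAdapter { value: v }
    }
}

/// Accepts any iterator whose items convert into the adapter.
/// A Vec<Option<bool>> stands in for the real array type here.
pub fn collect_nullable<I, T>(iter: I) -> Vec<Option<bool>>
where
    I: IntoIterator<Item = T>,
    T: Into<BoolAdapter>,
{
    iter.into_iter().map(|item| item.into().value).collect()
}
```

With this shape, `collect_nullable(vec![true, false])` and `collect_nullable(vec![Some(true), None])` both compile against the same function, which is the convenience the bullet describes.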

Are these changes tested?

  • New test to cover the initial bug
  • New test to cover BooleanArray::from_trusted_len_iter directly (the tests for the old BooleanArray::from_iter also cover it indirectly)
  • New test to document that you can directly collect [false, true, ...] (no Option)

Are there any user-facing changes?

  • BooleanArray::from_iter has a "slight" performance regression that users could observe.
  • Allow directly collecting bools to a BooleanArray
  • BooleanArray::from_trusted_len_iter

…ed len iterators

Use BooleanBuilder in FromIterator

Add BooleanAdapter

Add BooleanArray::from_trusted_len_iter
@github-actions github-actions bot added the arrow Changes to the arrow crate label Oct 3, 2025
@tobixdev
Contributor Author

tobixdev commented Oct 3, 2025

@alamb Could you run the array_from benchmarks? Thanks!

Member

@mbrobbel mbrobbel left a comment

On my M1 Pro:

BooleanArray::from_iter
[26.169 µs 26.194 µs 26.222 µs]

BooleanArray::from_trusted_len_iter
[9.6375 µs 9.6519 µs 9.6659 µs]

@tobixdev
Contributor Author

tobixdev commented Oct 3, 2025

On my M1 Pro:

BooleanArray::from_iter
[26.169 µs 26.194 µs 26.222 µs]

BooleanArray::from_trusted_len_iter
[9.6375 µs 9.6519 µs 9.6659 µs]

@mbrobbel Thanks for the review and for running the benchmarks! Interesting how the performance differs on an ARM chip. Here are my numbers (Ryzen 3900X).

BooleanArray::from_iter time:   [16.047 µs 16.097 µs 16.154 µs]
BooleanArray::from_trusted_len_iter:   [13.108 µs 13.140 µs 13.176 µs]

@jhorstmann
Contributor

I think the benchmark results for from_iter might be overly optimistic. The input in the benchmark seems to already be in a Vec<Option<bool>>, so the Rust compiler would be able to optimize the intermediate vector allocation away. The performance for an arbitrary iterator might be worse. An implementation using the existing BooleanBuilder might be preferable for the generic case, and users with a trusted iterator implementation should be directed to the new unsafe implementation.

impl<Ptr: std::borrow::Borrow<Option<bool>>> FromIterator<Ptr> for BooleanArray {
    fn from_iter<I: IntoIterator<Item = Ptr>>(iter: I) -> Self {
        let iter = iter.into_iter();
        // Pre-allocate only when the size hint is exact (lower == upper);
        // otherwise start empty and let the builder grow as needed.
        let capacity = match iter.size_hint() {
            (lower, Some(upper)) if lower == upper => lower,
            _ => 0,
        };
        let mut builder = BooleanBuilder::with_capacity(capacity);
        // Items may be owned or borrowed, so copy the Option<bool> out.
        builder.extend(iter.map(|item| *item.borrow()));
        builder.finish()
    }
}
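
The capacity choice in the snippet above can be tried in isolation. The following std-only sketch (with a hypothetical helper name, `exact_capacity`) applies the same `match` to an arbitrary iterator. It shows why the hint is only used when both bounds agree: adapters such as `filter` report a lower bound of 0 because they cannot know how many items will pass.

```rust
/// Returns a capacity only when the iterator reports an exact length,
/// i.e. when the lower and upper bounds of `size_hint` agree.
pub fn exact_capacity<I: Iterator>(iter: &I) -> usize {
    match iter.size_hint() {
        (lower, Some(upper)) if lower == upper => lower,
        // Inexact or unbounded hint: do not pre-allocate.
        _ => 0,
    }
}
```

For a slice iterator over three items this returns 3. For the same iterator behind `.filter(..)` the hint becomes `(0, Some(3))`, so the function returns 0 and any consumer has to rely on the items actually yielded rather than on the hint.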

This is probably much slower than the current incorrect implementation, but any optimizations to BooleanBuilder would then also apply to this implementation.

@tobixdev tobixdev force-pushed the 8505-fix-from-iter branch 2 times, most recently from 39af718 to fb8626a Compare October 3, 2025 11:47
@tobixdev tobixdev force-pushed the 8505-fix-from-iter branch from fb8626a to f7bd3c5 Compare October 3, 2025 11:51
@tobixdev
Contributor Author

tobixdev commented Oct 3, 2025

@jhorstmann Thanks for your input. You're right, that was a flaw in my benchmarks. I've adapted them to use a Box<dyn Iterator>, so that the "iterator type information" is hidden from the compiler (at least I think so). I think this also better captures the use case.
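
As a rough illustration of the boxing approach, the sketch below erases the concrete iterator type behind a trait object. The function name `opaque` is hypothetical and this is not the benchmark code from the PR. The assumption is that, with the static type gone, the compiler can no longer pick optimizations that depend on knowing the source is a Vec or slice. Note that `size_hint` is still forwarded through the box at run time.

```rust
/// Erases the concrete iterator type behind a trait object, so callers
/// only see `dyn Iterator` and cannot rely on the underlying slice type.
pub fn opaque<'a>(
    data: &'a [Option<bool>],
) -> Box<dyn Iterator<Item = Option<bool>> + 'a> {
    Box::new(data.iter().copied())
}
```

A benchmark body would then consume `opaque(&data)` instead of `data.iter()`, paying for dynamic dispatch on each `next()` call.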

I really like the idea of re-using BooleanBuilder. However, collecting to a Vec and then using from_trusted_len_iter is more efficient on my machine at least (at the cost of allocating a Vec). What if we move this into the Extend implementation of the BooleanBuilder?
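
The "collect to a Vec first" strategy can be sketched without arrow-rs. The function below is a safe, std-only approximation with a hypothetical name (`pack_via_vec`). The real from_trusted_len_iter is unsafe and writes bits through raw pointers, which this sketch does not attempt. It only shows the shape of the idea: materialize the items to learn the true length, then fill a value bitmap and a validity bitmap in one pass, least-significant bit first as in Arrow bitmaps.

```rust
/// Packs nullable booleans into (values, validity, len) bitmaps,
/// least-significant bit first within each byte.
pub fn pack_via_vec<I>(iter: I) -> (Vec<u8>, Vec<u8>, usize)
where
    I: IntoIterator<Item = Option<bool>>,
{
    // Collecting first gives a length we can rely on,
    // whatever the iterator's size_hint claimed.
    let items: Vec<Option<bool>> = iter.into_iter().collect();
    let len = items.len();
    let num_bytes = (len + 7) / 8;

    let mut values = vec![0u8; num_bytes];
    let mut validity = vec![0u8; num_bytes];

    for (i, item) in items.iter().enumerate() {
        if let Some(v) = item {
            // Mark the slot as non-null.
            validity[i / 8] |= 1 << (i % 8);
            if *v {
                // Set the value bit only for `true`.
                values[i / 8] |= 1 << (i % 8);
            }
        }
    }
    (values, validity, len)
}
```

The cost is the temporary Vec allocation, which is the trade-off discussed in this comment.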

I've adapted the implementation so you can take a look.

@tobixdev
Contributor Author

tobixdev commented Oct 3, 2025

With the new Extend:

BooleanArray::from_iter:   [46.030 µs 46.107 µs 46.196 µs]
BooleanArray::from_trusted_len_iter:   [39.073 µs 39.508 µs 39.973 µs]

With the old Extend:

BooleanArray::from_iter:   [76.143 µs 76.189 µs 76.247 µs]
BooleanArray::from_trusted_len_iter:   [40.139 µs 40.358 µs 40.620 µs]

@alamb
Contributor

alamb commented Oct 3, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1016-gcp #17~24.04.1-Ubuntu SMP Wed Sep 3 01:55:36 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing 8505-fix-from-iter (9ba0eeb) to f88921c diff
BENCH_NAME=array_from
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench array_from
BENCH_FILTER=
BENCH_BRANCH_NAME=8505-fix-from-iter
Results will be posted here when complete

@alamb
Contributor

alamb commented Oct 3, 2025

🤖: Benchmark completed

Details

group                                  8505-fix-from-iter                     main
-----                                  ------------------                     ----
BooleanArray::from_iter                3.01     59.1±0.51µs        ? ?/sec    1.00     19.6±0.08µs        ? ?/sec
BooleanArray::from_trusted_len_iter    1.00     40.0±0.12µs        ? ?/sec  
Int64Array::from_iter                  2.29     91.4±0.80µs        ? ?/sec    1.00     39.9±0.08µs        ? ?/sec
Int64Array::from_trusted_len_iter      2.77     52.4±0.46µs        ? ?/sec    1.00     18.9±0.03µs        ? ?/sec
array_from_vec 128                     1.00    156.7±0.26ns        ? ?/sec    1.01    158.0±0.35ns        ? ?/sec
array_from_vec 256                     1.00    165.5±0.33ns        ? ?/sec    1.00    165.9±0.33ns        ? ?/sec
array_from_vec 512                     1.00    223.0±0.53ns        ? ?/sec    1.00    223.6±0.86ns        ? ?/sec
array_string_from_vec 128              1.00   1249.3±1.66ns        ? ?/sec    1.00   1254.3±4.35ns        ? ?/sec
array_string_from_vec 256              1.00  1985.8±21.82ns        ? ?/sec    1.03      2.0±0.00µs        ? ?/sec
array_string_from_vec 512              1.00      3.5±0.03µs        ? ?/sec    1.01      3.5±0.00µs        ? ?/sec
decimal128_array_from_vec 32768        1.00     98.9±1.26µs        ? ?/sec    1.00     99.3±0.26µs        ? ?/sec
decimal256_array_from_vec 32768        1.00      3.8±0.03µs        ? ?/sec    1.02      3.9±0.02µs        ? ?/sec
decimal32_array_from_vec 32768         1.01     86.0±0.23µs        ? ?/sec    1.00     85.5±0.15µs        ? ?/sec
decimal64_array_from_vec 32768         1.00     90.8±0.40µs        ? ?/sec    1.00     90.9±0.15µs        ? ?/sec
struct_array_from_vec 1024             1.00      8.5±0.03µs        ? ?/sec    1.05      8.9±0.02µs        ? ?/sec
struct_array_from_vec 128              1.00   1836.0±8.72ns        ? ?/sec    1.04   1914.3±5.43ns        ? ?/sec
struct_array_from_vec 256              1.00      2.7±0.03µs        ? ?/sec    1.08      2.9±0.01µs        ? ?/sec
struct_array_from_vec 512              1.00      4.5±0.05µs        ? ?/sec    1.08      4.9±0.01µs        ? ?/sec

@alamb
Contributor

alamb commented Oct 3, 2025

🤔 The only one that looks suspicious is Int64Array::from_trusted_len_iter getting significantly slower

I will rerun to double check

Int64Array::from_trusted_len_iter      2.77     52.4±0.46µs        ? ?/sec    1.00     18.9±0.03µs        ? ?/sec

@alamb
Contributor

alamb commented Oct 3, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1016-gcp #17~24.04.1-Ubuntu SMP Wed Sep 3 01:55:36 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing 8505-fix-from-iter (9ba0eeb) to f88921c diff
BENCH_NAME=array_from
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench array_from
BENCH_FILTER=
BENCH_BRANCH_NAME=8505-fix-from-iter
Results will be posted here when complete

@alamb
Contributor

alamb commented Oct 3, 2025

🤖: Benchmark completed

Details

group                                  8505-fix-from-iter                     main
-----                                  ------------------                     ----
BooleanArray::from_iter                3.01     59.1±0.38µs        ? ?/sec    1.00     19.6±0.03µs        ? ?/sec
BooleanArray::from_trusted_len_iter    1.00     40.0±0.03µs        ? ?/sec  
Int64Array::from_iter                  2.29     91.4±0.21µs        ? ?/sec    1.00     40.0±0.09µs        ? ?/sec
Int64Array::from_trusted_len_iter      2.78     52.5±0.34µs        ? ?/sec    1.00     18.9±0.04µs        ? ?/sec
array_from_vec 128                     1.00    162.1±0.51ns        ? ?/sec    1.06    171.7±0.28ns        ? ?/sec
array_from_vec 256                     1.00    169.9±0.39ns        ? ?/sec    1.05    179.2±0.16ns        ? ?/sec
array_from_vec 512                     1.00    228.2±0.57ns        ? ?/sec    1.04    237.8±0.28ns        ? ?/sec
array_string_from_vec 128              1.00   1256.8±3.36ns        ? ?/sec    1.00   1252.9±1.49ns        ? ?/sec
array_string_from_vec 256              1.00   1985.9±4.60ns        ? ?/sec    1.03      2.0±0.00µs        ? ?/sec
array_string_from_vec 512              1.00      3.5±0.01µs        ? ?/sec    1.01      3.5±0.00µs        ? ?/sec
decimal128_array_from_vec 32768        1.00     98.7±0.58µs        ? ?/sec    1.01    100.0±0.23µs        ? ?/sec
decimal256_array_from_vec 32768        1.00      3.9±0.01µs        ? ?/sec    1.01      3.9±0.01µs        ? ?/sec
decimal32_array_from_vec 32768         1.00     85.9±0.60µs        ? ?/sec    1.00     85.5±0.12µs        ? ?/sec
decimal64_array_from_vec 32768         1.00     90.4±0.19µs        ? ?/sec    1.00     90.7±0.27µs        ? ?/sec
struct_array_from_vec 1024             1.00      8.6±0.02µs        ? ?/sec    1.03      8.9±0.01µs        ? ?/sec
struct_array_from_vec 128              1.00   1862.6±2.69ns        ? ?/sec    1.02   1905.5±5.29ns        ? ?/sec
struct_array_from_vec 256              1.00      2.8±0.02µs        ? ?/sec    1.05      2.9±0.00µs        ? ?/sec
struct_array_from_vec 512              1.00      4.6±0.01µs        ? ?/sec    1.05      4.8±0.01µs        ? ?/sec

@alamb
Contributor

alamb commented Oct 3, 2025

🤔

Int64Array::from_trusted_len_iter      2.78     52.5±0.34µs        ? ?/sec    1.00     18.9±0.04µs        ? ?/sec

@alamb
Contributor

alamb commented Oct 6, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1016-gcp #17~24.04.1-Ubuntu SMP Wed Sep 3 01:55:36 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing 8505-fix-from-iter (16c4059) to f88921c diff
BENCH_NAME=array_from
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench array_from
BENCH_FILTER=
BENCH_BRANCH_NAME=8505-fix-from-iter
Results will be posted here when complete

@alamb
Contributor

alamb commented Oct 6, 2025

🤖: Benchmark completed

Details

group                                  8505-fix-from-iter                     main
-----                                  ------------------                     ----
BooleanArray::from_iter                1.22     24.0±0.04µs        ? ?/sec    1.00     19.6±0.02µs        ? ?/sec
BooleanArray::from_trusted_len_iter    1.00     16.9±0.03µs        ? ?/sec  
Int64Array::from_iter                  1.00     39.8±0.08µs        ? ?/sec    1.00     40.0±0.14µs        ? ?/sec
Int64Array::from_trusted_len_iter      1.01     19.0±0.05µs        ? ?/sec    1.00     18.9±0.05µs        ? ?/sec
array_from_vec 128                     1.00    156.7±0.15ns        ? ?/sec    1.02    159.3±0.19ns        ? ?/sec
array_from_vec 256                     1.00    165.6±0.82ns        ? ?/sec    1.01    166.9±0.39ns        ? ?/sec
array_from_vec 512                     1.00    222.4±0.53ns        ? ?/sec    1.01    224.7±0.30ns        ? ?/sec
array_string_from_vec 128              1.00   1033.8±1.72ns        ? ?/sec    1.21   1249.9±4.58ns        ? ?/sec
array_string_from_vec 256              1.00   1766.5±5.77ns        ? ?/sec    1.15      2.0±0.00µs        ? ?/sec
array_string_from_vec 512              1.00      3.1±0.01µs        ? ?/sec    1.14      3.5±0.02µs        ? ?/sec
decimal128_array_from_vec 32768        1.00     99.7±0.31µs        ? ?/sec    1.00     99.4±0.44µs        ? ?/sec
decimal256_array_from_vec 32768        1.03      4.0±0.02µs        ? ?/sec    1.00      3.8±0.02µs        ? ?/sec
decimal32_array_from_vec 32768         1.00     85.7±0.38µs        ? ?/sec    1.00     85.7±0.17µs        ? ?/sec
decimal64_array_from_vec 32768         1.00     90.6±0.23µs        ? ?/sec    1.00     90.7±0.22µs        ? ?/sec
struct_array_from_vec 1024             1.00      8.2±0.01µs        ? ?/sec    1.08      8.9±0.12µs        ? ?/sec
struct_array_from_vec 128              1.00   1826.5±4.18ns        ? ?/sec    1.03   1887.2±3.92ns        ? ?/sec
struct_array_from_vec 256              1.00      2.8±0.00µs        ? ?/sec    1.05      2.9±0.01µs        ? ?/sec
struct_array_from_vec 512              1.00      4.6±0.01µs        ? ?/sec    1.06      4.9±0.01µs        ? ?/sec

@tobixdev
Contributor Author

tobixdev commented Oct 6, 2025

🤖: Benchmark completed
Details

group                                  8505-fix-from-iter                     main
-----                                  ------------------                     ----
BooleanArray::from_iter                1.22     24.0±0.04µs        ? ?/sec    1.00     19.6±0.02µs        ? ?/sec
BooleanArray::from_trusted_len_iter    1.00     16.9±0.03µs        ? ?/sec  
Int64Array::from_iter                  1.00     39.8±0.08µs        ? ?/sec    1.00     40.0±0.14µs        ? ?/sec
Int64Array::from_trusted_len_iter      1.01     19.0±0.05µs        ? ?/sec    1.00     18.9±0.05µs        ? ?/sec
[...]

Thanks! I think this is in line with what I measured locally. BooleanArray::from_iter is a bit slower (24.0 µs vs 19.6 µs), but the new BooleanArray::from_trusted_len_iter is faster than the old implementation (here 16.9 µs vs 19.6 µs) due to the use of unsafe.

@alamb
Contributor

alamb commented Oct 6, 2025

🤖: Benchmark completed
Details

group                                  8505-fix-from-iter                     main
-----                                  ------------------                     ----
BooleanArray::from_iter                1.22     24.0±0.04µs        ? ?/sec    1.00     19.6±0.02µs        ? ?/sec
BooleanArray::from_trusted_len_iter    1.00     16.9±0.03µs        ? ?/sec  
Int64Array::from_iter                  1.00     39.8±0.08µs        ? ?/sec    1.00     40.0±0.14µs        ? ?/sec
Int64Array::from_trusted_len_iter      1.01     19.0±0.05µs        ? ?/sec    1.00     18.9±0.05µs        ? ?/sec
[...]

Thanks! I think this is in line with what I measured locally. BooleanArray::from_iter is a bit slower, but the new BooleanArray::from_trusted_len_iter is faster than the old implementation (here 16.9 µs vs 19.6 µs) due to the use of unsafe.

Yes, I agree with your analysis -- and I think it is ok to take a hit of performance to make it correct 🤣

Contributor

@alamb alamb left a comment

Thank you @tobixdev -- this code looks very nice and consistent with PrimitiveArray to me

@jhorstmann and @mbrobbel perhaps you can give this PR a final review before we merge as well to make sure we haven't missed anything

@jhorstmann
Contributor

Looks good. This clearly fixes the incorrect behavior, and adding from_trusted_len_iter makes this consistent with the primitive arrays. I would prefer if we did not need the intermediate vec allocation, but will write my thoughts on improvements to BooleanBuilder in a separate issue.

@tobixdev
Contributor Author

tobixdev commented Oct 7, 2025

Looks good. This clearly fixes the incorrect behavior, and adding from_trusted_len_iter makes this consistent with the primitive arrays. I would prefer if we did not need the intermediate vec allocation, but will write my thoughts on improvements to BooleanBuilder in a separate issue.

Yes, I agree that it would be great to get similar / better performance without any heap allocations. Thanks for creating the issue!

@tobixdev
Contributor Author

tobixdev commented Oct 7, 2025

I think all comments are resolved and some of them will be tackled in the follow-up PR / issue. Thanks for your input!

@mbrobbel mbrobbel merged commit ba22a21 into apache:main Oct 7, 2025
26 checks passed

Labels

arrow Changes to the arrow crate

Development

Successfully merging this pull request may close these issues.

Incorrect Behavior of Collecting a filtered iterator to a BooleanArray

4 participants