Changes from all commits
Commits
Show all changes
106 commits
6ecd42b
Make filtered coalescing faster for primitive / byte types
Dandandan Dec 4, 2025
a8df36f
Make filtered coalescing faster for primitive types
Dandandan Dec 4, 2025
f20702b
Faster api
Dandandan Dec 4, 2025
124b4e3
Faster api
Dandandan Dec 4, 2025
79bd847
Faster api
Dandandan Dec 4, 2025
b2fc66f
Faster api
Dandandan Dec 4, 2025
0872a9b
Cleanup
Dandandan Dec 4, 2025
b7b3f18
Fix?
Dandandan Dec 4, 2025
7758889
optimize
Dandandan Dec 4, 2025
dcf4864
perf
Dandandan Dec 5, 2025
87626d1
comment
Dandandan Dec 5, 2025
7c46a72
Increase filter threshold
Dandandan Dec 6, 2025
6ee1f04
Adapt comment
Dandandan Dec 6, 2025
c39a455
More speed
Dandandan Dec 6, 2025
dc0c45e
Fmt
Dandandan Dec 6, 2025
d2b5d29
Don't collect
Dandandan Dec 6, 2025
18cf6fc
fix comments
Dandandan Dec 8, 2025
ddd0306
not unsafe
Dandandan Dec 9, 2025
82acfe1
not unsafe
Dandandan Dec 9, 2025
1acccc7
faster null handling
Dandandan Dec 9, 2025
bb025cf
not unsafe
Dandandan Dec 9, 2025
ca19422
Update arrow-select/src/filter.rs
Dandandan Dec 13, 2025
f718f2e
Update arrow-select/src/filter.rs
Dandandan Dec 13, 2025
b235243
Move / optimize
Dandandan Dec 15, 2025
ae995ba
Merge branch 'main' into coalesce_batches_filter
Dandandan Jan 8, 2026
e8919b1
docs: Update release schedule in README.md (#9111)
alamb Jan 8, 2026
46484ea
feat: add benchmarks for json parser (#9107)
Weijun-H Jan 8, 2026
3022aa6
chore: switch test from `bincode` to maintained `postcard` crate (RUS…
alamb Jan 8, 2026
4eb65a0
Speed up binary kernels (30% faster `and` and `or`), add `BooleanBuff…
alamb Jan 9, 2026
b2f9e42
[Variant] Optimize the object header generation logic in ObjectBuilde…
klion26 Jan 9, 2026
6bfd685
Update readme for geospatial crate (#9124)
paleolimbot Jan 9, 2026
0f994fa
Remove parquet arrow_cast dependency (#9077)
tustvold Jan 9, 2026
72c356a
Updated arrow-pyarrow to use pyo3 0.27, updated deprecated code warni…
hntd187 Jan 9, 2026
432b760
Fix clippy (#9130)
alamb Jan 10, 2026
a4dee8a
fix: display `0 secs` for empty DayTime/MonthDayNano intervals (#9023)
Jefffrey Jan 10, 2026
9927454
Fix IPC roundtripping dicts nested in ListViews (#9126)
brancz Jan 10, 2026
4a3ce6a
docs(parquet): add example for preserving dictionary encoding (#9116)
AndreaBozzo Jan 10, 2026
c587cf0
Add options to skip decoding `Statistics` and `SizeStatistics` in Par…
etseidl Jan 10, 2026
69dbab2
[arrow] Minimize allocation in GenericViewArray::slice() (#9016)
maxburke Jan 10, 2026
e28c305
bench: added to row_format benchmark conversion of 53 non-nested colu…
rluvaton Jan 10, 2026
077ad74
doc: fix link on FixedSizeListArray doc (#9033)
Jefffrey Jan 10, 2026
f7c430d
Docs: Add additional documentation and example for `make_array` (#9112)
alamb Jan 10, 2026
923c2b2
Change FlightSQLClient to return `FlightError` & cleanup code (#8916)
lewiszlw Jan 10, 2026
6d26fbc
Avoid overallocating arrays in coalesce primitives / views (#9132)
Dandandan Jan 10, 2026
91adb91
perf: optimize hex decoding in json (1.8x faster in binary-heavy) (#9…
Weijun-H Jan 10, 2026
d807503
Add nullif_kernel benchmark (#9089)
alamb Jan 11, 2026
db37aa1
perf: Avoid ArrayData allocation in PrimitiveArray::reinterpret_cast …
alamb Jan 11, 2026
6234ee0
Add `BooleanBufferBuilder::extend_trusted_len` (#9137)
Dandandan Jan 11, 2026
924e1fe
docs: Improve main README.md and highlight community (#9119)
alamb Jan 12, 2026
8485edf
fix: support cast from `Null` to list view/run encoded/union types (#…
Jefffrey Jan 12, 2026
9ee3cf1
docs(variant): fix VariantObject::get documentation to reflect Option…
mohit7705 Jan 12, 2026
684bc9f
Avoid clones in `make_array` for `StructArray` and `GenericByteViewAr…
alamb Jan 12, 2026
b41cd0d
Uncomment part of test_utf8_single_column_reader_test (#9148)
sdf-jkl Jan 13, 2026
66c1dae
Update ASF copyright year in NOTICE (#9145)
mohit7705 Jan 13, 2026
94317f7
fix:[9018]Fixed RunArray slice offsets (#9036)
manishkr Jan 13, 2026
717ed06
arrow-ipc: Add tests for nested dicts for Map and Union arrays (#9146)
brancz Jan 13, 2026
a064327
feat: add `reserve` to `Rows` (#9142)
rluvaton Jan 13, 2026
b8322ce
perf: improve calculating length performance for view byte array in r…
rluvaton Jan 13, 2026
ac1afae
[Parquet] perf: Create `PrimitiveArray`s directly rather than via `Ar…
alamb Jan 13, 2026
bf63ec5
doc: add example of RowFilter usage (#9115)
sonhmai Jan 13, 2026
5d5d1bf
[Parquet] perf: Create StructArrays directly rather than via `ArrayDa…
alamb Jan 13, 2026
9f43539
perf: improve field indexing in JSON StructArrayDecoder (1.7x speed u…
Weijun-H Jan 13, 2026
c3b76dc
Minor: try and avoid an allocation creating `GenericByteViewArray` fr…
alamb Jan 14, 2026
4bdccc2
Avoid a clone when creating `BooleanArray` from ArrayData (#9159)
alamb Jan 14, 2026
90c0a39
Avoid a clone when creating StringArray/BinaryArray from ArrayData (#…
alamb Jan 14, 2026
87fe9c8
perf: improve calculating length performance for `GenericByteArray` i…
rluvaton Jan 14, 2026
a9444d9
perf: improve calculating length performance for nested arrays in row…
rluvaton Jan 14, 2026
7d1223d
lint: remove unused function (fix clippy (#9178)
rluvaton Jan 14, 2026
91fe6f7
Add `TimestampWithOffset` canonical extension type (#8743)
serramatutu Jan 14, 2026
efeeded
feat: add null comparison handling in make_comparator (#9150)
Weijun-H Jan 14, 2026
b850d6f
docs: update examples in ArrowReaderOptions to use in-memory buffers …
AndreaBozzo Jan 14, 2026
2380b11
refactor: streamline date64 tests (#9165)
cht42 Jan 14, 2026
5678415
Improve `ArrowReaderBuilder::with_row_filter` documentation (#9153)
alamb Jan 14, 2026
4951e5b
add `#[inline]` to `BitIterator` `next` function (#9177)
rluvaton Jan 14, 2026
a4a49d1
Merge
Dandandan Jan 15, 2026
e78fa28
fix missing utf8 check for conversion from BinaryViewArray to StringV…
alamb Jan 14, 2026
e2ad508
Add find_nth_set_bit_position (#9151)
Dandandan Jan 15, 2026
414b3b9
Merge
Dandandan Jan 15, 2026
840653b
Merge
Dandandan Jan 15, 2026
a1d4097
Doc fix
Dandandan Jan 15, 2026
9773da7
Merge branch 'main' into coalesce_batches_filter
Dandandan Jan 16, 2026
bd65f64
Implement fallback copy_with_filter
alamb Jan 17, 2026
ec7ef9e
Merge pull request #13 from alamb/alamb/batches_filter_more
Dandandan Jan 17, 2026
e76a5d7
Fix nulls
Dandandan Jan 17, 2026
58190b8
Optimize
Dandandan Jan 17, 2026
be6b796
Use 0.5
Dandandan Jan 18, 2026
f0b98d6
Fix
Dandandan Jan 18, 2026
9aaca1f
Fix
Dandandan Jan 18, 2026
16f5d86
Merge branch 'main' into coalesce_batches_filter
alamb Jan 18, 2026
a27e4ab
Fix clippy
alamb Jan 18, 2026
6de89a3
Merge remote-tracking branch 'apache/main' into coalesce_batches_filter
alamb Jan 18, 2026
2b8711f
Avoid an Arc::clone
alamb Jan 19, 2026
12c526b
WIP byteview
Dandandan Jan 19, 2026
310404c
Merge pull request #16 from alamb/alamb/less_clone
Dandandan Jan 19, 2026
92b1b46
WIP byteview
Dandandan Jan 19, 2026
c9ae743
Revert "WIP byteview"
Dandandan Jan 19, 2026
5ae319f
WIP byteview
Dandandan Jan 19, 2026
bd4a1bc
Style
Dandandan Jan 19, 2026
99772cf
Remove slice for filterpredicate
Dandandan Jan 19, 2026
63443da
WIP
Dandandan Jan 20, 2026
b9efe0c
WIP
Dandandan Jan 20, 2026
cbaaf98
WIP
Dandandan Jan 20, 2026
0dbd786
WIP
Dandandan Jan 20, 2026
0a5a575
WIP
Dandandan Jan 20, 2026
3fbd560
Optimize
Dandandan Jan 20, 2026
d3cf7cb
Merge branch 'main' into coalesce_batches_filter
Dandandan Jan 20, 2026
70 changes: 70 additions & 0 deletions arrow-buffer/src/builder/null.rs
@@ -196,6 +196,33 @@ impl NullBufferBuilder {
}
}

/// Extends this builder with validity values.
///
/// # Safety
/// The caller must ensure that the iterator reports the correct length.
///
/// # Example
/// ```
/// # use arrow_buffer::NullBufferBuilder;
/// let mut builder = NullBufferBuilder::new(8);
/// let validities = [true, false, true, true];
/// unsafe { builder.extend_trusted_len(validities.iter().copied()); }
/// assert_eq!(builder.len(), 4);
/// ```
pub unsafe fn extend_trusted_len<I: Iterator<Item = bool>>(&mut self, iter: I) {
// Materialize since we're about to append bits
self.materialize_if_needed();

unsafe {
self.bitmap_builder
.as_mut()
.unwrap()
.extend_trusted_len(iter)
};
}

/// Builds the null buffer and resets the builder.
/// Returns `None` if the builder only contains `true`s.
/// Builds the [`NullBuffer`] and resets the builder.
///
/// Returns `None` if the builder only contains `true`s. Use [`Self::build`]
@@ -412,4 +439,47 @@ mod tests {

assert_eq!(builder.finish(), None);
}

#[test]
fn test_extend() {
Contributor: this is fine to have, but technically speaking should these paths be covered in the lower BooleanBufferBuilder::extend_trusted_len?

Contributor (author): Yeah, probably redundant by now.

// Test small extend (less than 64 bits)
let mut builder = NullBufferBuilder::new(0);
unsafe {
builder.extend_trusted_len([true, false, true, true].iter().copied());
}
// bits: 0=true, 1=false, 2=true, 3=true -> 0b1101 = 13
assert_eq!(builder.as_slice().unwrap(), &[0b1101_u8]);

// Test extend with exactly 64 bits
let mut builder = NullBufferBuilder::new(0);
let pattern: Vec<bool> = (0..64).map(|i| i % 2 == 0).collect();
unsafe {
builder.extend_trusted_len(pattern.iter().copied());
}
// Even positions are true: 0, 2, 4, ... -> bits 0, 2, 4, ...
// In little-endian: 0b01010101 repeated
assert_eq!(
builder.as_slice().unwrap(),
&[0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55]
);

// Test extend with more than 64 bits (tests chunking)
let mut builder = NullBufferBuilder::new(0);
let pattern: Vec<bool> = (0..100).map(|i| i % 3 == 0).collect();
unsafe { builder.extend_trusted_len(pattern.iter().copied()) };
assert_eq!(builder.len(), 100);
// Verify a few specific bits
let buf = builder.finish().unwrap();
assert!(buf.is_valid(0)); // 0 % 3 == 0
assert!(!buf.is_valid(1)); // 1 % 3 != 0
assert!(!buf.is_valid(2)); // 2 % 3 != 0
assert!(buf.is_valid(3)); // 3 % 3 == 0
assert!(buf.is_valid(99)); // 99 % 3 == 0

// Test extend with non-aligned start (tests bit-by-bit path)
let mut builder = NullBufferBuilder::new(0);
builder.append_non_null(); // Start at bit 1 (non-aligned)
unsafe { builder.extend_trusted_len([false, true, false, true].iter().copied()) };
assert_eq!(builder.as_slice().unwrap(), &[0b10101_u8]);
}
}
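The byte values asserted in the tests above follow the Arrow validity-bitmap convention: bits are packed least-significant bit first, so element `i` lands in bit `i % 8` of byte `i / 8`. A dependency-free sketch of that packing (the `pack_bits` helper is illustrative, not part of the arrow-buffer API):

```rust
// Pack booleans into bytes, least-significant bit first, matching the
// byte values asserted in the NullBufferBuilder tests above.
fn pack_bits(bits: &[bool]) -> Vec<u8> {
    let mut out = vec![0u8; (bits.len() + 7) / 8];
    for (i, &b) in bits.iter().enumerate() {
        if b {
            out[i / 8] |= 1 << (i % 8);
        }
    }
    out
}

fn main() {
    // bits: 0=true, 1=false, 2=true, 3=true -> 0b1101
    assert_eq!(pack_bits(&[true, false, true, true]), vec![0b1101_u8]);
    // alternating pattern over 64 bits -> 0x55 in every byte
    let pattern: Vec<bool> = (0..64).map(|i| i % 2 == 0).collect();
    assert_eq!(pack_bits(&pattern), vec![0x55; 8]);
}
```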
128 changes: 122 additions & 6 deletions arrow-select/src/coalesce.rs
@@ -20,7 +20,8 @@
//!
//! [`filter`]: crate::filter::filter
//! [`take`]: crate::take::take
use crate::filter::filter_record_batch;
use crate::filter::{FilterBuilder, FilterPredicate, is_optimize_beneficial_record_batch};

use crate::take::take_record_batch;
use arrow_array::types::{BinaryViewType, StringViewType};
use arrow_array::{Array, ArrayRef, BooleanArray, RecordBatch, downcast_primitive};
@@ -212,7 +213,10 @@ impl BatchCoalescer {
/// Push a batch into the Coalescer after applying a filter
///
/// This is semantically equivalent of calling [`Self::push_batch`]
/// with the results from [`filter_record_batch`]
/// with the results from [`filter_record_batch`], but avoids
/// materializing the intermediate filtered batch.
///
/// [`filter_record_batch`]: crate::filter::filter_record_batch
///
/// # Example
/// ```
@@ -238,10 +242,103 @@
batch: RecordBatch,
filter: &BooleanArray,
) -> Result<(), ArrowError> {
// TODO: optimize this to avoid materializing (copying the results
Contributor: 🎉

// of filter to a new batch)
let filtered_batch = filter_record_batch(&batch, filter)?;
self.push_batch(filtered_batch)
// We only support primitive types for now; fall back to filter_record_batch for other types.
// Also, skip the optimization when the filter is not very selective.

// Build an optimized filter predicate that chooses the best iteration strategy
// Byteview does use a filter as part of calculating ideal buffer sizes, so optimizing is helpful even for
// a single array
let is_optimize_beneficial = is_optimize_beneficial_record_batch(&batch)
|| batch.columns().len() == 1
&& matches!(
batch.columns()[0].data_type(),
DataType::BinaryView | DataType::Utf8View
);
let selected_count = filter.true_count();
let num_rows = batch.num_rows();

// Fast path: skip if no rows selected
if selected_count == 0 {
return Ok(());
}

// Fast path: if all rows selected, just push the batch
if selected_count == num_rows {
return self.push_batch(batch);
}

let (_schema, arrays, _num_rows) = batch.into_parts();

let mut filter_builder = FilterBuilder::new(&filter);

if is_optimize_beneficial {
filter_builder = filter_builder.optimize();
}

let filter = filter_builder.build();
// Setup input arrays as sources
assert_eq!(arrays.len(), self.in_progress_arrays.len());
self.in_progress_arrays
.iter_mut()
.zip(arrays)
.for_each(|(in_progress, array)| {
in_progress.set_source_from_filter(Some(array), &filter);
});

// Choose iteration strategy based on the optimized predicate
self.copy_from_filter(filter, selected_count)?;
// Clear sources to allow memory to be freed
for in_progress in self.in_progress_arrays.iter_mut() {
in_progress.set_source(None);
}

Ok(())
}

/// Helper to copy rows at the given indices, handling batch boundaries efficiently
///
/// This method batches the index iteration to avoid per-row batch boundary checks.
fn copy_from_filter(
&mut self,
filter: FilterPredicate,
count: usize,
) -> Result<(), ArrowError> {
let mut remaining = count;
let mut filter_pos = 0; // Position in the filter array


// We need to process the filter in chunks that fit the target batch size
while remaining > 0 {
let space_in_batch = self.target_batch_size - self.buffered_rows;
let to_copy = remaining.min(space_in_batch);

// Find how many filter positions we need to cover `to_copy` set bits
// Skip the expensive search if all remaining rows fit in the current batch
let chunk_len = if remaining <= space_in_batch {
filter.len() - filter_pos
} else {
filter.find_nth_set_bit_position(filter_pos, to_copy) - filter_pos
};

let chunk_predicate = filter.slice_with_count(filter_pos, chunk_len, to_copy);

// Copy all collected indices in one call per array
for in_progress in self.in_progress_arrays.iter_mut() {
in_progress.copy_rows_by_filter(&chunk_predicate, filter_pos, chunk_len)?;
}
Comment on lines +326 to +329

Member: A performance improvement you can do here is to copy X columns at a time, as I did and explained in 4: the number 4 is a magic number, but you can pick another number, like 2, to amortize the cost of the boolean iterations.

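The reviewer's suggestion above — walking the filter once per small group of columns instead of once per column — could be sketched roughly as follows. This is a hypothetical restructuring over plain vectors, not the code in this PR; `copy_columns_grouped` and its `group` parameter are illustrative names:

```rust
// Hypothetical sketch: iterate the boolean filter once per group of
// columns, amortizing the iteration cost across up to `group` columns.
fn copy_columns_grouped(
    columns: &mut [Vec<i64>],
    sources: &[Vec<i64>],
    filter: &[bool],
    group: usize,
) {
    for (cols, srcs) in columns.chunks_mut(group).zip(sources.chunks(group)) {
        // One pass over the filter serves every column in this group.
        for (i, &keep) in filter.iter().enumerate() {
            if keep {
                for (col, src) in cols.iter_mut().zip(srcs) {
                    col.push(src[i]);
                }
            }
        }
    }
}

fn main() {
    let sources = vec![vec![1, 2, 3, 4], vec![10, 20, 30, 40]];
    let mut columns = vec![Vec::new(), Vec::new()];
    copy_columns_grouped(&mut columns, &sources, &[true, false, false, true], 4);
    assert_eq!(columns[0], vec![1, 4]);
    assert_eq!(columns[1], vec![10, 40]);
}
```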
self.buffered_rows += to_copy;
filter_pos += chunk_len;
remaining -= to_copy;

// If we've filled the batch, finish it
if self.buffered_rows >= self.target_batch_size {
self.finish_buffered_batch()?;
}
}

Ok(())
}
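The chunk-planning loop above can be illustrated with a dependency-free sketch. All names here are hypothetical stand-ins (`find_nth_set_bit` mirrors `find_nth_set_bit_position`, `plan_chunks` mirrors the `while remaining > 0` loop), and the batch state is reduced to a counter:

```rust
// Return the filter position just past the n-th set bit at or after `start`.
fn find_nth_set_bit(filter: &[bool], start: usize, n: usize) -> usize {
    let mut seen = 0;
    for (i, &b) in filter[start..].iter().enumerate() {
        if b {
            seen += 1;
            if seen == n {
                return start + i + 1;
            }
        }
    }
    filter.len()
}

// Split the selected rows into (filter_pos, chunk_len, to_copy) chunks so
// each output batch holds at most `target` rows, mirroring copy_from_filter.
fn plan_chunks(filter: &[bool], target: usize) -> Vec<(usize, usize, usize)> {
    let mut chunks = Vec::new();
    let mut remaining = filter.iter().filter(|&&b| b).count();
    let mut filter_pos = 0;
    let mut buffered = 0;
    while remaining > 0 {
        let space = target - buffered;
        let to_copy = remaining.min(space);
        // Skip the set-bit search when all remaining rows fit in this batch
        let chunk_len = if remaining <= space {
            filter.len() - filter_pos
        } else {
            find_nth_set_bit(filter, filter_pos, to_copy) - filter_pos
        };
        chunks.push((filter_pos, chunk_len, to_copy));
        buffered += to_copy;
        filter_pos += chunk_len;
        remaining -= to_copy;
        if buffered >= target {
            buffered = 0; // batch finished
        }
    }
    chunks
}

fn main() {
    // 5 selected rows, target batch size 2 -> chunks copying 2, 2, 1 rows
    let filter = [true, false, true, true, false, true, true];
    let plan = plan_chunks(&filter, 2);
    assert_eq!(plan, vec![(0, 3, 2), (3, 3, 2), (6, 1, 1)]);
}
```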

/// Push a batch into the Coalescer after applying a set of indices
Expand Down Expand Up @@ -598,13 +695,31 @@ trait InProgressArray: std::fmt::Debug + Send + Sync {
/// current in-progress array
fn set_source(&mut self, source: Option<ArrayRef>);

/// Set the source array with a filter, allowing for calculating GC based on filter
///
/// Default implementation just calls [`Self::set_source`]
fn set_source_from_filter(&mut self, source: Option<ArrayRef>, _filter: &FilterPredicate) {
self.set_source(source);
}

/// Copy rows from the current source array into the in-progress array
///
/// The source array is set by [`Self::set_source`].
///
/// Return an error if the source array is not set
fn copy_rows(&mut self, offset: usize, len: usize) -> Result<(), ArrowError>;

/// Copy rows from the source array between the specified offset and len that
/// match the predicate to the output array
///
/// TODO add an example
fn copy_rows_by_filter(
&mut self,
filter: &FilterPredicate,
offset: usize,
len: usize,
) -> Result<(), ArrowError>;

/// Finish the currently in-progress array and return it as an `ArrayRef`
fn finish(&mut self) -> Result<ArrayRef, ArrowError>;
}
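The trait contract above (set a source, copy row ranges, finish into an array) can be illustrated with a minimal analogue over plain vectors. The method names mirror the trait, but the `Vec`-based implementation is purely illustrative:

```rust
// Minimal analogue of the InProgressArray contract over plain vectors.
trait InProgress {
    fn set_source(&mut self, source: Option<Vec<i64>>);
    fn copy_rows(&mut self, offset: usize, len: usize) -> Result<(), String>;
    fn finish(&mut self) -> Vec<i64>;
}

struct VecInProgress {
    source: Option<Vec<i64>>,
    buffer: Vec<i64>,
}

impl InProgress for VecInProgress {
    fn set_source(&mut self, source: Option<Vec<i64>>) {
        self.source = source;
    }
    // Errors if no source is set, like the trait documents
    fn copy_rows(&mut self, offset: usize, len: usize) -> Result<(), String> {
        let src = self.source.as_ref().ok_or("source not set")?;
        self.buffer.extend_from_slice(&src[offset..offset + len]);
        Ok(())
    }
    // Returns the accumulated rows and resets the in-progress buffer
    fn finish(&mut self) -> Vec<i64> {
        std::mem::take(&mut self.buffer)
    }
}

fn main() {
    let mut ip = VecInProgress { source: None, buffer: Vec::new() };
    ip.set_source(Some(vec![1, 2, 3, 4, 5]));
    ip.copy_rows(1, 3).unwrap();
    assert_eq!(ip.finish(), vec![2, 3, 4]);
}
```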
Expand All @@ -613,6 +728,7 @@ trait InProgressArray: std::fmt::Debug + Send + Sync {
mod tests {
use super::*;
use crate::concat::concat_batches;
use crate::filter::filter_record_batch;
use arrow_array::builder::StringViewBuilder;
use arrow_array::cast::AsArray;
use arrow_array::types::Int32Type;
Expand Down