perf: coalesce ids before executing take #2680

westonpace · 2024-08-02T14:16:58Z

Late materialization is a great benefit when executing a highly selective filter. However, a highly selective filter also means that each input batch will probably contain only a few matching rows. The current implementation executes take once per filtered batch. E.g. instead of a single call take(500, 10000, 300000) we get three calls: take(500), take(10000), and take(300000). This means:

We can't coalesce reads across batches
More CPU overhead (many calls to take_ranges)
Very small output batches (user's batch size is not respected)
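
To make the difference concrete, here is a minimal sketch of the idea in plain Rust. It is not the code in this PR (which uses DataFusion's CoalesceBatchesExec); `coalesce_ids` and its `target` parameter are hypothetical names used only for illustration, and the emit-when-full policy is a simplification.

```rust
/// Hypothetical helper (not part of Lance or DataFusion): merge many small
/// batches of row ids into batches holding at least `target` ids each.
/// The final batch may be smaller than `target`.
fn coalesce_ids(batches: &[Vec<u64>], target: usize) -> Vec<Vec<u64>> {
    let mut out = Vec::new();
    let mut current: Vec<u64> = Vec::new();
    for batch in batches {
        // Buffer ids from each filtered batch instead of calling take right away.
        current.extend_from_slice(batch);
        if current.len() >= target {
            // Enough ids buffered: emit them as one unit of work for take.
            out.push(std::mem::take(&mut current));
        }
    }
    // Flush whatever is left once the input is exhausted.
    if !current.is_empty() {
        out.push(current);
    }
    out
}
```

With the three single-row batches from the example above and a target larger than three, this yields one batch containing 500, 10000, and 300000, so take would run once instead of three times.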

On cloud storage I see a 10x or greater improvement in scan performance.

We have a benchmark for this (EDA search plot 4) which should help prevent regressions in the future: https://bencher.dev/console/projects/weston-lancedb/plots

eddyxu · 2024-08-02T15:17:29Z

rust/lance/src/dataset/scanner.rs

@@ -1584,9 +1585,10 @@ impl Scanner {
 projection: &Schema,
 batch_readahead: usize,
 ) -> Result<Arc<dyn ExecutionPlan>> {
+ let coalesced = Arc::new(CoalesceBatchesExec::new(input, self.get_batch_size()));


Do we want to sort row IDs to offer a better chance we can do sequential reads?

We already do that internally.
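
As a rough illustration of why sorted ids help, the sketch below sorts a set of row ids and collapses consecutive runs into ranges that could each be served by one sequential read. This is an assumption about the general technique, not Lance's internal implementation; `ids_to_ranges` is a hypothetical name.

```rust
/// Hypothetical helper (not Lance's implementation): sort row ids, drop
/// duplicates, and collapse runs of consecutive ids into half-open ranges.
/// Assumes no id equals u64::MAX, since `id + 1` would overflow.
fn ids_to_ranges(mut ids: Vec<u64>) -> Vec<std::ops::Range<u64>> {
    ids.sort_unstable();
    ids.dedup();
    let mut ranges: Vec<std::ops::Range<u64>> = Vec::new();
    for id in ids {
        if let Some(last) = ranges.last_mut() {
            if last.end == id {
                // The id directly follows the previous range: extend it.
                last.end = id + 1;
                continue;
            }
        }
        // Otherwise start a new single-id range.
        ranges.push(id..id + 1);
    }
    ranges
}
```

The more ids are gathered before this step, the more likely neighbouring ids end up in the same range, which is one reason coalescing before take can reduce the number of reads.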

codecov-commenter · 2024-08-05T14:26:25Z

Codecov Report

Attention: Patch coverage is 80.95238% with 4 lines in your changes missing coverage. Please review.

Project coverage is 79.34%. Comparing base (30b3df7) to head (08cd611).

Files	Patch %	Lines
rust/lance-encoding/src/decoder.rs	20.00%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2680      +/-   ##
==========================================
+ Coverage   79.32%   79.34%   +0.01%     
==========================================
  Files         226      226              
  Lines       66872    66886      +14     
  Branches    66872    66886      +14     
==========================================
+ Hits        53049    53069      +20     
- Misses      10720    10724       +4     
+ Partials     3103     3093      -10

Flag	Coverage Δ
unittests	`79.34% <80.95%> (+0.01%)`	⬆️


github-actions bot added python performance labels Aug 2, 2024

westonpace requested review from wjones127 and eddyxu August 2, 2024 14:17

eddyxu reviewed Aug 2, 2024

wjones127 approved these changes Aug 2, 2024

westonpace force-pushed the perf/coalesce-take-ids branch from fdd4562 to e69f0fd Compare August 5, 2024 14:05

westonpace added 2 commits August 13, 2024 10:16

Coalesce ids before executing take when using TakeExec. This reduces the number of times we call take which has a number of performance benefits.

89aafc4

Fix tests that were expecting old behavior

08cd611

westonpace force-pushed the perf/coalesce-take-ids branch from e69f0fd to 08cd611 Compare August 13, 2024 17:16

Fix after rebase

ca92427

westonpace merged commit 711bad7 into lancedb:main Aug 13, 2024
22 checks passed
