feat: add backpressure to v2 I/O scheduler #2683
Conversation
Force-pushed from 1867eef to 1e6bef9
This seems well thought out. My only concern is whether some of the warnings will be noisy, but I think in at least one of the cases you've handled that well with the 60s timer.
rust/lance-encoding/src/decoder.rs
Outdated
EMITTED_BATCH_SIZE_WARNING.call_once(|| {
    let size_mb = size_bytes / 1024 / 1024;
    warn!("Lance read in a single batch that contained more than {}MiB of data. You may want to consider reducing the batch size.", size_mb);
});
Do you worry this could be noisy? Would someone with image / video datasets always hit this?
Also, does this ever reset? To try again with a smaller batch size and see if they get this warning, will they need to restart the process?
> Do you worry this could be noisy? Would someone with image / video datasets always hit this?
10MiB is pretty extreme, but yes, someone with a video dataset would encounter this even if there were only one video. Perhaps I will only emit the warning if the batch size is greater than 32 (just an arbitrary threshold).
> Also, does this ever reset? To try again with a smaller batch size and see if they get this warning, will they need to restart the process?
Hmm, I was thinking "once per scan" but, on reflection, it seems I ended up with "once per process". I will fix that.
Changed to once-per-scan.
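A minimal sketch of the once-per-scan behavior discussed above (the struct and method names here are illustrative, not the actual Lance code): instead of a process-wide `Once`, keep an `AtomicBool` on per-scan state, and also apply the proposed batch-size > 32 threshold so single-row video/image batches don't trigger it.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Illustrative per-scan decoder state (hypothetical, not the Lance struct).
struct ScanState {
    emitted_batch_size_warning: AtomicBool,
}

impl ScanState {
    fn new() -> Self {
        Self {
            emitted_batch_size_warning: AtomicBool::new(false),
        }
    }

    /// Returns true only the first time an oversized batch is seen in this
    /// scan. A fresh scan gets a fresh flag, so no process restart is needed.
    fn should_warn(&self, size_bytes: u64, batch_size: u64) -> bool {
        const THRESHOLD_BYTES: u64 = 10 * 1024 * 1024; // 10 MiB
        const MIN_BATCH_SIZE: u64 = 32; // skip e.g. a batch holding one video
        size_bytes > THRESHOLD_BYTES
            && batch_size > MIN_BATCH_SIZE
            // swap returns the previous value, so only the first caller warns
            && !self.emitted_batch_size_warning.swap(true, Ordering::Relaxed)
    }
}
```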
Force-pushed from fbc8d0c to 56136cf
…eduler loop. This ensures that backpressure is acquired in priority order which should help avoid deadlock in the very large range case
Force-pushed from 56136cf to 6651413
Adds backpressure to the I/O scheduler. The general problem is rather tricky: if all of your I/O threads pause because the I/O buffer is full, and your decode threads are waiting on queued I/O tasks, then you have a deadlock.
I/O priority should allow us to avoid this situation, but it is something that will need to be monitored, especially if users set very small I/O buffer sizes. For this reason I haven't made any of the new settings configurable (except for deadlock detection, which we may turn on for debugging if people appear to have hit a deadlock).
One way to hit this deadlock is to create a file with 10 pages where each page has 10GB of data. Even a single read will fill the I/O buffer. We submit the reads in priority order, but we have 8 I/O threads, so they race to grab the permits.
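A toy model of why granting permits in the scheduler loop helps (this is an illustration with hypothetical names, not the actual scheduler): when a single loop admits requests strictly in priority order, a lower-priority request can never grab buffer capacity that a blocked higher-priority request needs, which is the race described above.

```rust
/// Admit I/O requests strictly in priority order under a capacity limit.
/// Each request is (priority, size_bytes); lower number = higher priority.
/// Hypothetical sketch: a real scheduler would block rather than stop,
/// releasing capacity as reads complete.
fn admit_in_priority_order(mut requests: Vec<(u32, usize)>, mut capacity: usize) -> Vec<u32> {
    requests.sort_by_key(|&(prio, _)| prio);
    let mut admitted = Vec::new();
    for (prio, size) in requests {
        if size <= capacity {
            capacity -= size;
            admitted.push(prio);
        } else {
            // Never skip ahead of a blocked higher-priority request;
            // later requests must wait even if they would fit.
            break;
        }
    }
    admitted
}
```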
As a result, I've added a much-needed option to split primitive arrays into multiple pages when we are given huge chunks of data. The splitting algorithm is not perfect and could use some work (perhaps a sort of "binary splitting" where we continuously split the largest chunk in half until all chunks are below a given size).
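The "binary splitting" idea mentioned above can be sketched as follows (a standalone illustration, not the actual Lance implementation; chunk byte sizes stand in for real array slices):

```rust
/// Repeatedly split the largest chunk in half until every chunk is at or
/// below `max_size`. Each split strictly shrinks the offending chunk, so
/// the loop terminates for any max_size >= 1.
fn binary_split(chunks: &[usize], max_size: usize) -> Vec<usize> {
    assert!(max_size > 0);
    let mut out: Vec<usize> = chunks.to_vec();
    loop {
        // Find the largest chunk; stop once even it fits under the limit.
        let (idx, size) = match out.iter().copied().enumerate().max_by_key(|&(_, s)| s) {
            Some((i, s)) if s > max_size => (i, s),
            _ => return out,
        };
        // Split it in half; the first half keeps the extra element on odd sizes.
        out[idx] = size - size / 2;
        out.insert(idx + 1, size / 2);
    }
}
```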
At the moment I think things are "safe enough" that this PR prevents more problems (avoids OOM) than it introduces (deadlock in esoteric cases).