Add in-batch coalescing #1959
Comments
It's possible the above definition is too simple. It may be possible to coalesce "too much". E.g. in Arrow there is both a "coalesce if D < this many bytes" parameter and a "don't coalesce if the combined IOP is greater than Y bytes" parameter. However, for in-batch coalescing, I don't think this will be a problem. Batches correspond to pages, and pages should not be overly large. So even if we end up coalescing the entire batch into a single request, it should not be too large.
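The Arrow-style two-parameter scheme described above can be sketched as follows. This is an illustrative sketch, not the actual implementation; `max_gap` and `max_combined` are hypothetical names for the two parameters:

```python
def coalesce(ranges, max_gap, max_combined):
    """Merge byte ranges given as (start, end) pairs.

    `max_gap` is the largest hole (in bytes) we are willing to read
    through; `max_combined` caps the size of a merged request.
    Both names are hypothetical, standing in for Arrow's two knobs.
    """
    merged = []
    for start, end in sorted(ranges):
        if merged:
            prev_start, prev_end = merged[-1]
            gap = start - prev_end
            combined = max(end, prev_end) - prev_start
            # Coalesce only if the hole is small AND the merged
            # request stays under the size cap.
            if gap <= max_gap and combined <= max_combined:
                merged[-1] = (prev_start, max(end, prev_end))
                continue
        merged.append((start, end))
    return merged
```

For in-batch coalescing the size cap rarely binds, since a batch corresponds to a page and pages are bounded in size.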
Should fix #2629, addresses #1959

1. Previously, when randomly accessing rows, each request for a row was scheduled separately, which increased overhead, especially on large datasets. This PR coalesces take scheduling when requests are within `block_size` distance of each other. The block size is determined based on the system.
2. The binary scheduler was also scheduling the decoding of all indices individually. This updates the binary scheduler so that it schedules all offsets at once. These are then processed to determine which bytes to decode, as before.
3. A script we can use to compare v1 vs v2 performance is added as `test_random_access.py`. Specifically, on the lineitem dataset (same file as in the issue above):
   - v1 query time: `0.12s`
   - v2 query time (before): `2.8s`
   - v2 query time (after (1)): `0.54s`
   - v2 query time (after (1) and (2)): `0.02s`
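The block-distance grouping from step 1 can be sketched roughly as below. This is a simplified illustration, not the PR's Rust code; `block_size` here plays the role of the system-determined threshold:

```python
def coalesce_takes(row_indices, block_size):
    """Group row indices so that indices within `block_size` of each
    other are scheduled as one request instead of one request each.

    Illustrative sketch only: real take scheduling works on byte
    ranges, not raw row indices.
    """
    groups = []
    for idx in sorted(row_indices):
        # Start a new group when the next index is farther than
        # `block_size` from the last index in the current group.
        if groups and idx - groups[-1][-1] <= block_size:
            groups[-1].append(idx)
        else:
            groups.append([idx])
    return groups
```

With this grouping, nearby random accesses collapse into a handful of scheduled requests rather than one per row.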
Closed by #2636
The I/O scheduler receives batches of IOPs. For example, a take operation on a value-encoded page will yield a single batch with one IOP per index. We should coalesce these requests. We could sort the IOPs, but I think it might be cheaper if schedulers can be expected to generate ordered sequences of IOPs.
The theory of coalescing is basically that filesystems have a "cost per byte" and a "cost per IOP". If two requests come in with sizes S1 and S2, and the number of bytes between the end of S1 and the start of S2 is D, then we should coalesce if

`D < cost_iop / cost_byte`

This means coalescing requires a single configuration parameter, which we could call something like "coalesce aggressiveness", equal to `cost_iop / cost_byte`. We should investigate a reasonable value for this parameter on the various filesystems and use that as a default (I'm not sure we should allow this to be user-facing at all, at least not now).
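The decision rule above can be written out directly. The cost constants here are assumed, purely illustrative values, not measured defaults for any filesystem:

```python
# Assumed, illustrative costs (NOT measured values):
COST_IOP = 5e-3   # seconds of fixed overhead per request
COST_BYTE = 5e-9  # seconds per byte transferred

# "Coalesce aggressiveness": the gap size (bytes) at which reading
# the dead bytes costs the same as issuing a second IOP.
AGGRESSIVENESS = COST_IOP / COST_BYTE

def should_coalesce(end_s1, start_s2):
    """Return True if the gap between two requests is cheaper to
    read through than to pay for a second IOP."""
    gap = start_s2 - end_s1
    return gap < AGGRESSIVENESS
```

With these sample numbers the threshold works out to 1 MB of gap: any two requests closer than that are cheaper to merge than to issue separately.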