sql: improve precision of index spans to scan for multi-column-family single-row fetches #18168
Comments
I now suspect that this is the reason why multiple column families didn't provide a noticeable improvement in YCSB workload A. If an UPDATE statement only needed to scan the primary key and the columns that it was updating, then splitting every column in YCSB into a separate column family could reduce contention by 10x. I think this means we could scale to 10 times the throughput in the zipfian distribution.
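As a rough, hypothetical sketch of what that would look like (the table and column names below are assumptions in the style of YCSB's `usertable`; the real workload schema may differ), placing each field in its own column family means updates to different fields of the same row touch disjoint keys, so with family-precise fetch spans they no longer contend:

```sql
-- Hypothetical YCSB-style schema with one column family per field.
CREATE TABLE usertable (
    ycsb_key VARCHAR(255) PRIMARY KEY,
    field0   TEXT,
    field1   TEXT,
    FAMILY f0 (ycsb_key),
    FAMILY f1 (field0),
    FAMILY f2 (field1)
);

-- With family-precise spans, these two statements read and write disjoint
-- column-family keys of the same row and no longer conflict with each other.
UPDATE usertable SET field0 = 'x' WHERE ycsb_key = 'user1';
UPDATE usertable SET field1 = 'y' WHERE ycsb_key = 'user1';
```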
The code that needs to be updated is … One complication is that …
I've filed #30656 to track similar improvements to be done even when col fams are not used (and all columns are part of a single col fam).
Regarding the complication @jordanlewis mentioned, I'm thinking for a first pass we can just handle the case where only a single column family is needed. Does that seem reasonable, and is it sufficient for TPCC?
Unfortunately I don't think it's sufficient for TPCC, which would have tables that need 3 column families total: one with the mutable columns from newOrder, one with the immutable columns, and one with the mutable columns from payment.
Actually, I'm not entirely sure if the row fetcher would itself need to emit spans over more than 1 column family, though. @nvanbenschoten I think you thought this through in more detail, didn't you?
Yes, TPC-C and YCSB will both require us to emit spans over more than 1 column family for the proposed optimization to work correctly. In both cases, there will be a static column family that all requests will read and a set of dynamic column families, out of which any given request will only read and update one.
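A hedged illustration of that shape using TPC-C's district table (the family names and exact grouping below are assumptions for illustration, not the schema the workload actually uses): a first family holds the immutable columns every transaction reads, and two more hold the columns mutated by new-order and by payment respectively.

```sql
-- Illustrative only: one possible family layout for TPC-C's district table.
CREATE TABLE district (
    d_w_id      INT,
    d_id        INT,
    d_name      VARCHAR(10),   -- immutable after load
    d_tax       DECIMAL(4,4),  -- immutable after load
    d_next_o_id INT,           -- mutated by new-order
    d_ytd       DECIMAL(12,2), -- mutated by payment
    PRIMARY KEY (d_w_id, d_id),
    FAMILY f_static   (d_w_id, d_id, d_name, d_tax),
    FAMILY f_neworder (d_next_o_id),
    FAMILY f_payment  (d_ytd)
);
```

Any given request reads the static family plus exactly one of the dynamic families, which is why the fetcher needs to be able to emit spans over more than one family.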
But we're talking about an entire transaction, right? The row fetcher only has to do tight spans for each query in isolation, and it looks like each individual read-only query only needs a single column family.

In new order, …

In payment, …

And the rest of the queries are either mutations or queries on tables that aren't shared across those 2 txns. So actually I think it's fine to only do this special case for 1 CF for now, and we'll still get a win for TPCC.
The problem isn't just the read-only queries; it's also the read-write queries. If one of the UPDATE statements touches a column family that it doesn't need to read or update, then we've already lost. For instance, …
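To make the read-write concern concrete with the illustrative district layout sketched above (again an assumption, not the example this comment had in mind): the fetch phase of each UPDATE must stay within the families it actually reads, or the two transactions conflict on the whole row.

```sql
-- New-order increments the order counter in its own family...
UPDATE district SET d_next_o_id = d_next_o_id + 1 WHERE d_w_id = 1 AND d_id = 1;

-- ...while payment bumps the year-to-date balance in a different family.
-- With row-wide fetch spans these two statements conflict on the same row;
-- with family-precise spans they write disjoint keys and only share reads.
UPDATE district SET d_ytd = d_ytd + 100 WHERE d_w_id = 1 AND d_id = 1;
```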
Oh, and FWIW, this issue touches the TODO at lines 60 to 66 in 6c5d9a0.
For tables with multiple column families, point lookups will now only scan column families which contain the needed columns. Previously we would scan the entire row. This optimization allows for faster lookups and, perhaps more importantly, reduces contention between operations on the same row but disjoint column families.

Fixes cockroachdb#18168

Release note: None
Unfortunately, my last post, which I deleted, was premature. I realized shortly after that we weren't correctly scanning at least one non-nullable column family when fetching the rows to update, so we weren't actually performing any updates at all, because all fields in the dataset start as NULL. In other words, this is what the plan looked like:
With that issue fixed, the plan correctly looks like:
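The root cause is that a column family whose columns are all NULL for a given row has no key stored at all, so a fetch restricted to only that family can see zero keys for a row that does exist. A minimal sketch of the failure mode, reusing the hypothetical `usertable` layout from earlier in the thread:

```sql
-- All fields start as NULL, so families f1 and f2 have no keys for this row.
INSERT INTO usertable (ycsb_key) VALUES ('user1');

-- If the fetch for this UPDATE scanned only f1 (field0's family), it would
-- find no keys, conclude the row doesn't exist, and update nothing. The fix
-- is to also scan a family guaranteed to have a key for every row, such as
-- the one holding the primary-key sentinel.
UPDATE usertable SET field0 = 'x' WHERE ycsb_key = 'user1';
```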
I ran a modified YCSB workload again and the theory still panned out, but it certainly wasn't the 5x throughput improvement we saw in the other test. This is YCSB's workload A with a zipfian load distribution. I ran it on a three-node GCE cluster with n1-highcpu-16 machines and the nobarrier SSD mount option. The concurrency started at 1 thread and doubled every minute up to 1024 threads. We can see that the peak throughput is ~50% higher with column families than without.

In one sense, this is disappointing compared to the crazy speedup we saw before. In another, 50% is still an awesome win and clearly justifies the urgency of this change! It also makes me wonder whether contention is still the primary bottleneck in YCSB.

I also ran a second test where I used the old broken fix for this issue but adjusted the YCSB schema to use non-nullable columns so that we wouldn't miss rows. That resulted in very similar performance to what we see here. That indicates that the extra span in the point lookup has a negligible cost, or at least a cost that is offset by the extra data size required for non-nullable columns in this schema.
It is also interesting that these new measurements hit a scalability cliff at the same place as the test that wasn't performing any updates at all. That lends further credence to the idea that there's a new dominant bottleneck that's replaced contention once we reach a concurrency of around 64 threads.
I wonder if we're hitting the same concurrency cliff that we saw in #26178.
30744: sql: optimize point lookups on column families r=solongordon a=solongordon

For tables with multiple column families, point lookups will now only scan column families which contain the needed columns. Previously we would scan the entire row. This optimization allows for faster lookups and, perhaps more importantly, reduces contention between operations on the same row but disjoint column families.

Fixes #18168

Release note: None

Co-authored-by: Solon Gordon <solon@cockroachlabs.com>
This change adds a new `--families` flag to the ycsb workload. Now that cockroachdb#18168 is addressed, this significantly reduces the contention present in the workload by avoiding conflicts on updates to different columns in the same table.

Release note: None
32704: workload/ycsb: add flag to use column families r=nvanbenschoten a=nvanbenschoten

This change adds a new `--families` flag to the ycsb workload. Now that #18168 is addressed, this significantly reduces the contention present in the workload by avoiding conflicts on updates to different columns in the same table.

I just confirmed that this still provides a huge speedup. On a 24 cpu machine:

```
workload run ycsb --init --workload='A' --concurrency=128 --duration=1m

--families=false gives:

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
   60.0s        0         103089         1718.0     13.9      0.6      1.8    604.0   1476.4  read
   60.0s        0         102947         1715.6     59.5      5.2     11.5   2281.7   8321.5  update

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
   60.0s        0         206036         3433.6     36.7      3.3      8.9   1342.2   8321.5

--families=true gives:

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
   60.0s        0         333477         5557.8      9.2      0.6      6.0    302.0   1275.1  read
   60.0s        0         332366         5539.3     13.7      6.8     17.8     54.5   4831.8  update

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
   60.0s        0         665843        11097.1     11.5      3.9     16.3    268.4   4831.8
```

cc. @robert-s-lee @drewdeally

Release note: None

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Suppose we have a table and a query:

Currently, this query will cause a scan from `/a/primary/1 -> /a/primary/2`, even though ideally we'd only need to scan the key at `/a/primary/1/b`.

We have enough information to improve this scan, since we know that `c` is not a needed column, `a` is the primary index and therefore unique, and we have an equality constraint on `a`.

We should consider improving the algorithm in `spanFromLogicalSpan` to incorporate all of this information into its final output span.

cc @danhhz @andreimatei @petermattis based on our earlier conversation.
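As a hedged sketch of a table and query that would exhibit this behavior (the schema below is an assumption chosen to match the key paths `/a/primary/1` and `/a/primary/1/b`, not necessarily the original example):

```sql
-- Hypothetical schema: table a keyed on column a, with b and c placed in
-- separate column families so that b's value lives at its own key.
CREATE TABLE a (
    a INT PRIMARY KEY,
    b INT,
    c INT,
    FAMILY f0 (a),
    FAMILY f1 (b),
    FAMILY f2 (c)
);

-- Only b is needed, a is unique, and there is an equality constraint on a,
-- so ideally this fetch would read just the key at /a/primary/1/b instead
-- of scanning the whole row span /a/primary/1 -> /a/primary/2.
SELECT b FROM a WHERE a = 1;
```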