[YSQL] Perf regression for INSERT ON CONFLICT read batching for low rows and multiple arbiter indexes #26242
Labels
area/ysql
Yugabyte SQL (YSQL)
kind/enhancement
This is an enhancement of an existing feature
kind/perf
priority/medium
Medium priority issue
Jira Link: DB-15588
Description
INSERT ON CONFLICT read batching batch-reads all arbiter indexes then looks for conflicts. INSERT ON CONFLICT without read batching will, for each row, read each arbiter index one-by-one until it hits a conflict. Therefore, in the single-row insert case, performance will often be worse for INSERT ON CONFLICT with read batching than without.
A formula for read RPCs can be thought of as follows:
With read batching: ceil(num_rows / batch_size) * num_arbiter_indexes
Without read batching, best case: num_rows * 1
Without read batching, worst case: num_rows * num_arbiter_indexes
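As a rough illustration, the read-RPC estimates above can be written as a small Python model. This is only a sketch of the formula in this issue, not YugabyteDB code: the function names are hypothetical, the `batch_size` value of 512 in the usage lines is an arbitrary assumption, and, like the formula, the model ignores RPC buffering and flushing.

```python
import math


def read_rpcs_batched(num_rows, batch_size, num_arbiter_indexes):
    """Estimated read RPCs with read batching (hypothetical helper).

    Every batch reads every arbiter index before looking for conflicts.
    """
    return math.ceil(num_rows / batch_size) * num_arbiter_indexes


def read_rpcs_unbatched(num_rows, num_arbiter_indexes):
    """Estimated (best, worst) read RPCs without read batching (hypothetical helper).

    Best case: the first arbiter index read for each row already conflicts.
    Worst case: every arbiter index is read for every row.
    """
    return num_rows * 1, num_rows * num_arbiter_indexes


# Single-row insert with three arbiter indexes; batch_size=512 is an assumed value.
print(read_rpcs_batched(num_rows=1, batch_size=512, num_arbiter_indexes=3))
print(read_rpcs_unbatched(num_rows=1, num_arbiter_indexes=3))
```

For the single-row case the batched estimate equals the unbatched worst case, so batching can only tie or lose there.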
This formula doesn't take into account RPC buffering and flushing. If we were to simplify to the case that num_rows = batch_size and we have the best case (first arbiter index always matches, and this is DO NOTHING):
With read batching: 1 * num_arbiter_indexes
Without read batching: num_rows * 1
So in this best case (best as in lowest RPCs), read batching turned on has more RPCs if num_arbiter_indexes > num_rows.
If only one row is inserted (which is a common case), then if there is more than one arbiter index, read batching causes a performance degradation.
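To make the break-even point concrete, the simplified best case (num_rows = batch_size, first arbiter index always matches, DO NOTHING) can be tabulated with a short Python sketch. The helper name is hypothetical and the counts come only from the estimates in this issue, not from measurements.

```python
def batching_is_worse(num_rows, num_arbiter_indexes):
    """Compare estimated read RPCs in the simplified best case (hypothetical helper).

    Assumes num_rows == batch_size, so batching issues one batch that
    reads every arbiter index, while the unbatched path stops at the
    first arbiter index for each row.
    """
    batched = 1 * num_arbiter_indexes
    unbatched_best = num_rows * 1
    return batched > unbatched_best


# Sweep a few small sizes to see where batching costs more read RPCs.
for num_rows in (1, 2, 4):
    for num_arbiter_indexes in (1, 2, 3):
        worse = batching_is_worse(num_rows, num_arbiter_indexes)
        print(f"rows={num_rows} arbiters={num_arbiter_indexes} batching_worse={worse}")
```

With one row, any table with two or more arbiter indexes falls on the losing side of the comparison.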
A possible solution is to change the code to behave like the non-read-batching path when only a single row is inserted. However, if this is done naively, that path would skip updating the global intents map, causing another variant of the caching problem in #26241. For example:
Today, the INSERT ON CONFLICT within WITH clause updates the global intents map. If this were fixed naively, then it would no longer do that. (And, come to think of it, today, it isn't updating each individual batch-read map either...)
See also related #23115.
Issue Type
kind/enhancement