[YSQL] Perf regression for INSERT ON CONFLICT read batching for low rows and multiple arbiter indexes #26242

Open
1 task done
jasonyb opened this issue Feb 28, 2025 · 1 comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature kind/perf priority/medium Medium priority issue

Comments

jasonyb commented Feb 28, 2025

Jira Link: DB-15588

Description

With read batching, INSERT ON CONFLICT batch-reads every arbiter index and then looks for conflicts. Without read batching, it reads the arbiter indexes one by one for each row, stopping at the first conflict. Therefore, in the single-row insert case, INSERT ON CONFLICT will often perform worse with read batching than without.

A formula for read RPCs can be thought of as follows:

  • read batching on: ceil(num_rows / batch_size) * num_arbiter_indexes
  • read batching off: best case: num_rows * 1; worst case: num_rows * num_arbiter_indexes

This formula doesn't take RPC buffering and flushing into account. Simplify to the case where num_rows = batch_size and assume the best case (the first arbiter index always matches, and the action is DO NOTHING):

  • read batching on:
    • reads: 1 * num_arbiter_indexes
    • flushes (actual write round-trips): 0
  • read batching off:
    • reads: num_rows * 1
    • flushes (actual write round-trips): 0

So in this best case (best as in fewest RPCs), read batching issues more read RPCs when

num_arbiter_indexes > num_rows

If only one row is inserted (a common case) and there is more than one arbiter index, read batching causes a performance degradation.
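As a rough illustration of the read-RPC formula above, here is a minimal Python sketch. The function names and the batch size of 512 are hypothetical, chosen only for this example; like the formula itself, the sketch ignores RPC buffering and flushing.

```python
def reads_batching_on(num_rows, batch_size, num_arbiter_indexes):
    """Read RPCs with read batching: every arbiter index is read once per batch."""
    num_batches = -(-num_rows // batch_size)  # integer ceil(num_rows / batch_size)
    return num_batches * num_arbiter_indexes


def reads_batching_off(num_rows, num_arbiter_indexes):
    """(best, worst) read RPCs without read batching.

    Best case: the first arbiter index conflicts for every row.
    Worst case: every arbiter index is read for every row.
    """
    return num_rows, num_rows * num_arbiter_indexes


# Single-row insert with four arbiter indexes.
# batch_size=512 is an arbitrary illustrative value, not a claimed default.
on = reads_batching_on(num_rows=1, batch_size=512, num_arbiter_indexes=4)
best_off, worst_off = reads_batching_off(num_rows=1, num_arbiter_indexes=4)
print(on, best_off, worst_off)
```

With one row and four arbiter indexes, batching issues four reads where the non-batched best case issues one. With num_rows = batch_size = 8 and the same four indexes, batching issues four reads against eight, which matches the num_arbiter_indexes > num_rows threshold.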

A possible solution is to change the code to behave like non-read-batching in the single-row case. However, if this is done naively, it would skip updating the global intents map, causing another variant of the caching problem, similar to #26241. For example:

WITH w(i) AS (
    INSERT INTO with_a VALUES (11) ON CONFLICT DO NOTHING RETURNING i
) INSERT INTO with_a VALUES (generate_series(10, 15)) ON CONFLICT (i) DO UPDATE SET i = EXCLUDED.i + (SELECT i FROM w);
TABLE with_a;

Today, the INSERT ON CONFLICT within the WITH clause updates the global intents map. If the single-row case were fixed naively, it would no longer do that. (And, come to think of it, today, it isn't updating each individual batch-read map either...)

See also related #23115.

Issue Type

kind/enhancement

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@jasonyb jasonyb added area/ysql Yugabyte SQL (YSQL) kind/perf status/awaiting-triage Issue awaiting triage labels Feb 28, 2025
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue labels Feb 28, 2025

jasonyb commented Feb 28, 2025

Easy repro on 95e3c5f, almalinux8, fastdebug, gcc11 (look at postgres logs for RPCs):

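-- Log DocDB requests so the read RPCs show up in the postgres logs.
set yb_debug_log_docdb_requests = on;
create table t (i int, j int, k int, l int);
-- Four unique indexes, so ON CONFLICT without a conflict target has four arbiter indexes.
create unique index nonconcurrently on t (i);
create unique index nonconcurrently on t (j);
create unique index nonconcurrently on t (k);
create unique index nonconcurrently on t (l);
insert into t values (1, 2, 3, 4);
-- Insert the same row again with read batching at its current setting.
insert into t values (1, 2, 3, 4) on conflict do nothing;
-- Turn read batching off and repeat to compare the logged read RPCs.
set yb_insert_on_conflict_read_batch_size = 0;
insert into t values (1, 2, 3, 4) on conflict do nothing;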

@jasonyb jasonyb added 2024.2 Backport Required and removed kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue labels Feb 28, 2025
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue labels Feb 28, 2025