sql: support range based lookup join spans
Informs #51576

If filters exist on a lookup join that match columns we are doing the
lookup against, add them to the lookupExpr in the join reader spec and
build those filters into the multispan generator.

If we have inequality conditions, we need to be able to look up the
prefix for keys found against range spans and not just point spans, so
build a sorted slice of span+inputRowIndices that we can binary search on.
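
The sorted-slice lookup described above might look roughly like the following. This is a minimal sketch, not the code from this commit: the `span` and `spanRows` types, `compareKey`, and `rowsForKey` are hypothetical names, and it assumes the spans are non-overlapping and sorted by start key.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// span is a half-open key range [key, endKey). Hypothetical stand-in type.
type span struct {
	key, endKey []byte
}

// spanRows pairs a lookup span with the input rows that generated it.
type spanRows struct {
	span
	inputRowIndices []int
}

// compareKey returns -1 if k sorts before the span, 0 if the span
// contains k, and 1 if k sorts at or after the span's end key.
func (s span) compareKey(k []byte) int {
	if bytes.Compare(k, s.key) < 0 {
		return -1
	}
	if bytes.Compare(k, s.endKey) < 0 {
		return 0
	}
	return 1
}

// rowsForKey binary searches the sorted, non-overlapping spans for the
// one containing k and returns its input row indices, or nil if none.
func rowsForKey(sorted []spanRows, k []byte) []int {
	// Find the first span that k does not sort after.
	i := sort.Search(len(sorted), func(i int) bool {
		return sorted[i].compareKey(k) <= 0
	})
	if i < len(sorted) && sorted[i].compareKey(k) == 0 {
		return sorted[i].inputRowIndices
	}
	return nil
}

func main() {
	sorted := []spanRows{
		{span{[]byte("a/2"), []byte("a/9")}, []int{0, 3}},
		{span{[]byte("b/2"), []byte("b/9")}, []int{1}},
	}
	fmt.Println(rowsForKey(sorted, []byte("a/5"))) // [0 3]
	fmt.Println(rowsForKey(sorted, []byte("b/1"))) // []
}
```

With point spans a map from key prefix to row indices is enough; with range spans a fetched key can fall anywhere inside a span, which is why an ordered structure and a three-way comparison are needed.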

Issue #51576 also encompasses allowing inequalities on columns from the
index to reference columns from the input; that will come in a later
commit.

Release note (sql change): Improve performance of lookup joins in some
cases. If join inequality conditions can be matched to index columns,
include the conditions in the index lookup spans and remove them from
the runtime filters.
cucaroach committed Jul 16, 2021
1 parent 9ddc831 commit 2943ec0
Showing 23 changed files with 1,786 additions and 158 deletions.
27 changes: 18 additions & 9 deletions pkg/ccl/logictestccl/testdata/logic_test/regional_by_row
@@ -1652,6 +1652,9 @@ ALTER TABLE regional_by_row_table ADD CONSTRAINT unique_b_a UNIQUE(b, a)

 # We should plan uniqueness checks for all unique indexes in
 # REGIONAL BY ROW tables.
+# TODO(treilly): The constraint check for uniq_idx should use uniq_idx but due
+# to stats issues w/ empty stats, partial indexes and multicol stats its not.
+# Hopefully fixing #67583 (and possibly #67479) will resolve this.
 query T
 SELECT * FROM [EXPLAIN INSERT INTO regional_by_row_table (pk, pk2, a, b) VALUES (1, 1, 1, 1)] OFFSET 2
 ----
@@ -1698,9 +1701,9 @@ SELECT * FROM [EXPLAIN INSERT INTO regional_by_row_table (pk, pk2, a, b) VALUES
 │ └── • error if rows
 │ │
 │ └── • lookup join (semi)
-│ │ table: regional_by_row_table@uniq_idx (partial index)
-│ │ lookup condition: (column3 = a) AND (crdb_region = 'ap-southeast-2')
-│ │ remote lookup condition: (column3 = a) AND (crdb_region IN ('ca-central-1', 'us-east-1'))
+│ │ table: regional_by_row_table@new_idx
+│ │ lookup condition: ((column3 = a) AND (crdb_region = 'ap-southeast-2')) AND (b > 0)
+│ │ remote lookup condition: ((column3 = a) AND (crdb_region IN ('ca-central-1', 'us-east-1'))) AND (b > 0)
 │ │ pred: (column1 != pk) OR (crdb_region_default != crdb_region)
 │ │
 │ └── • filter
@@ -1728,6 +1731,9 @@ INSERT INTO regional_by_row_table (crdb_region, pk, pk2, a, b) VALUES ('us-east-

 # The conflict columns in an upsert should only include the primary key,
 # not the region column.
+# TODO(treilly): The constraint check for uniq_idx should use uniq_idx but due
+# to stats issues w/ empty stats, partial indexes and multicol stats its not.
+# Hopefully fixing #67583 (and possibly #67479) will resolve this.
 query T
 SELECT * FROM [EXPLAIN UPSERT INTO regional_by_row_table (crdb_region, pk, pk2, a, b) VALUES ('us-east-1', 2, 3, 2, 3)] OFFSET 2
 ----
@@ -1779,9 +1785,9 @@ SELECT * FROM [EXPLAIN UPSERT INTO regional_by_row_table (crdb_region, pk, pk2,
 │ └── • error if rows
 │ │
 │ └── • lookup join (semi)
-│ │ table: regional_by_row_table@uniq_idx (partial index)
-│ │ lookup condition: (column4 = a) AND (crdb_region = 'ap-southeast-2')
-│ │ remote lookup condition: (column4 = a) AND (crdb_region IN ('ca-central-1', 'us-east-1'))
+│ │ table: regional_by_row_table@new_idx
+│ │ lookup condition: ((column4 = a) AND (crdb_region = 'ap-southeast-2')) AND (b > 0)
+│ │ remote lookup condition: ((column4 = a) AND (crdb_region IN ('ca-central-1', 'us-east-1'))) AND (b > 0)
 │ │ pred: (upsert_pk != pk) OR (column1 != crdb_region)
 │ │
 │ └── • filter
@@ -1803,6 +1809,9 @@ SELECT * FROM [EXPLAIN UPSERT INTO regional_by_row_table (crdb_region, pk, pk2,
 └── • scan buffer
 label: buffer 1

+# TODO(treilly): The constraint check for uniq_idx should use uniq_idx but due
+# to stats issues w/ empty stats, partial indexes and multicol stats its not.
+# Hopefully fixing #67583 (and possibly #67479) will resolve this.
 query T
 SELECT * FROM [EXPLAIN UPSERT INTO regional_by_row_table (crdb_region, pk, pk2, a, b)
 VALUES ('us-east-1', 23, 24, 25, 26), ('ca-central-1', 30, 30, 31, 32)] OFFSET 2
@@ -1850,9 +1859,9 @@ VALUES ('us-east-1', 23, 24, 25, 26), ('ca-central-1', 30, 30, 31, 32)] OFFSET 2
 │ └── • error if rows
 │ │
 │ └── • lookup join (semi)
-│ │ table: regional_by_row_table@uniq_idx (partial index)
-│ │ lookup condition: (column4 = a) AND (crdb_region = 'ap-southeast-2')
-│ │ remote lookup condition: (column4 = a) AND (crdb_region IN ('ca-central-1', 'us-east-1'))
+│ │ table: regional_by_row_table@new_idx
+│ │ lookup condition: ((column4 = a) AND (crdb_region = 'ap-southeast-2')) AND (b > 0)
+│ │ remote lookup condition: ((column4 = a) AND (crdb_region IN ('ca-central-1', 'us-east-1'))) AND (b > 0)
 │ │ pred: (upsert_pk != pk) OR (column1 != crdb_region)
 │ │
 │ └── • filter
12 changes: 12 additions & 0 deletions pkg/roachpb/data.go
@@ -2244,6 +2244,18 @@ func (s Span) ContainsKey(key Key) bool {
 	return bytes.Compare(key, s.Key) >= 0 && bytes.Compare(key, s.EndKey) < 0
 }

+// CompareKey returns -1 if the key precedes the span start, 0 if its contained
+// by the span and 1 if its after the end of the span.
+func (s Span) CompareKey(key Key) int {
+	if bytes.Compare(key, s.Key) >= 0 {
+		if bytes.Compare(key, s.EndKey) < 0 {
+			return 0
+		}
+		return 1
+	}
+	return -1
+}
+
 // ProperlyContainsKey returns whether the span properly contains the given key.
 func (s Span) ProperlyContainsKey(key Key) bool {
 	return bytes.Compare(key, s.Key) > 0 && bytes.Compare(key, s.EndKey) < 0
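
To make the boundary behavior of `CompareKey` concrete, here is a small standalone sketch. It redeclares simplified `Key` and `Span` types locally rather than importing `roachpb` (an assumption made only so the example runs on its own) and copies the two method bodies from the hunk above. Because spans are half-open, the start key compares as contained (0) while the end key compares as after the span (1).

```go
package main

import (
	"bytes"
	"fmt"
)

// Key and Span are simplified local stand-ins for the roachpb types.
type Key []byte

type Span struct {
	Key, EndKey Key
}

// ContainsKey reports whether key lies in the half-open range [Key, EndKey).
func (s Span) ContainsKey(key Key) bool {
	return bytes.Compare(key, s.Key) >= 0 && bytes.Compare(key, s.EndKey) < 0
}

// CompareKey returns -1 if key precedes the span start, 0 if the span
// contains key, and 1 if key is at or past the span end.
func (s Span) CompareKey(key Key) int {
	if bytes.Compare(key, s.Key) >= 0 {
		if bytes.Compare(key, s.EndKey) < 0 {
			return 0
		}
		return 1
	}
	return -1
}

func main() {
	s := Span{Key: Key("b"), EndKey: Key("d")}
	for _, k := range []string{"a", "b", "c", "d"} {
		fmt.Printf("%q: compare=%d contains=%t\n",
			k, s.CompareKey(Key(k)), s.ContainsKey(Key(k)))
	}
	// "a": compare=-1 contains=false
	// "b": compare=0 contains=true
	// "c": compare=0 contains=true
	// "d": compare=1 contains=false
}
```

`CompareKey` returns 0 exactly when `ContainsKey` is true, so it can serve as the ordering function for a binary search over sorted spans.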
5 changes: 3 additions & 2 deletions pkg/sql/execinfrapb/processors_sql.pb.go

Some generated files are not rendered by default.

5 changes: 3 additions & 2 deletions pkg/sql/execinfrapb/processors_sql.proto
@@ -280,12 +280,13 @@ message JoinReaderSpec {
 // more complicated than a simple equality between input columns and index
 // columns. In this case, LookupExpr specifies the expression that will be
 // used to construct the spans for each lookup. Currently, the only
-// expressions supported are conjunctions (AND expressions) of equality and
-// IN expressions, specifically:
+// expressions supported are conjunctions (AND expressions) of equality, IN
+// expressions, and simple inequalities, specifically:
 // 1. equalities between two variables (one from the input and one from the
 // index) representing the equi-join condition(s),
 // 2. equalities between an index column and a constant, and
 // 3. IN expressions between an index column and a tuple of constants.
+// 4. LT,GT,GE,LE between an index var and a constant.
 //
 // Variables in this expression are assigned in the same way as the ON
 // condition below. Assuming that the left stream has N columns and the right
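
As a rough illustration of how case 4 turns into a range span, the sketch below builds the span for a lookup expression of the form `(idx0 = <input value>) AND (idx1 <op> <constant>)`. Everything here is an assumption made for illustration: the fixed-width big-endian `enc` encoding, `prefixEnd`, `lookupSpan`, and the `op` constants are hypothetical and are not CockroachDB's key encoding or span builder. The equality column forms the key prefix, and the inequality narrows one end of that prefix's range.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

type op int

const (
	gt op = iota
	ge
	lt
	le
)

// enc returns prefix followed by v as 4 big-endian bytes, so that byte
// order matches numeric order for uint32 values (toy encoding).
func enc(prefix []byte, v uint32) []byte {
	out := make([]byte, len(prefix)+4)
	copy(out, prefix)
	binary.BigEndian.PutUint32(out[len(prefix):], v)
	return out
}

// prefixEnd returns the smallest key greater than every key that has k
// as a prefix, or nil if k consists only of 0xff bytes.
func prefixEnd(k []byte) []byte {
	end := append([]byte(nil), k...)
	for i := len(end) - 1; i >= 0; i-- {
		end[i]++
		if end[i] != 0 {
			return end[:i+1]
		}
	}
	return nil
}

// lookupSpan builds the half-open span [start, end) for
// (idx0 = eqVal) AND (idx1 <o> c).
func lookupSpan(eqVal uint32, o op, c uint32) (start, end []byte) {
	prefix := enc(nil, eqVal)
	start, end = prefix, prefixEnd(prefix)
	switch o {
	case gt:
		start = prefixEnd(enc(prefix, c)) // skip every key with idx1 = c
	case ge:
		start = enc(prefix, c)
	case lt:
		end = enc(prefix, c)
	case le:
		end = prefixEnd(enc(prefix, c)) // include every key with idx1 = c
	}
	return start, end
}

// contains reports whether k lies in [start, end).
func contains(start, end, k []byte) bool {
	return bytes.Compare(k, start) >= 0 && bytes.Compare(k, end) < 0
}

func main() {
	// Input row value 7 for the equality column, and idx1 > 1.
	start, end := lookupSpan(7, gt, 1)
	fmt.Printf("%x %x\n", start, end) // 0000000700000002 00000008
	fmt.Println(contains(start, end, enc(enc(nil, 7), 2))) // true
	fmt.Println(contains(start, end, enc(enc(nil, 7), 1))) // false
}
```

Rows excluded by the inequality are never fetched, which is the sense in which the condition can be removed from the runtime filter.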
3 changes: 1 addition & 2 deletions pkg/sql/opt/exec/execbuilder/testdata/lookup_join
@@ -75,8 +75,7 @@ vectorized: true
 │ columns: (a, b, c, d, e, f)
 │ estimated row count: 33
 │ table: def@primary
-│ equality: (b) = (f)
-│ pred: e > 1
+│ lookup condition: (f = b) AND (e > 1)
 └── • scan
 columns: (a, b, c)