-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: large IN filter causes stack overflow in CombineComputedColFilters #132669
Labels
branch-master
Failures and bugs on the master branch.
branch-release-22.2
Used to mark GA and release blockers, technical advisories, and bugs for 22.2
branch-release-23.1
Used to mark GA and release blockers, technical advisories, and bugs for 23.1
branch-release-23.2
Used to mark GA and release blockers, technical advisories, and bugs for 23.2
branch-release-24.1
Used to mark GA and release blockers, technical advisories, and bugs for 24.1
branch-release-24.2
Used to mark GA and release blockers, technical advisories, and bugs for 24.2
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
S-2-temp-unavailability
Temp crashes or other availability problems. Can be worked around or resolved by restarting.
T-sql-queries
SQL Queries Team
Comments
mgartner
added
the
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
label
Oct 15, 2024
Hi @mgartner, please add branch-* labels to identify which branch(es) this C-bug affects. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
mgartner
added a commit
to mgartner/cockroach
that referenced
this issue
Oct 15, 2024
Previously, a filter of the form `col IN (elem0, elem1, ..., elemN)` could result in a stack overflow in the optimizer when `N` is very large, e.g., 1.6 million or more . The problem would occur when trying to derive constraints for indexes on computed column, specifically when those computed columns are dependent on the column in the filter, e.g., `col` in the example. The current implementation builds an `OrExpr` tree with depth equal to the length of the `IN` list (see the TODO in `makeSpansForExpr` suggesting that this should be avoided). It then constructs a `FiltersItem` with the `OrExpr` tree as the condition, causing logical properties to be built for the expression. When building logical properties, the `OrExpr` tree is traversed recursively, which causes the stack to overflow when the tree is very deep. Now, a `FiltersItems`, and hence, logical properties are not built for the `OrExpr` tree—they are not necessary—which avoids this particular type of stack overflow. Fixes cockroachdb#132669 Release note (bug fix): A bug in the query optimizer which could cause CockroachDB nodes to crash in rare cases has been fixed. The bug could occur when a query contains a filter in the form `col IN (elem0, elem1, ..., elemN)` only when `N` is very large, e.g., 1.6+ million, and when `col` exists in a hash-sharded index or exists a table with an indexed, computed column dependent on `col`.
mgartner
added a commit
to mgartner/cockroach
that referenced
this issue
Oct 15, 2024
Previously, a filter of the form `col IN (elem0, elem1, ..., elemN)` could result in a stack overflow in the optimizer when `N` is very large, e.g., 1.6 million or more . The problem would occur when trying to derive constraints for indexes on computed column, specifically when those computed columns are dependent on the column in the filter, e.g., `col` in the example. The current implementation builds an `OrExpr` tree with depth equal to the length of the `IN` list (see the TODO in `makeSpansForExpr` suggesting that this should be avoided). It then constructs a `FiltersItem` with the `OrExpr` tree as the condition, causing logical properties to be built for the expression. When building logical properties, the `OrExpr` tree is traversed recursively, which causes the stack to overflow when the tree is very deep. Now, a `FiltersItems` is never constructed, thus logical properties are not built for the `OrExpr` tree, avoiding this particular type of stack overflow. Fixes cockroachdb#132669 Release note (bug fix): A bug in the query optimizer which could cause CockroachDB nodes to crash in rare cases has been fixed. The bug could occur when a query contains a filter in the form `col IN (elem0, elem1, ..., elemN)` only when `N` is very large, e.g., 1.6+ million, and when `col` exists in a hash-sharded index or exists a table with an indexed, computed column dependent on `col`.
mgartner
added a commit
to mgartner/cockroach
that referenced
this issue
Oct 15, 2024
Previously, a filter of the form `col IN (elem0, elem1, ..., elemN)` could result in a stack overflow in the optimizer when `N` is very large, e.g., 1.6 million or more . The problem would occur when trying to derive constraints for indexes on computed column, specifically when those computed columns are dependent on the column in the filter, e.g., `col` in the example. The current implementation builds an `OrExpr` tree with depth equal to the length of the `IN` list (see the TODO in `makeSpansForExpr` suggesting that this should be avoided). It then constructs a `FiltersItem` with the `OrExpr` tree as the condition, causing logical properties to be built for the expression. When building logical properties, the `OrExpr` tree is traversed recursively, which causes the stack to overflow when the tree is very deep. Now, a `FiltersItems` is never constructed, thus logical properties are not built for the `OrExpr` tree, avoiding this particular type of stack overflow. Fixes cockroachdb#132669 Release note (bug fix): A bug in the query optimizer which could cause CockroachDB nodes to crash in rare cases has been fixed. The bug could occur when a query contains a filter in the form `col IN (elem0, elem1, ..., elemN)` only when `N` is very large, e.g., 1.6+ million, and when `col` exists in a hash-sharded index or exists a table with an indexed, computed column dependent on `col`.
mgartner
added a commit
to mgartner/cockroach
that referenced
this issue
Oct 15, 2024
Previously, a filter of the form `col IN (elem0, elem1, ..., elemN)` could result in a stack overflow in the optimizer when `N` is very large, e.g., 1.6 million or more . The problem would occur when trying to derive constraints for indexes on computed column, specifically when those computed columns are dependent on the column in the filter, e.g., `col` in the example. The current implementation builds an `OrExpr` tree with depth equal to the length of the `IN` list (see the TODO in `makeSpansForExpr` suggesting that this should be avoided). It then constructs a `FiltersItem` with the `OrExpr` tree as the condition, causing logical properties to be built for the expression. When building logical properties, the `OrExpr` tree is traversed recursively, which causes the stack to overflow when the tree is very deep. Now, a `FiltersItems` is never constructed, thus logical properties are not built for the `OrExpr` tree, avoiding this particular type of stack overflow. Fixes cockroachdb#132669 Release note (bug fix): A bug in the query optimizer which could cause CockroachDB nodes to crash in rare cases has been fixed. The bug could occur when a query contains a filter in the form `col IN (elem0, elem1, ..., elemN)` only when `N` is very large, e.g., 1.6+ million, and when `col` exists in a hash-sharded index or exists a table with an indexed, computed column dependent on `col`.
mgartner
added
branch-master
Failures and bugs on the master branch.
branch-release-23.1
Used to mark GA and release blockers, technical advisories, and bugs for 23.1
branch-release-24.1
Used to mark GA and release blockers, technical advisories, and bugs for 24.1
branch-release-23.2
Used to mark GA and release blockers, technical advisories, and bugs for 23.2
branch-release-24.2
Used to mark GA and release blockers, technical advisories, and bugs for 24.2
labels
Oct 15, 2024
This only reproduces on v22.2.x and v23.1.x if the session setting |
mgartner
added
the
branch-release-22.2
Used to mark GA and release blockers, technical advisories, and bugs for 22.2
label
Oct 15, 2024
mgartner
added a commit
to mgartner/cockroach
that referenced
this issue
Oct 15, 2024
Previously, a filter of the form `col IN (elem0, elem1, ..., elemN)` could result in a stack overflow in the optimizer when `N` is very large, e.g., 1.6 million or more . The problem would occur when trying to derive constraints for indexes on computed column, specifically when those computed columns are dependent on the column in the filter, e.g., `col` in the example. The current implementation builds an `OrExpr` tree with depth equal to the length of the `IN` list (see the TODO in `makeSpansForExpr` suggesting that this should be avoided). It then constructs a `FiltersItem` with the `OrExpr` tree as the condition, causing logical properties to be built for the expression. When building logical properties, the `OrExpr` tree is traversed recursively, which causes the stack to overflow when the tree is very deep. Now, a `FiltersItems` is never constructed, thus logical properties are not built for the `OrExpr` tree, avoiding this particular type of stack overflow. Fixes cockroachdb#132669 Release note (bug fix): A bug in the query optimizer which could cause CockroachDB nodes to crash in rare cases has been fixed. The bug could occur when a query contains a filter in the form `col IN (elem0, elem1, ..., elemN)` only when `N` is very large, e.g., 1.6+ million, and when `col` exists in a hash-sharded index or exists a table with an indexed, computed column dependent on `col`.
mgartner
added
the
S-2-temp-unavailability
Temp crashes or other availability problems. Can be worked around or resolved by restarting.
label
Oct 15, 2024
mgartner
added a commit
to mgartner/cockroach
that referenced
this issue
Oct 16, 2024
Previously, a filter of the form `col IN (elem0, elem1, ..., elemN)` could result in a stack overflow in the optimizer when `N` is very large, e.g., 1.6 million or more . The problem would occur when trying to derive constraints for indexes on computed column, specifically when those computed columns are dependent on the column in the filter, e.g., `col` in the example. The current implementation builds an `OrExpr` tree with depth equal to the length of the `IN` list (see the TODO in `makeSpansForExpr` suggesting that this should be avoided). It then constructs a `FiltersItem` with the `OrExpr` tree as the condition, causing logical properties to be built for the expression. When building logical properties, the `OrExpr` tree is traversed recursively, which causes the stack to overflow when the tree is very deep. Now, an `OrExpr` tree is never built. Instead, `CombineComputedColFilters` returns a slice of `ScalarExpr`s that represents a disjunction of expressions. This effectively eliminates the possibility of stack overflows in code that recursively traverses the derived expressions. It also simplified the logic. Fixes cockroachdb#132669 Release note (bug fix): A bug in the query optimizer which could cause CockroachDB nodes to crash in rare cases has been fixed. The bug could occur when a query contains a filter in the form `col IN (elem0, elem1, ..., elemN)` only when `N` is very large, e.g., 1.6+ million, and when `col` exists in a hash-sharded index or exists a table with an indexed, computed column dependent on `col`.
craig bot
pushed a commit
that referenced
this issue
Oct 17, 2024
132678: schemachanger: disable dsc zone config by default r=rafiss a=annrpom This patch disables zone config statements in the declarative schema changer by default. Release note: None Epic: None 132701: opt: prevent stack overflow when deriving computed column filters r=mgartner a=mgartner #### opt: add max-stack opttest option This commit adds the `max-stack` option for opttests, allowing tests to alter the maximum stack size for the goroutine that optimizes the query. Release note: None #### opt: prevent stack overflow when deriving computed column filters Previously, a filter of the form `col IN (elem0, elem1, ..., elemN)` could result in a stack overflow in the optimizer when `N` is very large, e.g., 1.6 million or more . The problem would occur when trying to derive constraints for indexes on computed column, specifically when those computed columns are dependent on the column in the filter, e.g., `col` in the example. The current implementation builds an `OrExpr` tree with depth equal to the length of the `IN` list. It then constructs a `FiltersItem` with the `OrExpr` tree as the condition, causing logical properties to be built for the expression. When building logical properties, the `OrExpr` tree is traversed recursively, which causes the stack to overflow when the tree is very deep. Now, an `OrExpr` tree is never built. Instead, `CombineComputedColFilters` returns a slice of `ScalarExpr`s that represents a disjunction of expressions. This effectively eliminates the possibility of stack overflows in code that recursively traverses the derived expressions. It also simplified the logic. Fixes #132669 Release note (bug fix): A bug in the query optimizer which could cause CockroachDB nodes to crash in rare cases has been fixed. The bug could occur when a query contains a filter in the form `col IN (elem0, elem1, ..., elemN)` only when `N` is very large, e.g., 1.6+ million, and when `col` exists in a hash-sharded index or exists a table with an indexed, computed column dependent on `col`. 132849: go.mod: bump Pebble to c6b097e7e6a0 r=dt a=jbowens Changes: * [`c6b097e7`](cockroachdb/pebble@c6b097e7) sstable: add jemalloc size classes to tests * [`b07f0b23`](cockroachdb/pebble@b07f0b23) db: fix Options.Parse handling of unrecognized comparer * [`9f164613`](cockroachdb/pebble@9f164613) db: fix a few instances of the wrong table format * [`dd26bf0e`](cockroachdb/pebble@dd26bf0e) db: delete compactions output in case of late cancellation * [`1ed55247`](cockroachdb/pebble@1ed55247) block: rewrite block flush logic * [`887598ae`](cockroachdb/pebble@887598ae) sstable: set w.indexBlockSize in colBlkWriter * [`030afd26`](cockroachdb/pebble@030afd26) colblk: use UnsafeOffsets.At2 in another instance * [`b3993ff6`](cockroachdb/pebble@b3993ff6) db: persist Experimental.EnableColumnarBlocks in OPTIONS file * [`c719c4e5`](cockroachdb/pebble@c719c4e5) db: support multiple KeySchemas Release note: none. Epic: none. Co-authored-by: Annie Pompa <annie@cockroachlabs.com> Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com> Co-authored-by: Jackson Owens <jackson@cockroachlabs.com>
blathers-crl bot
pushed a commit
that referenced
this issue
Oct 17, 2024
Previously, a filter of the form `col IN (elem0, elem1, ..., elemN)` could result in a stack overflow in the optimizer when `N` is very large, e.g., 1.6 million or more . The problem would occur when trying to derive constraints for indexes on computed column, specifically when those computed columns are dependent on the column in the filter, e.g., `col` in the example. The current implementation builds an `OrExpr` tree with depth equal to the length of the `IN` list. It then constructs a `FiltersItem` with the `OrExpr` tree as the condition, causing logical properties to be built for the expression. When building logical properties, the `OrExpr` tree is traversed recursively, which causes the stack to overflow when the tree is very deep. Now, an `OrExpr` tree is never built. Instead, `CombineComputedColFilters` returns a slice of `ScalarExpr`s that represents a disjunction of expressions. This effectively eliminates the possibility of stack overflows in code that recursively traverses the derived expressions. It also simplified the logic. Fixes #132669 Release note (bug fix): A bug in the query optimizer which could cause CockroachDB nodes to crash in rare cases has been fixed. The bug could occur when a query contains a filter in the form `col IN (elem0, elem1, ..., elemN)` only when `N` is very large, e.g., 1.6+ million, and when `col` exists in a hash-sharded index or exists a table with an indexed, computed column dependent on `col`.
mgartner
added a commit
to mgartner/cockroach
that referenced
this issue
Oct 21, 2024
Previously, a filter of the form `col IN (elem0, elem1, ..., elemN)` could result in a stack overflow in the optimizer when `N` is very large, e.g., 1.6 million or more . The problem would occur when trying to derive constraints for indexes on computed column, specifically when those computed columns are dependent on the column in the filter, e.g., `col` in the example. The current implementation builds an `OrExpr` tree with depth equal to the length of the `IN` list. It then constructs a `FiltersItem` with the `OrExpr` tree as the condition, causing logical properties to be built for the expression. When building logical properties, the `OrExpr` tree is traversed recursively, which causes the stack to overflow when the tree is very deep. Now, an `OrExpr` tree is never built. Instead, `CombineComputedColFilters` returns a slice of `ScalarExpr`s that represents a disjunction of expressions. This effectively eliminates the possibility of stack overflows in code that recursively traverses the derived expressions. It also simplified the logic. Fixes cockroachdb#132669 Release note (bug fix): A bug in the query optimizer which could cause CockroachDB nodes to crash in rare cases has been fixed. The bug could occur when a query contains a filter in the form `col IN (elem0, elem1, ..., elemN)` only when `N` is very large, e.g., 1.6+ million, and when `col` exists in a hash-sharded index or exists a table with an indexed, computed column dependent on `col`.
This was referenced Oct 21, 2024
mgartner
added a commit
to mgartner/cockroach
that referenced
this issue
Oct 21, 2024
Previously, a filter of the form `col IN (elem0, elem1, ..., elemN)` could result in a stack overflow in the optimizer when `N` is very large, e.g., 1.6 million or more . The problem would occur when trying to derive constraints for indexes on computed column, specifically when those computed columns are dependent on the column in the filter, e.g., `col` in the example. The current implementation builds an `OrExpr` tree with depth equal to the length of the `IN` list. It then constructs a `FiltersItem` with the `OrExpr` tree as the condition, causing logical properties to be built for the expression. When building logical properties, the `OrExpr` tree is traversed recursively, which causes the stack to overflow when the tree is very deep. Now, an `OrExpr` tree is never built. Instead, `CombineComputedColFilters` returns a slice of `ScalarExpr`s that represents a disjunction of expressions. This effectively eliminates the possibility of stack overflows in code that recursively traverses the derived expressions. It also simplified the logic. Fixes cockroachdb#132669 Release note (bug fix): A bug in the query optimizer which could cause CockroachDB nodes to crash in rare cases has been fixed. The bug could occur when a query contains a filter in the form `col IN (elem0, elem1, ..., elemN)` only when `N` is very large, e.g., 1.6+ million, and when `col` exists in a hash-sharded index or exists a table with an indexed, computed column dependent on `col`.
mgartner
added a commit
that referenced
this issue
Oct 22, 2024
Previously, a filter of the form `col IN (elem0, elem1, ..., elemN)` could result in a stack overflow in the optimizer when `N` is very large, e.g., 1.6 million or more . The problem would occur when trying to derive constraints for indexes on computed column, specifically when those computed columns are dependent on the column in the filter, e.g., `col` in the example. The current implementation builds an `OrExpr` tree with depth equal to the length of the `IN` list. It then constructs a `FiltersItem` with the `OrExpr` tree as the condition, causing logical properties to be built for the expression. When building logical properties, the `OrExpr` tree is traversed recursively, which causes the stack to overflow when the tree is very deep. Now, an `OrExpr` tree is never built. Instead, `CombineComputedColFilters` returns a slice of `ScalarExpr`s that represents a disjunction of expressions. This effectively eliminates the possibility of stack overflows in code that recursively traverses the derived expressions. It also simplified the logic. Fixes #132669 Release note (bug fix): A bug in the query optimizer which could cause CockroachDB nodes to crash in rare cases has been fixed. The bug could occur when a query contains a filter in the form `col IN (elem0, elem1, ..., elemN)` only when `N` is very large, e.g., 1.6+ million, and when `col` exists in a hash-sharded index or exists a table with an indexed, computed column dependent on `col`.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
branch-master
Failures and bugs on the master branch.
branch-release-22.2
Used to mark GA and release blockers, technical advisories, and bugs for 22.2
branch-release-23.1
Used to mark GA and release blockers, technical advisories, and bugs for 23.1
branch-release-23.2
Used to mark GA and release blockers, technical advisories, and bugs for 23.2
branch-release-24.1
Used to mark GA and release blockers, technical advisories, and bugs for 24.1
branch-release-24.2
Used to mark GA and release blockers, technical advisories, and bugs for 24.2
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
S-2-temp-unavailability
Temp crashes or other availability problems. Can be worked around or resolved by restarting.
T-sql-queries
SQL Queries Team
See the reproduction notes here: https://gist.github.com/mgartner/c06b0ad259e50344c93430ff48282355
Jira issue: CRDB-43218
The text was updated successfully, but these errors were encountered: