-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: synthesize check constraints on enum columns #49284
Conversation
Left some todos in here that need to get resolved as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @RaduBerinde, and @rohany)
pkg/sql/opt_catalog.go, line 540 at r1 (raw file):
inboundFKs []optForeignKeyConstraint // checkConstraints is the set of synthesized check constraints for this
not just synthesized -- you've included all constraints below
pkg/sql/opt_catalog.go, line 663 at r1 (raw file):
} // Move all existing and synthesized checks into the opt table. ot.checkConstraints = make([]cat.CheckConstraint, 0, len(desc.Checks)+len(synthesizedChecks))
why were we looking at desc.ActiveChecks()
before, but now this is only looking at desc.Checks
?
pkg/sql/resolver.go, line 178 at r1 (raw file):
return nil, errors.AssertionFailedf("%s was not a type descriptor", rawDesc) } // TODO (rohany): Should we perform lookups on the parent database andschema
andschema -> and schema
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @RaduBerinde, and @rytaft)
pkg/sql/opt_catalog.go, line 540 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
not just synthesized -- you've included all constraints below
Done.
pkg/sql/opt_catalog.go, line 663 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
why were we looking at
desc.ActiveChecks()
before, but now this is only looking atdesc.Checks
?
Not sure -- looking at the code it seems to be the same thing, but I'll stay consistent.
pkg/sql/resolver.go, line 178 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
andschema -> and schema
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @RaduBerinde, and @rytaft)
pkg/sql/opt/exec/execbuilder/testdata/enums, line 54 at r2 (raw file):
└── [/'hi' - /'hi'] # TODO (rohany): Why is the NOT NULL important here? I couldn't get this
Do you know the answer to this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @RaduBerinde)
pkg/sql/opt/exec/execbuilder/testdata/enums, line 54 at r2 (raw file):
Previously, rohany (Rohan Yadav) wrote…
Do you know the answer to this?
I think we just haven't added support for it yet, but I think we probably could... @RaduBerinde probably knows better since he was working with check constraints and their interactions with nulls recently....
pkg/sql/opt/exec/execbuilder/testdata/enums, line 54 at r2 (raw file): Previously, rytaft (Rebecca Taft) wrote…
This comment is probably relevant:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rohany)
pkg/sql/opt/exec/execbuilder/testdata/enums, line 54 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
This comment is probably relevant:
// Check constraints that are guaranteed to not evaluate to NULL // are the only ones converted into filters. This is because a NULL // constraint is interpreted as passing, whereas a NULL filter is not.
The problem is that CHECK (expr)
"passes" if the expression evaluates to NULL so we can't blindly convert it to a filter. We could do a better job by adding an OR x IS NULL
in this case (in general, in cases where the result is NULL iff a given variable is NULL).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rohany)
pkg/sql/opt/exec/execbuilder/testdata/enums, line 54 at r2 (raw file):
Previously, RaduBerinde wrote…
The problem is that
CHECK (expr)
"passes" if the expression evaluates to NULL so we can't blindly convert it to a filter. We could do a better job by adding anOR x IS NULL
in this case (in general, in cases where the result is NULL iff a given variable is NULL).
If you want to play around with that, this is where that logic would live: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/optbuilder/select.go#L599
TFTRs! I'm going to hold off on merging this until some other things I'm working on answer the TODO's in here.
Sure, it would be fun to take a look at that after. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome - can't wait to see this working in concert with partitioning!
I've been thinking about how these checks can interact with the It seems like the check constraints in the optimizer can catch most cases of attempting to insert a value that is only in the readable state. However, we cannot restrict this check constraint to be only the readable values of the enum -- another node with the value in the writeable state could have inserted that value before the current node has the value in the writeable state. So it seems like we want to instead have a notion of "checks to enforce on reads" and "checks to enforce on writes" in the optimizer. Does this make sense? |
hey @RaduBerinde, @rytaft did you get a chance to read my comment here? |
That sounds right to me, we would need the check constraints to enforce on writes to be different than the check constraints that we can use to make assumptions about existing data. They are kind of different already, in that the latter are a subset (because of non-validated constraints, or because of NULL handling in constraint conditions). |
As a potential (maybe temporary) solution to avoid correctness issues, we can use a big hammer and treat any constraint involving an enum that is being changed as non-validated. |
I think that this is the correct solution. It looks like the optimizer already has logic of using only validated constraints on reads, so I can just piggyback onto that here. I'm planning on just synthesizing an extra unvalidated constraint containing the "read only" enum members to be verified on mutation operations, and then keeping the validated constraint with all members of the enum to be used on reads. Given that I don't have to do a bigger change here when Can y'all take a quick second look at this? |
This PR teaches the optimizer how to synthesize check constraints on columns of an ENUM type, allowing queries like: ``` CREATE TYPE t AS ENUM ('howdy', 'hello'); CREATE TABLE tt (x t, y INT, PRIMARY KEY (x, y)); SELECT x, y FROM tt WHERE y = 2 ``` to be planned using constrained spans on the enum values, rather than a full table scan. Release note (performance improvement): Allow the optimizer to use enum information to generate better query plans.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @rohany)
TFTR! bors r+ |
Build succeeded |
Fixes #49263.
This PR teaches the optimizer how to synthesize check constraints on
columns of an ENUM type, allowing queries like:
to be planned using constrained spans on the enum values, rather than a
full table scan.
Release note (performance improvement): Allow the optimizer to use enum
information to generate better query plans.