scpb,scdecomp,scbuild: remodel elements, use decomposition#76776
scpb,scdecomp,scbuild: remodel elements, use decomposition#76776craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
a6f7602 to
424e253
Compare
ajwerner
left a comment
There was a problem hiding this comment.
Okay, I didn't make it all the way through in detail, but I'll give you a follow-up on the train tomorrow. I feel very very good about this direction. Let's get some form of this merged Monday.
| return nil, err | ||
| } | ||
| for _, si := range seqIdents { | ||
| seqIDs.Add(descpb.ID(si.SeqID)) |
There was a problem hiding this comment.
This is assuming that all of the idents have been translated. Should we return an error if SeqName is in use here and SeqID is zero?
There was a problem hiding this comment.
I'm making that assumption more explicit in the contract of the scdecomp entry point.
There was a problem hiding this comment.
relying on this ordering is fine, but it is a bit subtle. Is it sufficiently commented on the interface / package?
There was a problem hiding this comment.
I'm not quite sure what this comment refers to but I hope that in my latest revision things will be more clear.
| panic(pgerror.Newf(pgcode.NoPrimaryKey, "missing active primary key")) | ||
| } | ||
| // Drop all existing primary index elements. | ||
| b.Drop(existing) |
There was a problem hiding this comment.
Do we want to make this call if we're already dropping existing? I'm fine with the answer being yes, just trying to get a feel for the paradigm.
There was a problem hiding this comment.
existing targets PUBLIC, but this just confirms that this code isn't super clear.
|
Thanks for taking a look! I'm still fixing bugs and will address your comments afterwards. I'm in broad agreement with all of what you say. I'm pushing a fixup commit with the new rules. |
ajwerner
left a comment
There was a problem hiding this comment.
Flushing out some comments I had queued from yesterday. Don't know if they're stale.
|
|
||
| message View { | ||
| uint32 table_id = 1 [(gogoproto.customname) = "TableID", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/sem/catid.DescID"]; | ||
| message SchemaParent { |
There was a problem hiding this comment.
This element deserves a comment because it's not obvious what it means. In general this whole file you benefit from commentary but this was stuck out as particularly non-obvious. I imagine this relates to the name entry of the schema.
I ran into trouble with this element in restore if you want to restore a database that has a dropped schema as part of a schema change, then you'll have this element around even though the schema descriptor doesn't exist. I think, on some level, that this isn't really a problem with this element. Instead I'm thinking that when elements reach their terminal state in the NonRevertiblePostCommitPhase that we ought to drop them entirely. There's nothing to here specifically.
I guess I'll share that a thought I was having at the time is that we could consider modeling the database schema entry as an implementation detail of the namespace entry and couple removing the entry from the database descriptor to removing the namespace entry. I can't imagine wanting to ever remove or add one without the other.
There was a problem hiding this comment.
I'm adding a comment to explain these things in elements.proto but the short of it is, SchemaParent is responsible for updating the schema backrefs in the parent db throughout its lifecyle: remove the backref when targeting ABSENT and vice-versa.
Nothing's hard and fast here and this could conceivably be handled alternatively by Namespace or even Schema but I found this to be nicest.
pkg/sql/schemachanger/scplan/internal/opgen/opgen_column_default_expression.go
Show resolved
Hide resolved
pkg/sql/schemachanger/scplan/internal/opgen/opgen_column_on_update_expression.go
Outdated
Show resolved
Hide resolved
| to(scpb.Status_PUBLIC, | ||
| emit(func(this *scpb.ColumnOnUpdateExpression) scop.Op { | ||
| return &scop.AddColumnOnUpdateExpression{ | ||
| OnUpdate: *protoutil.Clone(this).(*scpb.ColumnOnUpdateExpression), |
There was a problem hiding this comment.
I wonder if we ought to add a safety check that no pointer in the op is equal to the input to the generator function. Certainly wouldn't ask for that in this change, but I definitely worry about future writers of these not adhering to such things.
There was a problem hiding this comment.
Yes. In fact there's a whole slew of safety checks we should add. Most pressing would be some kind of guarantee that for a DROP statement we never end up marking a descriptor as dropped post-commit!
3f2e28f to
3f06b15
Compare
|
Well, here it is, ready for review. I know right now the diff weighs in at +18k -11k but most of that is in data-driven test files. @ajwerner @fqazi please take a look if you can. I've been looking at #76610 and I think we should be able to rebase most of it on top of this without too much fuss. @fqazi thoughts? I'm aware tests are still failing in the |
3f06b15 to
2af5c73
Compare
|
@postamar I'm okay with rebasing the enable by default stuff on top. I have a bunch of hairy stuff still left like a random hang that will take some time too. Let me look at this today |
fqazi
left a comment
There was a problem hiding this comment.
Took a quick look this stuff is definitely much cleaner. Only one comment didn't see anything major I would call a blocker
Reviewed 33 of 123 files at r1, 145 of 145 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)
pkg/sql/schemachanger/scbuild/internal/scbuildstmt/alter_table_add_column.go, line 186 at r2 (raw file):
existing = idx if status == scpb.Status_ABSENT { freshlyAdded = idx
Should there ever be more than one of these?
2af5c73 to
4a777b7
Compare
|
Thanks for taking a look! I think I'm almost there with the tests.
I honestly don't know, I guess it's possible. Is it bad? Maybe not, as far as the ADD COLUMN is concerned. I added a comment. |
This change adresses some shortcomings of the declarative schema
changer that would have compromised its ability to evolve gracefully in
its current form. Basically these are:
1. the builder generating targets based off a hybrid descriptor-and-
element scheme, and
2. an awkward handling of back-references which precludes doing much
more than DROP statements correctly.
This change therefore consists of, respectively:
1. rewriting the builder to hide descriptors from the `scbuildstmt`
package, instead exposing sets of elements, and
2. reworking the element model definition by getting rid of
back-reference elements entirely.
In support of (1) this commit introduces a new `scdecomp` package which,
given a descriptor, will walk a visitor function over all its
constituent elements. This itself is made quite simple by (2) which
largely removes the need to look up referenced descriptors outside of
some mild edge-cases.
This `scdecomp` package is used by the backing structs of the
`scbuildstmt` dependency interfaces to collect a "state-of-the-world"
snapshot of elements upon which the schema change target elements are
layered. This approach lends itself well to caching as the descriptors
remain immutable in the builder.
The rewrite of most of `elements.proto` in support of (2) implies a slew
of cascading changes:
- the attributes in `screl` need to be adjusted,
- the lifecyles of the elements in `opgen` have to be adjusted,
- the dependency rules and no-op rules need to be adjusted,
- in turn, new operations need to be defined in `scop`, and,
- in turn, these operations need to be implemented in `scmutationexec`.
Elements are now responsible for updating any back-references that
correspond to their forward references, which effectively pushed that
complexity into these reference update ops in `scmutationexec`. These
have to operate on a best-effort basis, because there are no longer
back-reference elements with dependency rules to enforce convenient
assumptions about the context of their adding or removal. This is
arguably not a bad thing per-se: this new paradigm is "fail hard, fail
fast" which surfaces bugs a lot more quickly than a badly-written
dependency rule would.
The rules themselves fall into cleaner patterns. This commit provides some
tools to express the most common of these. This commit also unifies the
`deprules` and `scopt` packages into a commit `rules` package with full
data-driven test coverage.
Other changes in this commit are peripheral in nature:
- Working on this change surfaced some deficiencies in the
cross-referenced descriptor validation logic: we checked that the
referenced descriptor exists but not that it's not dropped. This
commit fixes this.
- The expression validation logic in `schemaexpr` quite reasonably
used to assume that columns can be found in descriptors;
unfortunately the builder needs to be able to use this for columns
which only exist as elements. This commit refactors the entry points
into this validation logic as a result.
- Quality-of-life improvements like adding a testing knob used to
panic TestRollback with an error with a full stack trace when the
rollback fails.
- Because back-references don't exist as elements anymore, they also
don't exist as targets, so we now have schema changer jobs linked to
descriptors for which there are zero targets. This commit addresses this
by simply allowing it. This is necessary to lock descriptors to
prevent any concurrent schema changes which would affect them.
Release note: None
4a777b7 to
69f0103
Compare
|
Yay, all tests pass! I get a test failure but it appears to be just a flake. bors r+ |
|
Build failed (retrying...): |
|
Build failed (retrying...): |
|
Build succeeded: |
76742: dev: don't exit immediately in ./dev ui watch r=irfansharif a=sjbarag ./dev ui watch previously didn't wait for the command context for webpack commands to complete before returning. This caused the 'watch' subcommand to stop immediately, then the 'ui' subcommand, then the entire 'dev' command to stop, exiting the entire process likely before webpack could even start. Wait for the webpack commands to complete (which they never should, since they're meant to run indefinitely) before returning from ./dev ui watch's implementation. Release note: None 76944: *: enable declarative schema changer r=postamar a=postamar This change enables the declarative schema changer by default. This PR originated as #76610 by `@fqazi` which I rebased following the merging of #76776. Only mild alterations were needed, these commits can therefore be considered as having been reviewed. Co-authored-by: Sean Barag <barag@cockroachlabs.com> Co-authored-by: Andrew Werner <awerner32@gmail.com> Co-authored-by: Faizan Qazi <faizan@cockroachlabs.com> Co-authored-by: Marius Posta <marius@cockroachlabs.com>
This change adresses some shortcomings of the declarative schema
changer that would have compromised its ability to evolve gracefully in
its current form. Basically these are:
element scheme, and
more than DROP statements correctly.
This change therefore consists of, respectively:
scbuildstmtpackage, instead exposing sets of elements, and
back-reference elements entirely.
In support of (1) this commit introduces a new
scdecomppackage which,given a descriptor, will walk a visitor function over all its
constituent elements. This itself is made quite simple by (2) which
largely removes the need to look up referenced descriptors outside of
some mild edge-cases.
This
scdecomppackage is used by the backing structs of thescbuildstmtdependency interfaces to collect a "state-of-the-world"snapshot of elements upon which the schema change target elements are
layered. This approach lends itself well to caching as the descriptors
remain immutable in the builder.
The rewrite of most of
elements.protoin support of (2) implies a slewof cascading changes:
screlneed to be adjusted,opgenhave to be adjusted,scop, and,scmutationexec.Elements are now responsible for updating any back-references that
correspond to their forward references, which effectively pushed that
complexity into these reference update ops in
scmutationexec. Thesehave to operate on a best-effort basis, because there are no longer
back-reference elements with dependency rules to enforce convenient
assumptions about the context of their adding or removal. This is
arguably not a bad thing per-se: this new paradigm is "fail hard, fail
fast" which surfaces bugs a lot more quickly than a badly-written
dependency rule would.
The rules themselves fall into cleaner patterns. This commit provides some
tools to express the most common of these. This commit also unifies the
deprulesandscoptpackages into a commitrulespackage with fulldata-driven test coverage.
Other changes in this commit are peripheral in nature:
cross-referenced descriptor validation logic: we checked that the
referenced descriptor exists but not that it's not dropped. This
commit fixes this.
schemaexprquite reasonablyused to assume that columns can be found in descriptors;
unfortunately the builder needs to be able to use this for columns
which only exist as elements. This commit refactors the entry points
into this validation logic as a result.
panic TestRollback with an error with a full stack trace when the
rollback fails.
don't exist as targets, so we now have schema changer jobs linked to
descriptors for which there are zero targets. This commit addresses this
by simply allowing it. This is necessary to lock descriptors to
prevent any concurrent schema changes which would affect them.
Release note: None