Skip to content

scpb,scdecomp,scbuild: remodel elements, use decomposition#76776

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
postamar:element-remodel
Feb 23, 2022
Merged

scpb,scdecomp,scbuild: remodel elements, use decomposition#76776
craig[bot] merged 1 commit intocockroachdb:masterfrom
postamar:element-remodel

Conversation

@postamar
Copy link

@postamar postamar commented Feb 18, 2022

This change adresses some shortcomings of the declarative schema
changer that would have compromised its ability to evolve gracefully in
its current form. Basically these are:

  1. the builder generating targets based off a hybrid descriptor-and-
    element scheme, and
  2. an awkward handling of back-references which precludes doing much
    more than DROP statements correctly.

This change therefore consists of, respectively:

  1. rewriting the builder to hide descriptors from the scbuildstmt
    package, instead exposing sets of elements, and
  2. reworking the element model definition by getting rid of
    back-reference elements entirely.

In support of (1) this commit introduces a new scdecomp package which,
given a descriptor, will walk a visitor function over all its
constituent elements. This itself is made quite simple by (2) which
largely removes the need to look up referenced descriptors outside of
some mild edge-cases.

This scdecomp package is used by the backing structs of the
scbuildstmt dependency interfaces to collect a "state-of-the-world"
snapshot of elements upon which the schema change target elements are
layered. This approach lends itself well to caching as the descriptors
remain immutable in the builder.

The rewrite of most of elements.proto in support of (2) implies a slew
of cascading changes:

  • the attributes in screl need to be adjusted,
  • the lifecyles of the elements in opgen have to be adjusted,
  • the dependency rules and no-op rules need to be adjusted,
  • in turn, new operations need to be defined in scop, and,
  • in turn, these operations need to be implemented in scmutationexec.

Elements are now responsible for updating any back-references that
correspond to their forward references, which effectively pushed that
complexity into these reference update ops in scmutationexec. These
have to operate on a best-effort basis, because there are no longer
back-reference elements with dependency rules to enforce convenient
assumptions about the context of their adding or removal. This is
arguably not a bad thing per-se: this new paradigm is "fail hard, fail
fast" which surfaces bugs a lot more quickly than a badly-written
dependency rule would.

The rules themselves fall into cleaner patterns. This commit provides some
tools to express the most common of these. This commit also unifies the
deprules and scopt packages into a commit rules package with full
data-driven test coverage.

Other changes in this commit are peripheral in nature:

  • Working on this change surfaced some deficiencies in the
    cross-referenced descriptor validation logic: we checked that the
    referenced descriptor exists but not that it's not dropped. This
    commit fixes this.
  • The expression validation logic in schemaexpr quite reasonably
    used to assume that columns can be found in descriptors;
    unfortunately the builder needs to be able to use this for columns
    which only exist as elements. This commit refactors the entry points
    into this validation logic as a result.
  • Quality-of-life improvements like adding a testing knob used to
    panic TestRollback with an error with a full stack trace when the
    rollback fails.
  • Because back-references don't exist as elements anymore, they also
    don't exist as targets, so we now have schema changer jobs linked to
    descriptors for which there are zero targets. This commit addresses this
    by simply allowing it. This is necessary to lock descriptors to
    prevent any concurrent schema changes which would affect them.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@postamar postamar requested a review from a team February 18, 2022 11:11
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rendered the plantuml.

Okay, I didn't make it all the way through in detail, but I'll give you a follow-up on the train tomorrow. I feel very very good about this direction. Let's get some form of this merged Monday.

return nil, err
}
for _, si := range seqIdents {
seqIDs.Add(descpb.ID(si.SeqID))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is assuming that all of the idents have been translated. Should we return an error if SeqName is in use here and SeqID is zero?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm making that assumption more explicit in the contract of the scdecomp entry point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

relying on this ordering is fine, but it is a bit subtle. Is it sufficiently commented on the interface / package?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure what this comment refers to but I hope that in my latest revision things will be more clear.

panic(pgerror.Newf(pgcode.NoPrimaryKey, "missing active primary key"))
}
// Drop all existing primary index elements.
b.Drop(existing)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to make this call if we're already dropping existing? I'm fine with the answer being yes, just trying to get a feel for the paradigm.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

existing targets PUBLIC, but this just confirms that this code isn't super clear.

@postamar
Copy link
Author

Thanks for taking a look! I'm still fixing bugs and will address your comments afterwards. I'm in broad agreement with all of what you say.

I'm pushing a fixup commit with the new rules.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flushing out some comments I had queued from yesterday. Don't know if they're stale.


message View {
uint32 table_id = 1 [(gogoproto.customname) = "TableID", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/sem/catid.DescID"];
message SchemaParent {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This element deserves a comment because it's not obvious what it means. In general this whole file you benefit from commentary but this was stuck out as particularly non-obvious. I imagine this relates to the name entry of the schema.

I ran into trouble with this element in restore if you want to restore a database that has a dropped schema as part of a schema change, then you'll have this element around even though the schema descriptor doesn't exist. I think, on some level, that this isn't really a problem with this element. Instead I'm thinking that when elements reach their terminal state in the NonRevertiblePostCommitPhase that we ought to drop them entirely. There's nothing to here specifically.

I guess I'll share that a thought I was having at the time is that we could consider modeling the database schema entry as an implementation detail of the namespace entry and couple removing the entry from the database descriptor to removing the namespace entry. I can't imagine wanting to ever remove or add one without the other.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm adding a comment to explain these things in elements.proto but the short of it is, SchemaParent is responsible for updating the schema backrefs in the parent db throughout its lifecyle: remove the backref when targeting ABSENT and vice-versa.
Nothing's hard and fast here and this could conceivably be handled alternatively by Namespace or even Schema but I found this to be nicest.

to(scpb.Status_PUBLIC,
emit(func(this *scpb.ColumnOnUpdateExpression) scop.Op {
return &scop.AddColumnOnUpdateExpression{
OnUpdate: *protoutil.Clone(this).(*scpb.ColumnOnUpdateExpression),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we ought to add a safety check that no pointer in the op is equal to the input to the generator function. Certainly wouldn't ask for that in this change, but I definitely worry about future writers of these not adhering to such things.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. In fact there's a whole slew of safety checks we should add. Most pressing would be some kind of guarantee that for a DROP statement we never end up marking a descriptor as dropped post-commit!

@postamar
Copy link
Author

Well, here it is, ready for review. I know right now the diff weighs in at +18k -11k but most of that is in data-driven test files. @ajwerner @fqazi please take a look if you can.

I've been looking at #76610 and I think we should be able to rebase most of it on top of this without too much fuss. @fqazi thoughts?

I'm aware tests are still failing in the local-declarative-schema config and that this means this commit still has bugs.

@fqazi
Copy link
Collaborator

fqazi commented Feb 22, 2022

@postamar I'm okay with rebasing the enable by default stuff on top. I have a bunch of hairy stuff still left like a random hang that will take some time too. Let me look at this today

Copy link
Collaborator

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a quick look this stuff is definitely much cleaner. Only one comment didn't see anything major I would call a blocker

Reviewed 33 of 123 files at r1, 145 of 145 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)


pkg/sql/schemachanger/scbuild/internal/scbuildstmt/alter_table_add_column.go, line 186 at r2 (raw file):

		existing = idx
		if status == scpb.Status_ABSENT {
			freshlyAdded = idx

Should there ever be more than one of these?

@postamar
Copy link
Author

Thanks for taking a look! I think I'm almost there with the tests.

Should there ever be more than one of these?

I honestly don't know, I guess it's possible. Is it bad? Maybe not, as far as the ADD COLUMN is concerned. I added a comment.

This change adresses some shortcomings of the declarative schema
changer that would have compromised its ability to evolve gracefully in
its current form. Basically these are:
  1. the builder generating targets based off a hybrid descriptor-and-
     element scheme, and
  2. an awkward handling of back-references which precludes doing much
     more than DROP statements correctly.

This change therefore consists of, respectively:
  1. rewriting the builder to hide descriptors from the `scbuildstmt`
     package, instead exposing sets of elements, and
  2. reworking the element model definition by getting rid of
     back-reference elements entirely.

In support of (1) this commit introduces a new `scdecomp` package which,
given a descriptor, will walk a visitor function over all its
constituent elements. This itself is made quite simple by (2) which
largely removes the need to look up referenced descriptors outside of
some mild edge-cases.

This `scdecomp` package is used by the backing structs of the
`scbuildstmt` dependency interfaces to collect a "state-of-the-world"
snapshot of elements upon which the schema change target elements are
layered. This approach lends itself well to caching as the descriptors
remain immutable in the builder.

The rewrite of most of `elements.proto` in support of (2) implies a slew
of cascading changes:
 - the attributes in `screl` need to be adjusted,
 - the lifecyles of the elements in `opgen` have to be adjusted,
 - the dependency rules and no-op rules need to be adjusted,
 - in turn, new operations need to be defined in `scop`, and,
 - in turn, these operations need to be implemented in `scmutationexec`.

Elements are now responsible for updating any back-references that
correspond to their forward references, which effectively pushed that
complexity into these reference update ops in `scmutationexec`. These
have to operate on a best-effort basis, because there are no longer
back-reference elements with dependency rules to enforce convenient
assumptions about the context of their adding or removal. This is
arguably not a bad thing per-se: this new paradigm is "fail hard, fail
fast" which surfaces bugs a lot more quickly than a badly-written
dependency rule would.

The rules themselves fall into cleaner patterns. This commit provides some
tools to express the most common of these. This commit also unifies the
`deprules` and `scopt` packages into a commit `rules` package with full
data-driven test coverage.

Other changes in this commit are peripheral in nature:
  - Working on this change surfaced some deficiencies in the
    cross-referenced descriptor validation logic: we checked that the
    referenced descriptor exists but not that it's not dropped. This
    commit fixes this.
  - The expression validation logic in `schemaexpr` quite reasonably
    used to assume that columns can be found in descriptors;
    unfortunately the builder needs to be able to use this for columns
    which only exist as elements. This commit refactors the entry points
    into this validation logic as a result.
  - Quality-of-life improvements like adding a testing knob used to
    panic TestRollback with an error with a full stack trace when the
    rollback fails.
  - Because back-references don't exist as elements anymore, they also
    don't exist as targets, so we now have schema changer jobs linked to
    descriptors for which there are zero targets. This commit addresses this
    by simply allowing it. This is necessary to lock descriptors to
    prevent any concurrent schema changes which would affect them.

Release note: None
@postamar postamar marked this pull request as ready for review February 22, 2022 21:23
@postamar postamar requested a review from a team as a code owner February 22, 2022 21:23
@postamar
Copy link
Author

Yay, all tests pass! I get a test failure but it appears to be just a flake.

bors r+

@craig
Copy link
Contributor

craig bot commented Feb 23, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Feb 23, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Feb 23, 2022

Build succeeded:

@craig craig bot merged commit 5522769 into cockroachdb:master Feb 23, 2022
@postamar postamar deleted the element-remodel branch February 23, 2022 12:19
craig bot pushed a commit that referenced this pull request Feb 24, 2022
76742: dev: don't exit immediately in ./dev ui watch r=irfansharif a=sjbarag

./dev ui watch previously didn't wait for the command context for webpack commands to complete before returning. This caused the 'watch' subcommand to stop immediately, then the 'ui' subcommand, then the entire 'dev' command to stop, exiting the entire process likely before webpack could even start. Wait for the webpack commands to complete (which they never should, since they're meant to run indefinitely) before returning from ./dev ui watch's implementation.

Release note: None

76944: *: enable declarative schema changer r=postamar a=postamar

This change enables the declarative schema changer by default. This PR originated as #76610 by `@fqazi` which I rebased following the merging of #76776. Only mild alterations were needed, these commits can therefore be considered as having been reviewed. 

Co-authored-by: Sean Barag <barag@cockroachlabs.com>
Co-authored-by: Andrew Werner <awerner32@gmail.com>
Co-authored-by: Faizan Qazi <faizan@cockroachlabs.com>
Co-authored-by: Marius Posta <marius@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Comments