Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/schemachanger: prevent concurrent type desc changes #113639

Merged
merged 1 commit into from
Nov 2, 2023

Conversation

fqazi
Copy link
Collaborator

@fqazi fqazi commented Nov 1, 2023

Previously, the declarative schema changer only waited for legacy schema changer mutations/jobs on relations and did not correctly account for type schema changes. As a result, it was possible to drop a type concurrently while a schema change was trying to mutate a type, which could lead to type change jobs hanging. To address this, this patch causes declarative schema changes to wait for type modifications.

Fixes: #112390, #111540

Release note (bug fix): ALTER TYPE could get stuck if DROP TYPE
was executed concurrently.

@fqazi fqazi requested a review from a team as a code owner November 1, 2023 21:27
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@fqazi fqazi added the backport-23.2.x Flags PRs that need to be backported to 23.2. label Nov 1, 2023
@rafiss rafiss requested a review from chrisseto November 1, 2023 21:28
Copy link
Contributor

@chrisseto chrisseto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix LGTM but the test is a bit mind bending to me in its current state.

ctx, cancel := context.WithCancel(context.Background())
dropTypeChan := make(chan struct{})
addTypeValueChan := make(chan struct{})
var closeAlterPKChanOnce sync.Once
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this name mean? There doesn't seem to be any Alter PK going on?

if (len(stmts) == 1 && strings.Contains(stmts[0], "DROP TYPE")) ||
stmts == nil {
closeAlterPKChanOnce.Do(func() {
close(addTypeValueChan)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: These channel names feel backwards? I would expect this one to be the dropType channel. Alternatively you could give them past tense names like typevalueAddedChan?

@@ -502,6 +562,10 @@ func TestSchemaChangeWaitsForConcurrentSchemaChanges(t *testing.T) {
tf(t, sessiondatapb.UseNewSchemaChangerOff, sessiondatapb.UseNewSchemaChangerUnsafeAlways)
})

t.Run("legacy-then-declarative", func(t *testing.T) {
typeSchemaChangeFunc(t, sessiondatapb.UseNewSchemaChangerOff, sessiondatapb.UseNewSchemaChangerUnsafeAlways)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to not run this in the other direction as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Give this a unique t.Run name

// it will eventually be able to proceed after waiting for a while.
tdb.Exec(t, `SET use_declarative_schema_changer = $1`, modeFor2ndStmt.String())
tdb.Exec(t, `DROP TYPE STATUS;`)
require.True(t, schemaChangeWaitDetected, "no schema change wait was observed.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a bit race-y to me. schemaChangeWaitDetected might not immediately flipped, right? It might be more straightforward to close another channel and block on that with a timeout?

@chrisseto
Copy link
Contributor

Side note: Is there a reason that type descriptors don't have a Mutations field? 🤔

Previously, the declarative schema changer only waited for legacy schema
changer mutations/jobs on relations and did not correctly account for
type schema changes.  As a result, it was possible to drop a type
concurrently while a schema change was trying to mutate a type, which
could lead to type change jobs hanging. To address this, this patch
causes declarative schema changes to wait for type modifications.

Fixes: cockroachdb#112390, cockroachdb#111540

Release note (bug fix): ALTER TYPE could get stuck if DROP TYPE
 was executed concurrently.
@fqazi fqazi force-pushed the concurrenTypeChange branch from e709842 to b443797 Compare November 2, 2023 01:00
Copy link
Collaborator Author

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note: Is there a reason that type descriptors don't have a Mutations field? 🤔

Honestly, they probably should have since it would simplify code and be more consistent. Assuming things are using accessors we probably can improve things with declarative work.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @chrisseto)


pkg/sql/schemachanger/schemachanger_test.go line 497 at r1 (raw file):

Previously, chrisseto (Chris Seto) wrote…

Why does this name mean? There doesn't seem to be any Alter PK going on?

Done.


pkg/sql/schemachanger/schemachanger_test.go line 510 at r1 (raw file):

Previously, chrisseto (Chris Seto) wrote…

nit: These channel names feel backwards? I would expect this one to be the dropType channel. Alternatively you could give them past tense names like typevalueAddedChan?

Good point think these are clearer:

addTypeStartedChan := make(chan struct{})  
resumeAddTypeJobChan := make(chan struct{})

pkg/sql/schemachanger/schemachanger_test.go line 550 at r1 (raw file):

Previously, chrisseto (Chris Seto) wrote…

This feels a bit race-y to me. schemaChangeWaitDetected might not immediately flipped, right? It might be more straightforward to close another channel and block on that with a timeout?

Yeah, I'll be more paranoid and close another channel.


pkg/sql/schemachanger/schemachanger_test.go line 566 at r1 (raw file):

Any reason to not run this in the other direction as well?

We can't until we implement these alters in the declarative world.

nit: Give this a unique t.Run name

Renamed this to: typedesc legacy-then-declarative

@fqazi
Copy link
Collaborator Author

fqazi commented Nov 2, 2023

@chrisseto TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Nov 2, 2023

Build succeeded:

Copy link
Contributor

@Xiang-Gu Xiang-Gu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we cannot amend the existing tf function to include testing for type descriptor because we don't have create type, alter type in declarative schema changer yet. Is this the reason you created a new helper function typeSchemaChangeFunc?

If so, then I'd suggest we move this new test into a new, top-level test (instead of a subtest here), maybe called TestTypeSchemaChangesWaitForConcurrentTypeSchemaChanges, which is cleaner IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.2.x Flags PRs that need to be backported to 23.2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pkg/ccl/testccl/workload/schemachange/schemachange_test: TestWorkload failed
4 participants