Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: stop duplicating regions on the database descriptor #61585

Merged
merged 1 commit into from
Mar 17, 2021

Conversation

arulajmani
Copy link
Collaborator

Previously, we were duplicating regions both on the type descriptor and
the database descriptor. Theres a verbose explanation about why this is
bad on the linked issue, but the crux of the problem with this approach
is the added complexity it requires keeping these two lists in sync as
regions are added/dropped from a database in the face of possible
failure scenarios.

This patch removes the Regions field from the database descriptors's
region config. Instead, the regions stored on the multi-region enum
serve as the sole source of truth. To accomplish this, we introduce a
new RegionConfig struct which is synthesized from the state (primary
region, survival goal) stored on the database and the state (available
regions) stored on the type descriptor. All the places that constructed
zone configurations from the raw region config proto now use this new
struct. This patch also introduces a InitializationRegionConfig
struct which wraps a RegionConfig with initilization specific
meta-data.

I'm not too thrilled by the fact that survival goal and primary region
is still stored on the database descriptor. Ideally, the database
descriptor would contain just the multi-region enum ID and all user
configured parameters would be in one place (on the type descriptor's
region config). The level of indirection introduced by this patch sets
us on the path to that future.

Closes #60620

Release justification: low risk updates to new functionality
Release note: None

@arulajmani arulajmani requested review from ajwerner, ajstorm, a team and pbardea and removed request for a team March 6, 2021 02:05
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first pass, getting pulled away

Reviewed 5 of 35 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @arulajmani, and @pbardea)


pkg/ccl/multiregionccl/multiregion.go, line 102 at r1 (raw file):

Quoted 4 lines of code…
	regionEnumID, err := catalogkv.GenerateUniqueDescID(ctx, execCfg.DB, execCfg.Codec)
	if err != nil {
		return nil, err
	}

this feels rather cruft to have a separate multiregion struct just to carry an ID you create right here but don't hook into anything? Why even generate it here rather than one level up?


pkg/sql/create_table.go, line 415 at r1 (raw file):

	if desc.LocalityConfig != nil {
		_, dbDesc, err := params.p.Descriptors().GetImmutableDatabaseByID(

don't you want this one to avoid the cache?


pkg/sql/region_util.go, line 805 at r1 (raw file):

// configured state of a multi-region database by coallescing state from both
// the database descriptor and multi-region type descriptor.
func SynthesizeRegionConfig(

note that this always avoids the cache and should only ever be used in DDLs.

in fact, I don't think this works: this is going to get called inside the builtin on every single DML operation that uses the default_gateway() builtin. That isn't going to fly.


pkg/sql/catalog/multiregion/region_config.go, line 11 at r1 (raw file):

// licenses/APL.txt.

package multiregion

doc comment please.

Package multiregion provides ...

pkg/sql/catalog/multiregion/region_config.go, line 80 at r1 (raw file):

// EnsureSurvivalGoalIsSatisfiable returns an error if the survivability goal is
// unsatisfiable.
func (r *RegionConfig) EnsureSurvivalGoalIsSatisfiable() error {

nit: this feels like a simple function to me rather than a method. More of a style thing but I like to keep method sets to be more about encapsulated state and mutations to that state and not about derived facts.


pkg/sql/catalog/multiregion/region_config.go, line 100 at r1 (raw file):

// InitializationRegionConfig is a wrapper around RegionConfig containing
// additional fields which are only required during initialization.
type InitializationRegionConfig struct {

TODO: come back to whether having both structs makes sense; it smells


pkg/sql/catalog/multiregion/region_config.go, line 143 at r1 (raw file):

Quoted 8 lines of code…
	if config.survivalGoal == descpb.SurvivalGoal_REGION_FAILURE &&
		len(config.regions) < minNumRegionsForSurviveRegionGoal {
		return pgerror.Newf(
			pgcode.InvalidParameterValue,
			"at least %d regions are required for surviving a region failure",
			minNumRegionsForSurviveRegionGoal,
		)
	}

How is this different from the survivability method above?


pkg/sql/catalog/typedesc/type_desc.go, line 645 at r1 (raw file):

		vea.Report(err)
	}
	found := false

nit: put this in a block

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @ajwerner, @arulajmani, and @pbardea)


pkg/sql/region_util.go, line 805 at r1 (raw file):

Previously, ajwerner wrote…

note that this always avoids the cache and should only ever be used in DDLs.

in fact, I don't think this works: this is going to get called inside the builtin on every single DML operation that uses the default_gateway() builtin. That isn't going to fly.

Yeah, I changed the default_gateway stuff the last and missed this. I'm thinking of offering two different variants of this method -- one that uses the leased descriptors and one that always avoids the cache (for DDLs). This signature should probably change to take in a DatabaseID instead of a Immutable as well.


pkg/sql/catalog/multiregion/region_config.go, line 100 at r1 (raw file):

Previously, ajwerner wrote…

TODO: come back to whether having both structs makes sense; it smells

I think we can get by with just one struct that includes all the fields.


pkg/sql/catalog/multiregion/region_config.go, line 143 at r1 (raw file):

Previously, ajwerner wrote…
	if config.survivalGoal == descpb.SurvivalGoal_REGION_FAILURE &&
		len(config.regions) < minNumRegionsForSurviveRegionGoal {
		return pgerror.Newf(
			pgcode.InvalidParameterValue,
			"at least %d regions are required for surviving a region failure",
			minNumRegionsForSurviveRegionGoal,
		)
	}

How is this different from the survivability method above?

Just error messages/hints, I'll reorganize it.

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/ccl/multiregionccl/multiregion.go, line 102 at r1 (raw file):

Previously, ajwerner wrote…
	regionEnumID, err := catalogkv.GenerateUniqueDescID(ctx, execCfg.DB, execCfg.Codec)
	if err != nil {
		return nil, err
	}

this feels rather cruft to have a separate multiregion struct just to carry an ID you create right here but don't hook into anything? Why even generate it here rather than one level up?

Still TODO


pkg/sql/create_table.go, line 415 at r1 (raw file):

Previously, ajwerner wrote…

don't you want this one to avoid the cache?

Yeah, there were a few places where I (or the earlier code) wasn't principled about using AvoidCached. I think changing the signature of SynthesizeRegionConfig to take a database ID instead of a descriptor and forcing AvoidCached there makes things much better now.

This furthers my resolve that we should stop storing all but the region enum ID field on database descriptor. Then, it would be sufficient to grab an immutable database descriptor and just go to the store for the type descriptor when constructing this RegionConfig struct. What do you think?


pkg/sql/region_util.go, line 805 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Yeah, I changed the default_gateway stuff the last and missed this. I'm thinking of offering two different variants of this method -- one that uses the leased descriptors and one that always avoids the cache (for DDLs). This signature should probably change to take in a DatabaseID instead of a Immutable as well.

Slight change from when I initially commented this -- I didn't create two functions, instead I made the DML specific function not call into this and use leased descriptors instead.


pkg/sql/catalog/multiregion/region_config.go, line 11 at r1 (raw file):

Previously, ajwerner wrote…

doc comment please.

Package multiregion provides ...

Done.


pkg/sql/catalog/multiregion/region_config.go, line 100 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

I think we can get by with just one struct that includes all the fields.

Okay, got rid of this struct entirely.


pkg/sql/catalog/multiregion/region_config.go, line 143 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Just error messages/hints, I'll reorganize it.

Done.

@arulajmani arulajmani reopened this Mar 9, 2021
@arulajmani
Copy link
Collaborator Author

Had some git trouble ughg. Should be all good now, I'll give this another pass tomorrow for any stray changes/the one comment I still haven't addressed.

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @ajwerner, and @pbardea)


pkg/ccl/multiregionccl/multiregion.go, line 102 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Still TODO

Looking at this finally -- does your concern still stand now that there isn't a separate multi-region struct here and ID has been folded into RegionConfig?

Copy link
Contributor

@pbardea pbardea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signing off import/backup changes.

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed you've decided to make structs take in and return the pointer form *multiregion.RegionConfig instead of multiregion.RegionConfig. any reason for the pointer? using the non-pointer is a lot safer as it has better "const correctness" as it always passes copies of the region config down. i'd prefer the latter.

Reviewed 14 of 35 files at r1, 20 of 20 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @ajwerner, and @arulajmani)


pkg/sql/create_table.go, line 1536 at r2 (raw file):

	n *tree.CreateTable,
	parentID, parentSchemaID, id descpb.ID,
	regionEnumID descpb.ID,

why does this need to take in an ID if regionConfig already takes in an ID?


pkg/sql/catalog/descpb/structured.proto, line 1228 at r2 (raw file):

    }

    repeated Region regions = 1 [(gogoproto.nullable)=false];

can you make this reserved 1;


pkg/sql/catalog/multiregion/region_config.go, line 86 at r2 (raw file):

// CanSatisfySurvivalGoal returns true if the survival goal is satisfiable by
// the given region config.
func CanSatisfySurvivalGoal(r *RegionConfig) bool {

any reason this is not a method on RegionConfig?


pkg/sql/catalog/multiregion/region_config.go, line 96 at r2 (raw file):

// ValidateRegionConfig validates that the given RegionConfig is valid.
func ValidateRegionConfig(config *RegionConfig) error {

can we add tests for this?
any reason this is not a method on RegionConfig?

@otan
Copy link
Contributor

otan commented Mar 15, 2021

I've noticed you've decided to make structs take in and return the pointer form *multiregion.RegionConfig instead of multiregion.RegionConfig. any reason for the pointer?

happy to take care of this in a separate commit if it helps
edit: on second thoughts, we rely on it being nil in some places. whatever, let's leave it!

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFAL

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @ajwerner, and @otan)


pkg/sql/create_table.go, line 1536 at r2 (raw file):

Previously, otan (Oliver Tan) wrote…

why does this need to take in an ID if regionConfig already takes in an ID?

Vestige of a time gone by when the regionConfig didn't have an ID -- cleaning it up.


pkg/sql/catalog/descpb/structured.proto, line 1228 at r2 (raw file):

Previously, otan (Oliver Tan) wrote…

can you make this reserved 1;

Done.


pkg/sql/catalog/multiregion/region_config.go, line 86 at r2 (raw file):

Previously, otan (Oliver Tan) wrote…

any reason this is not a method on RegionConfig?

@ajwerner's rec:

nit: this feels like a simple function to me rather than a method. More of a style thing but I like to keep method sets to be more about encapsulated state and mutations to that state and not about derived facts.


pkg/sql/catalog/multiregion/region_config.go, line 96 at r2 (raw file):

can we add tests for this?

Done.

any reason this is not a method on RegionConfig?
Same reasoning as above?

@arulajmani
Copy link
Collaborator Author

I've noticed you've decided to make structs take in and return the pointer form *multiregion.RegionConfig instead of multiregion.RegionConfig. any reason for the pointer? using the non-pointer is a lot safer as it has better "const correctness" as it always passes copies of the region config down. i'd prefer the latter

The fields on RegionConfig are private and we only expose getters on it. Does that alleviate your concerns?

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on passing around a value rather than a pointer

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @ajwerner, and @otan)


pkg/sql/catalog/multiregion/region_config.go, line 86 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

@ajwerner's rec:

nit: this feels like a simple function to me rather than a method. More of a style thing but I like to keep method sets to be more about encapsulated state and mutations to that state and not about derived facts.

🤷 why is it exported?

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 31 of 31 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @ajwerner, and @otan)

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel that strongly about this method function debate. I think I'm cool with this.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @ajwerner, @arulajmani, and @otan)


pkg/ccl/multiregionccl/multiregion.go, line 102 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Looking at this finally -- does your concern still stand now that there isn't a separate multi-region struct here and ID has been folded into RegionConfig?

I'm okay with it this way. Should we call validate here?


pkg/sql/alter_database.go, line 817 at r3 (raw file):

Quoted 10 lines of code…
	if !multiregion.CanSatisfySurvivalGoal(regionConfig) {
		return errors.WithHintf(
			pgerror.Newf(pgcode.InvalidName,
				"at least %d regions are required for surviving a region failure",
				multiregion.MinNumRegionsForSurviveRegionGoal,
			),
			"you must add additional regions to the database using "+
				"ALTER DATABASE <db_name> ADD REGION <region_name>",
		)
	}

Should this just call the Validate function?


pkg/sql/create_table.go, line 415 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Yeah, there were a few places where I (or the earlier code) wasn't principled about using AvoidCached. I think changing the signature of SynthesizeRegionConfig to take a database ID instead of a descriptor and forcing AvoidCached there makes things much better now.

This furthers my resolve that we should stop storing all but the region enum ID field on database descriptor. Then, it would be sufficient to grab an immutable database descriptor and just go to the store for the type descriptor when constructing this RegionConfig struct. What do you think?

Is the idea that we'd resolve it by name? I'm fine with that.


pkg/sql/descriptor.go, line 305 at r3 (raw file):

	}

	if err := multiregion.ValidateRegionConfig(regionConfig); err != nil {

Push this into the above call?


pkg/sql/region_util.go, line 805 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Slight change from when I initially commented this -- I didn't create two functions, instead I made the DML specific function not call into this and use leased descriptors instead.

Note that this does not validate the config and that that's fine and good for not freaking out in the case that an invalid region config does come to exist. We seem to only be validating at construction time.


pkg/sql/catalog/multiregion/region_config.go, line 86 at r2 (raw file):

Previously, ajwerner wrote…

🤷 why is it exported?

Maybe in the one place, we could call validate?

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yo @otan I left the method/function thing as is, cuz inertia, but lemme know if you want me to change it and I can. RFAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @ajwerner, and @otan)


pkg/ccl/multiregionccl/multiregion.go, line 102 at r1 (raw file):

Previously, ajwerner wrote…

I'm okay with it this way. Should we call validate here?

Done


pkg/sql/alter_database.go, line 817 at r3 (raw file):

Previously, ajwerner wrote…
	if !multiregion.CanSatisfySurvivalGoal(regionConfig) {
		return errors.WithHintf(
			pgerror.Newf(pgcode.InvalidName,
				"at least %d regions are required for surviving a region failure",
				multiregion.MinNumRegionsForSurviveRegionGoal,
			),
			"you must add additional regions to the database using "+
				"ALTER DATABASE <db_name> ADD REGION <region_name>",
		)
	}

Should this just call the Validate function?

Needed to re-jig some context specific hints into a generic one, but yeah, now Im just calling validate.


pkg/sql/create_table.go, line 415 at r1 (raw file):

Previously, ajwerner wrote…

Is the idea that we'd resolve it by name? I'm fine with that.

Still resolving it by ID, but explicitly resolving in SynthesizeRegionConfig instead of it expecting a descriptor.


pkg/sql/region_util.go, line 805 at r1 (raw file):

Previously, ajwerner wrote…

Note that this does not validate the config and that that's fine and good for not freaking out in the case that an invalid region config does come to exist. We seem to only be validating at construction time.

The type descriptor and database descriptor are taught to sanity check multi-region fields in their validate methods, so we should never really be synthesizing an invalid region config here, considering both the db desc and type desc are read from the store. I'm thinking of adding the validation check here as well, it's cheap enough and good for sanity.


pkg/sql/catalog/multiregion/region_config.go, line 86 at r2 (raw file):

Previously, ajwerner wrote…

Maybe in the one place, we could call validate?

Initially I had this as such because we wanted a context specific hint, but imo it's not worth the trouble so Im calling validate in that one place now.

@arulajmani arulajmani force-pushed the mr-de-dup-regions branch 2 times, most recently from a5e5a64 to 2af6b89 Compare March 17, 2021 19:41
Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 19 of 20 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @ajwerner, @arulajmani, and @otan)


pkg/sql/alter_database.go, line 344 at r4 (raw file):

	}

	// Ensure survivability goal and number of regions after the drop jive.

you can dance, you can jive having the time of your life
oooooh
see that girl
watch that scene
here comes arul ajmani

(what a funky way of saying "is valid", but i like it)


pkg/sql/region_util.go, line 895 at r4 (raw file):

	return multiregion.MakeRegionConfig(
		regionNames,
		dbDesc.RegionConfig.PrimaryRegion,

wonder if it's worth making a variant that takes in an descpb.Database_RegionConfig to avoid these four liners, but not important

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

bors r=otan

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @ajwerner, and @otan)


pkg/sql/alter_database.go, line 344 at r4 (raw file):

Previously, otan (Oliver Tan) wrote…

you can dance, you can jive having the time of your life
oooooh
see that girl
watch that scene
here comes arul ajmani

(what a funky way of saying "is valid", but i like it)

💯 💯


pkg/sql/region_util.go, line 895 at r4 (raw file):

Previously, otan (Oliver Tan) wrote…

wonder if it's worth making a variant that takes in an descpb.Database_RegionConfig to avoid these four liners, but not important

Ideally we'd want users to go through the SynthesizeRegionConfig method instead of constructing these on their own, mostly to ensure that the DatabaseRegionConfig/TypeDescriptor version are from the store (for DDLs).

@craig
Copy link
Contributor

craig bot commented Mar 17, 2021

Merge conflict.

Previously, we were duplicating regions both on the type descriptor and
the database descriptor. Theres a verbose explanation about why this is
bad on the linked issue, but the crux of the problem with this approach
is the added complexity it requires keeping these two lists in sync as
regions are added/dropped from a database in the face of possible
failure scenarios.

This patch removes the `Regions` field from the database descriptors's
region config. Instead, the regions stored on the multi-region enum
serve as the sole source of truth. To accomplish this, we introduce a
new `RegionConfig` struct which is synthesized from the state (primary
region, survival goal) stored on the database and the state (available
regions) stored on the type descriptor. All the places that constructed
zone configurations from the raw region config proto now use this new
struct. This patch also introduces a `InitializationRegionConfig`
struct which wraps a `RegionConfig` with initilization specific
meta-data.

I'm not too thrilled by the fact that survival goal and primary region
is still stored on the database descriptor. Ideally, the database
descriptor would contain just the multi-region enum ID and all user
configured parameters would be in one place (on the type descriptor's
region config). The level of indirection introduced by this patch sets
us on the path to that future.

Closes cockroachdb#60620

Release justification: low risk updates to new functionality
Release note: None
@arulajmani
Copy link
Collaborator Author

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 17, 2021

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-21.1.x 21.1 is EOL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: stop duplicating list of regions on the database descriptor
6 participants