Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add syntax to create unique constraints without an index #55700

Merged
merged 1 commit into from
Oct 21, 2020

Conversation

rytaft
Copy link
Collaborator

@rytaft rytaft commented Oct 19, 2020

This commit adds support for the syntax ... UNIQUE WITHOUT INDEX ...,
both when adding UNIQUE constraints and when adding UNIQUE columns. Using
this syntax will currently return the error "unique constraints without
an index are not yet supported", but support for the syntax serves as a
starting point for adding support for these unique constraints.

Informs #41535

Release note (sql change): Added support for using the syntax
... UNIQUE WITHOUT INDEX ... in CREATE TABLE and ALTER TABLE statements,
both when defining columns and unique constraints. Although this syntax
can now be parsed successfully, using this syntax currently returns an
error "unique constraints without an index are not yet supported".

@rytaft rytaft requested a review from a team as a code owner October 19, 2020 17:07
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parser code changes LGTM

(did not review whether it matches the desired spec)

i'd say @otan is maybe the better person to review the change holistically.

@rytaft rytaft requested a review from otan October 19, 2020 17:43
@rytaft
Copy link
Collaborator Author

rytaft commented Oct 19, 2020

TFTR! Added @otan as a reviewer.

@RaduBerinde
Copy link
Member

One alternative that I want to run by you is specifying something like UNIQUE INDEX (a) PARTITIONED BY (region), which would create unique index (region, a) to be used for the uniqueness checks. That would make it harder to misuse (creating uniqueness constraints without any useful index).

Maybe PARTITIONED BY is not the right word, as there's nothing specific about regions, other ideas are SHARDED BY, PREFIXED BY

@mgartner
Copy link
Collaborator

I'm a +1 for eliminating the possibility for users to introduce full table scans for simple INSERTs: #41535 (comment)

@rytaft
Copy link
Collaborator Author

rytaft commented Oct 19, 2020

My understanding is that we want customers to be able to move from a single-region deployment to a multi-region deployment without changing their schema, and everything should "just work". So UNIQUE INDEX (a) would effectively become syntactic sugar for something like UNIQUE INDEX (a) PARTITIONED BY (region) (and we don't necessarily even need to support the PARTITIONED BY syntax). I'm not opposed to adding that syntax in case power users want to specify a column other than region to use for partitioning -- but that would be a separate PR.

The syntax in this PR is unrelated to the new Multi-Region abstractions, and will allow us to separate the concept of unique constraints from unique indexes. I don't think it will be easy to misuse, because users will need to knowingly write WITHOUT INDEX in their schema. I also don't expect it to be a commonly used feature other than for advanced users and for testing purposes.

By separating the concept of unique constraints from unique indexes, we'll be able to add a unique check by default to any mutation that touches a unique column or set of columns, and as an optimization, remove it if there exists a unique index with those columns. Otherwise, the check itself can be optimized by taking advantage of any indexes available (such as a partitioned unique index). This has the advantage of keeping the uniqueness checks general and not tied to the concept of regions.

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @otan, and @rytaft)


docs/generated/sql/bnf/col_qualification.bnf, line 13 at r1 (raw file):

	| 'NOT' 'NULL'
	| 'NULL'
	| 'UNIQUE' opt_without_index

i'm reading https://medium.com/flatiron-engineering/uniqueness-in-postgresql-constraints-versus-indexes-4cf957a472fd and it seems as though indexes and constraints for UNIQUE have different properties.

it sounds like for this case, this should be a constraint, not an index, and so only a constraint CONSTRAINT ... WITHOUT INDEX syntax is the thing we want.

is this something we care about?

@rytaft
Copy link
Collaborator Author

rytaft commented Oct 19, 2020


docs/generated/sql/bnf/col_qualification.bnf, line 13 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

i'm reading https://medium.com/flatiron-engineering/uniqueness-in-postgresql-constraints-versus-indexes-4cf957a472fd and it seems as though indexes and constraints for UNIQUE have different properties.

it sounds like for this case, this should be a constraint, not an index, and so only a constraint CONSTRAINT ... WITHOUT INDEX syntax is the thing we want.

is this something we care about?

My understanding from reading that article is that UNIQUE constraints and UNIQUE indexes in Postgres are in fact identical due to the implementation of UNIQUE constraints using an index. I know it says "the preferred method is to use constraints", but that seems to be purely for academic reasons.

But either way, it's not clear to me that there is a difference between CONSTRAINT foo UNIQUE (a) and UNIQUE (a) other than the fact that the first one has a name. I don't think the second one necessarily implies the presence of an index any more than the first one does. Am I missing something?

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 15 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball and @rytaft)


docs/generated/sql/bnf/col_qualification.bnf, line 13 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

My understanding from reading that article is that UNIQUE constraints and UNIQUE indexes in Postgres are in fact identical due to the implementation of UNIQUE constraints using an index. I know it says "the preferred method is to use constraints", but that seems to be purely for academic reasons.

But either way, it's not clear to me that there is a difference between CONSTRAINT foo UNIQUE (a) and UNIQUE (a) other than the fact that the first one has a name. I don't think the second one necessarily implies the presence of an index any more than the first one does. Am I missing something?

Taking a read of https://stackoverflow.com/questions/23542794/postgres-unique-constraint-vs-index

It seems like UNIQUE CONSTRAINT doesn't actually need to be backed by an index -- just that all the databases decided to implemented that way.

From what I can tell online, there are things CONSTRAINTS cannot do that indexes can or vice versa, e.g. DEFERRED (constraints only), CONCURRENTLY (index only), ON CONFLICT (constraints only).

The difference seems subtle but I would not be surprised if people relied on it (and not surprised if we treat them the same when we shouldn't).

Taking a re-read of this though, seems like UNIQUE in this clause makes it a constraint so it doesn't really matter, my bad.

@otan
Copy link
Contributor

otan commented Oct 19, 2020


docs/generated/sql/bnf/col_qualification.bnf, line 13 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

Taking a read of https://stackoverflow.com/questions/23542794/postgres-unique-constraint-vs-index

It seems like UNIQUE CONSTRAINT doesn't actually need to be backed by an index -- just that all the databases decided to implemented that way.

From what I can tell online, there are things CONSTRAINTS cannot do that indexes can or vice versa, e.g. DEFERRED (constraints only), CONCURRENTLY (index only), ON CONFLICT (constraints only).

The difference seems subtle but I would not be surprised if people relied on it (and not surprised if we treat them the same when we shouldn't).

Taking a re-read of this though, seems like UNIQUE in this clause makes it a constraint so it doesn't really matter, my bad.

(there's also some fun differences involving NULLS on constraints vs indexes too, but I'm going to stop reading there because my head hurts)

@rytaft
Copy link
Collaborator Author

rytaft commented Oct 19, 2020


docs/generated/sql/bnf/col_qualification.bnf, line 13 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

(there's also some fun differences involving NULLS on constraints vs indexes too, but I'm going to stop reading there because my head hurts)

Haha sounds good -- thanks for checking this out. I don't think we currently treat the two concepts differently in CRDB, but that's good to know that there are differences in other DBs.

@mgartner
Copy link
Collaborator

Thanks for the additional context @rytaft. LGTM!

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @otan, and @rytaft)


pkg/sql/sem/tree/create.go, line 372 at r1 (raw file):

		ShardBuckets Expr
	}
	Unique               bool

[nit] maybe group these into a Unique struct like the others. Also maybe WithoutIndex instead of NoIndex would make it easier for people to discover the corresponding syntax.

This commit adds support for the syntax `... UNIQUE WITHOUT INDEX ...`,
both when adding UNIQUE constraints and when adding UNIQUE columns. Using
this syntax will currently return the error "unique constraints without
an index are not yet supported", but support for the syntax serves as a
starting point for adding support for these unique constraints.

Informs cockroachdb#41535

Release note (sql change): Added support for using the syntax
`... UNIQUE WITHOUT INDEX ...` in CREATE TABLE and ALTER TABLE statements,
both when defining columns and unique constraints. Although this syntax
can now be parsed successfully, using this syntax currently returns an
error "unique constraints without an index are not yet supported".
Copy link
Collaborator Author

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball and @otan)


pkg/sql/sem/tree/create.go, line 372 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] maybe group these into a Unique struct like the others. Also maybe WithoutIndex instead of NoIndex would make it easier for people to discover the corresponding syntax.

Done.

@awoods187
Copy link
Contributor

I like the proposed syntax and think we may not need Radus suggestion but its good to have in reserve if we need it.

Copy link
Collaborator Author

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball and @otan)

@craig
Copy link
Contributor

craig bot commented Oct 21, 2020

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants