Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add unique constraints to table descriptor #57502

Closed
wants to merge 1 commit into from

Conversation

rytaft
Copy link
Collaborator

@rytaft rytaft commented Dec 3, 2020

This commit adds unique constraints to the table descriptor. This is
currently only implemented for unique constraints that are enforced by
a unique index. Unique constraints wihout an index are still not supported.
However, this PR lays the groundwork so that UNIQUE WITHOUT INDEX can
be supported in future PRs.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

This commit adds unique constraints to the table descriptor. This is
currently only implemented for unique constraints that are enforced by
a unique index. Unique constraints wihout an index are still not supported.
However, this PR lays the groundwork so that UNIQUE WITHOUT INDEX can
be supported in future PRs.

Release note: None
@jordanlewis
Copy link
Member

General question, without having reviewed the code: is there going to be a migration to upgrade old table descriptors to include unique constraints? Otherwise, how will we ensure that all future code can understand descriptors with and without explicit unique constraints listed?

@rytaft
Copy link
Collaborator Author

rytaft commented Dec 3, 2020

is there going to be a migration to upgrade old table descriptors to include unique constraints?

I'm not 100% sure that will be necessary. If I eventually remove the logic in sql/catalog/tabledesc/table.go where I added the comment // TODO(rytaft): remove this case once indexes are no longer used for unique constraints., then I will need to add a version check and/or migration code. But if I don't actually remove that logic then a migration might not be necessary....

Either way, I don't think it's necessary for this PR, but I would definitely welcome any advice for how to handle the migration/version issue.

EDIT: Thinking about this more, I think some type of migration logic will probably be necessary even if I don't remove that logic in sql/catalog/tabledesc/table.go. Is a cluster version the right way to go here?

@rytaft rytaft requested a review from a team December 3, 2020 22:50
@jordanlewis
Copy link
Member

What migration logic will be necessary? In other words, what new code will depend on the existence of a UniqueConstraint when there is one in the table descriptor?

For the foreign key representation upgrade, we handled the lack of a framework that could do a long-running migration over all descriptors (see @irfansharif's work #48843) by having all descriptors get upgraded on-demand upon read, in maybeFillInDescriptor. That way, theoretically, one could assert that all descriptors are at the newest version if there is a function that can take an old format descriptor to a new format descriptor.

But, I just want to give a heads up that we've had very serious issues in descriptor related code in the past due to this kind of work, because it's so easy to make changes that end up incompatible later unless there is a trusted and correct migration. We have scar tissue here recently from the foreign key issue seen in #57032 (and see the postmortem here: https://cockroachlabs.atlassian.net/wiki/spaces/SCHEMA/pages/1127219391/20.2+Foreign+Key+corruption+postmortem).

One of the follow up items from this postmortem was to ensure that we have better, automatic cross-version testing for descriptor format changes. But, we don't have this yet. All of this to say that I just want to make sure we have eyes on this, and unfortunately we may need to move a little more slowly on this upgrade than you would have hoped to ensure that we're confident in the resilience of this kind of migration.

We might want to schedule an in-person sync about this with the Schema team.

@rytaft
Copy link
Collaborator Author

rytaft commented Dec 3, 2020

What migration logic will be necessary? In other words, what new code will depend on the existence of a UniqueConstraint when there is one in the table descriptor?

There are two cases I see:

  1. If I remove the logic in sql/catalog/tabledesc/table.go, we must guarantee that all table descriptors are upgraded to include constraints for existing unique indexes (otherwise pg_catalog.pg_constraint will be incorrect). This change doesn't seem necessary in the near term (if ever).
  2. Either way, we'll need to prevent unique constraints from getting created if the cluster version is not at least 21.1. This is because we need to ensure that all nodes know how to drop or rename the unique constraints based on other changes to the table. So it seems like we can't avoid adding a new cluster version.

Scheduling a meeting sounds good. I can put something on the calendar for next week.

@rytaft
Copy link
Collaborator Author

rytaft commented Dec 4, 2020

I've thought about this more, and I think there are more downsides than benefits to this PR.

I thought it would make sense to store all unique constraints (both with and without an index) in the same slice for consistency and to improve Postgres compatibility, but I think it adds unnecessary complications:

  • As I noted above, in order for us to rely on the UniqueConstraints slice containing all unique constraints, we will need to perform a migration to generate unique constraints for existing tables with unique indexes.
  • Even if we don't rely on the UniqueConstraints slice (and continue to check the unique indexes as well), we still need to ensure that the constraints stay in sync with the indexes. As shown in this PR, this requires ensuring that renames affect both constraints and indexes, and any time the index is dropped, the constraint is dropped as well. I think there are also currently some test failures due to cases I missed (e.g., if the index ID can be changed, it must be changed in the constraint too).
  • This change is not actually necessary to improve compatibility with Postgres. If we want to support the Postgres distinction between constraints and indexes, we could add a boolean on the IndexDescriptor called HasUniqueConstraint. We can also always make the change proposed in this PR later on, since the UniqueConstraints slice will exist on the table descriptor for unique constraints without an index.

Given this, I will close this PR, revise it, and reopen with the following changes:

  • The UniqueConstraints slice on the table descriptor will only contain UNIQUE WITHOUT INDEX constraints.
  • Therefore, the UniqueConstraint type doesn't need to include the WithoutIndex or IndexID fields. We can add them back later if needed.
  • I will add a cluster version for UniqueWithoutIndexConstraints, and ensure that users cannot create unique constraints without an index unless all nodes have been upgraded to that version.

If anyone disagrees with this approach feel free to comment.

@rytaft rytaft closed this Dec 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants