Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix regression in allowed foreign keys on distributed tables #6550

Merged
merged 2 commits into from
Jan 24, 2023
Merged

Conversation

JelteF
Copy link
Contributor

@JelteF JelteF commented Dec 7, 2022

DESCRIPTION: Fixes a regression in allowed foreign keys on distributed tables

In commit eadc88a we changed how we skip foreign key validation. The
goal was to skip it in more cases. However, one change had the
unintended regression of introducing failures when trying to create
certain foreign keys. This reverts that part of the change.

The way of skipping validation of foreign keys that was introduced in eadc88a was skipping
validation during execution. The reason that this caused this regression
was because some foreign key validation queries already fail during
planning. In those cases it never gets to the execution step where it
would later be skipped.

Fixes #6543

@JelteF JelteF changed the title Fix regression in foreign keys on distributed tables Fix regression in allowed foreign keys on distributed tables Dec 7, 2022
@JelteF JelteF requested a review from emelsimsek December 7, 2022 16:38
@codecov
Copy link

codecov bot commented Dec 7, 2022

Codecov Report

Merging #6550 (a9321b6) into main (58368b7) will decrease coverage by 0.14%.
The diff coverage is 83.33%.

❗ Current head a9321b6 differs from pull request most recent head 132c3df. Consider uploading reports for the commit 132c3df to get more accurate results

@@            Coverage Diff             @@
##             main    #6550      +/-   ##
==========================================
- Coverage   93.02%   92.89%   -0.14%     
==========================================
  Files         259      259              
  Lines       55859    55475     -384     
==========================================
- Hits        51965    51532     -433     
- Misses       3894     3943      +49     

Copy link
Contributor

@emelsimsek emelsimsek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of simplicity, it looks like we could remove more code in addition to auto_explain workaround.

We signal postgres not to do any validation by setting

Constraint::skip_validation to true

causing not any select queries triggered by postgres.

This is intented to be done in two situations:

  1. The user requests it explicitly by setting GUC skip_constraint_validation to on which sets SkipConstraintValidation = true internally.
  2. Citus decides to skip validation implicitly in SkipForeignKeyValidationIfConstraintIsFkey function in various places.

Can we remove src/backend/distributed/executor/multi_executor.c:203:
if (AlterTableConstraintCheck(queryDesc))
block in the citus executor given that no unintended validation queries should end up here anymore since they are not planned in the first place?

if (skip_validation && IsA(parseTree, AlterTableStmt))
{
EnableSkippingConstraintValidation();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EnableSkippingConstraintValidation is not used anywhere anymore. Can we remove it?

@marcocitus
Copy link
Member

Anything blocking this PR?

In commit eadc88a we changed how we skip foreign key validation. The
goal was to skip it in more cases. However, one change had the
unintended regression of introducing failures when trying to create
certain foreign keys. This reverts that part of the change.

The new way of skipping validation of foreign keys skipped the
validation during execution. The reason that this caused this regression
was because some foreign key validation queries already fail during
planning. In those cases it never gets to the execution step where it
would later be skipped.

Fixes #6543
@JelteF JelteF force-pushed the fix-6543 branch 2 times, most recently from 05a505e to 3cb9d66 Compare January 24, 2023 10:39
@JelteF JelteF enabled auto-merge (squash) January 24, 2023 10:59
@JelteF
Copy link
Contributor Author

JelteF commented Jan 24, 2023

@emelsimsek I think that cleanup might work, but I'd have to check the code better to see if we're not missing anything. Based on the comment in multi_executor.c it suggests there are certain cases where this is not enough. In any case such a refactor is probably to risky to backport (without good reason). I think we should merge this PR and then we can work on top of it.

@emelsimsek
Copy link
Contributor

Makes sense.

I am curious if we are clear on what this GUC provides on top of existing postgres ways of deferring validation checks. (https://www.dbi-services.com/blog/enabling-disabling-and-validating-foreign-key-constraints-in-postgresql/)

It looks like we want to default to skipping for empty shell tables always and do the validation on the shards at workers without the client having to set this GUC.

@JelteF JelteF requested a review from emelsimsek January 24, 2023 12:46
@JelteF
Copy link
Contributor Author

JelteF commented Jan 24, 2023

I am curious if we are clear on what this GUC provides on top of existing postgres ways of deferring validation checks.

The main reason is we want to actually not do the check at all to reduce the spend reduce CPU and IO time. AS INVALID simply postpones the check for later when less heavy locks are held, but it still needs to actually check that the constraint holds. Our skipping completely skips the check, because we know that it will be valid anyway.

We used AS INVALID together with some manual catalog changes before 11.1 for shard moves, but that had the problem that AS INVALID is not supported on partitioned tables.

@JelteF
Copy link
Contributor Author

JelteF commented Jan 24, 2023

I think that cleanup might work

I just checked, and it did not: https://github.com/citusdata/citus/tree/cleanup-constraint-validation

It seems we'd need more logic in the utility hook to skip foreign keys in certain cases, at least related to partitioning (as the comment suggests is the reason for the constraint check in the executor function).

JelteF added a commit that referenced this pull request Jan 24, 2023
DESCRIPTION: Fix regression in allowed foreign keys on distributed
tables

In commit eadc88a we changed how we skip foreign key validation. The
goal was to skip it in more cases. However, one change had the
unintended regression of introducing failures when trying to create
certain foreign keys. This reverts that part of the change.

The way of skipping validation of foreign keys that was introduced in
eadc88a was skipping validation during execution. The reason that
this caused this regression was because some foreign key validation
queries already fail during planning. In those cases it never gets to
the execution step where it would later be skipped.

Fixes #6543
@JelteF JelteF disabled auto-merge January 24, 2023 13:18
@JelteF JelteF enabled auto-merge (squash) January 24, 2023 13:19
@JelteF JelteF merged commit d21ff0f into main Jan 24, 2023
@JelteF JelteF deleted the fix-6543 branch January 24, 2023 13:26
JelteF added a commit that referenced this pull request Jan 27, 2023
DESCRIPTION: Fix foreign key validation skip at the end of shard move

In eadc88a we started completely skipping foreign key constraint
validation at the end of a non blocking shard move, instead of only for
foreign keys to reference tables. However, it turns out that this didn't
work at all because of a hard to notice bug: By resetting the
SkipConstraintValidation flag at the end of our utility hook, we
actually make the SET command that sets it a no-op.

This fixes that bug by removing the code that resets it. This is fine
because #6543 removed the only place where we set the flag in C code. So
the resetting of the flag has no purpose anymore. This PR also adds a
regression test, because it turned out we didn't have any otherwise we
would have caught that the feature was completely broken.

It also moves the constraint validation skipping to the utility hook.
The reason is that #6550 showed us that this is the better place to skip
it, because it will also skip the planning phase and not just the
execution.
JelteF added a commit that referenced this pull request Feb 10, 2023
DESCRIPTION: Fix regression in allowed foreign keys on distributed
tables

In commit eadc88a we changed how we skip foreign key validation. The
goal was to skip it in more cases. However, one change had the
unintended regression of introducing failures when trying to create
certain foreign keys. This reverts that part of the change.

The way of skipping validation of foreign keys that was introduced in
eadc88a was skipping validation during execution. The reason that
this caused this regression was because some foreign key validation
queries already fail during planning. In those cases it never gets to
the execution step where it would later be skipped.

Fixes #6543

(cherry picked from commit 7a7880a)
JelteF added a commit that referenced this pull request Feb 10, 2023
DESCRIPTION: Fix foreign key validation skip at the end of shard move

In eadc88a we started completely skipping foreign key constraint
validation at the end of a non blocking shard move, instead of only for
foreign keys to reference tables. However, it turns out that this didn't
work at all because of a hard to notice bug: By resetting the
SkipConstraintValidation flag at the end of our utility hook, we
actually make the SET command that sets it a no-op.

This fixes that bug by removing the code that resets it. This is fine
because #6543 removed the only place where we set the flag in C code. So
the resetting of the flag has no purpose anymore. This PR also adds a
regression test, because it turned out we didn't have any otherwise we
would have caught that the feature was completely broken.

It also moves the constraint validation skipping to the utility hook.
The reason is that #6550 showed us that this is the better place to skip
it, because it will also skip the planning phase and not just the
execution.

(cherry picked from commit 81dcddd)
rajeshkt78 pushed a commit that referenced this pull request Feb 17, 2023
DESCRIPTION: Fix regression in allowed foreign keys on distributed
tables

In commit eadc88a we changed how we skip foreign key validation. The
goal was to skip it in more cases. However, one change had the
unintended regression of introducing failures when trying to create
certain foreign keys. This reverts that part of the change.

The way of skipping validation of foreign keys that was introduced in
eadc88a was skipping validation during execution. The reason that
this caused this regression was because some foreign key validation
queries already fail during planning. In those cases it never gets to
the execution step where it would later be skipped.

Fixes #6543
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Distribution key as varchar causing problems
4 participants