Skip to content

Conversation

@fqazi
Copy link
Collaborator

@fqazi fqazi commented Feb 21, 2025

Previously, the declarative schema changer during ALTER PRIMARY KEY, the declarative schema changer would make the new primary index public and then work on creating replacement secondary indexes. This could lead to broken behaviour where you could be left with either unusable old secondary indexes (causing trouble for DML queries) or potential validation errors. To address this, this patch adjusts the rules so that the new primary index is kept at a validated state, while replacement secondary indexes are constructed. This guarantees that all secondary indexes are usable during a PK swap, and the new primary key and replacement secondary indexes are swapped in an atomic fashion.

Additionally, to help support this change the index backfiller has been updated, so that a source index can be specified for backfilling indexes. This allows the index backfiller to use a validated primary index to create the secondary index, without exposing it to read queries.

Fixes: #133129
Fixes: #130165

Release note (bug fix): Addressed a bug where secondary indexes could be
unusable by DML while a primary key swap was occurring, if the new
primary key did not contain columns from the old primary key.

@fqazi fqazi requested review from a team as code owners February 21, 2025 17:17
@fqazi fqazi requested review from jeffswenson and yuzefovich and removed request for a team February 21, 2025 17:17
@blathers-crl
Copy link

blathers-crl bot commented Feb 21, 2025

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@github-actions
Copy link

github-actions bot commented Feb 21, 2025

⚪ Sysbench [SQL, 3node, oltp_read_write]
Metric Old Commit New Commit Delta Note Threshold
sec/op 10.87m ±1% 10.92m ±1% ~ p=0.063 n=10 3.0%
errs/op 0.000 ±0% 0.000 ±0% ~ p=1.000 n=10 0.0%
allocs/op 10.19k ±0% 10.20k ±1% ~ p=0.699 n=10 2.0%
B/op 2.207Mi ±0% 2.205Mi ±0% ~ p=0.529 n=10 2.0%
Reproduce

benchdiff binaries:

mkdir -p benchdiff/a696b72/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/a696b72247f3cce50e56f227a74ad3f09ec33c46/bin/pkg_sql_tests benchdiff/a696b72/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/a696b72/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/6bada5e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/6bada5e7bc53457d56fc92acac8cac29127acbcb/bin/pkg_sql_tests benchdiff/6bada5e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/6bada5e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

benchdiff --run=^BenchmarkSysbench/SQL/3node/oltp_read_write$ --old=6bada5e --new=a696b72 ./pkg/sql/tests
🟢 Sysbench [KV, 1node, local, oltp_read_only]
Metric Old Commit New Commit Delta Note Threshold
🟢 sec/op 675.6µ ±2% 665.8µ ±1% -1.45% p=0.035 n=10 2.0%
errs/op 0.000 ±0% 0.000 ±0% ~ p=1.000 n=10 0.0%
allocs/op 439.0 ±0% 439.0 ±0% ~ p=1.000 n=10 1.5%
B/op 254.2Ki ±0% 254.2Ki ±0% ~ p=0.101 n=10 1.5%
Reproduce

benchdiff binaries:

mkdir -p benchdiff/a696b72/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/a696b72247f3cce50e56f227a74ad3f09ec33c46/bin/pkg_sql_tests benchdiff/a696b72/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/a696b72/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/6bada5e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/6bada5e7bc53457d56fc92acac8cac29127acbcb/bin/pkg_sql_tests benchdiff/6bada5e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/6bada5e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

benchdiff --run=^BenchmarkSysbench/KV/1node_local/oltp_read_only$ --old=6bada5e --new=a696b72 ./pkg/sql/tests
⚪ Sysbench [KV, 1node, local, oltp_write_only]
Metric Old Commit New Commit Delta Note Threshold
sec/op 1.325m ±1% 1.329m ±1% ~ p=0.529 n=10 2.5%
errs/op 0.000 ±0% 0.000 ±0% ~ p=1.000 n=10 0.0%
allocs/op 1.392k ±0% 1.393k ±0% ~ p=0.315 n=10 1.8%
B/op 289.9Ki ±0% 290.1Ki ±0% ~ p=0.105 n=10 1.8%
Reproduce

benchdiff binaries:

mkdir -p benchdiff/a696b72/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/a696b72247f3cce50e56f227a74ad3f09ec33c46/bin/pkg_sql_tests benchdiff/a696b72/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/a696b72/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/6bada5e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/6bada5e7bc53457d56fc92acac8cac29127acbcb/bin/pkg_sql_tests benchdiff/6bada5e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/6bada5e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

benchdiff --run=^BenchmarkSysbench/KV/1node_local/oltp_write_only$ --old=6bada5e --new=a696b72 ./pkg/sql/tests
Artifacts

download:

mkdir -p new
gcloud storage cp gs://cockroach-microbench-ci/artifacts/a696b72247f3cce50e56f227a74ad3f09ec33c46/13462950865-1/\* new/
mkdir -p old
gcloud storage cp gs://cockroach-microbench-ci/artifacts/6bada5e7bc53457d56fc92acac8cac29127acbcb/13462950865-1/\* old/
Legend
  • Neutral: No significant performance change.
  • 🟡 Warning: Slight degradation, likely due to variance, but still within thresholds.
  • 🔴 Regression: Likely performance regression, requiring investigation.
  • 🟢 Improvement: Possible performance gain.

No regressions detected!

built with commit: a696b72247f3cce50e56f227a74ad3f09ec33c46

@fqazi fqazi force-pushed the alterPrimaryKeyFix branch from dd769ad to a696b72 Compare February 21, 2025 18:28
@spilchen
Copy link
Contributor

I haven’t reviewed all the changes yet (I will), but one early comment is that a few tests were disabled that I believe can now be re-enabled. Search for 133129 in:

  • pkg/ccl/testccl/sqlccl/explain_test.go
  • pkg/sql/schemachanger/dml_injection_test.go

@yuzefovich yuzefovich removed their request for review February 24, 2025 23:33
@yuzefovich
Copy link
Member

Let me know if you want someone from Queries to take a look (on a quick glance that isn't needed). The only comment I have is that we should remove some skips that mention #133129 (TestExplainGist, TestAlterTableDMLInjection) and #130165 (operationGenerator.alterTableAlterPrimaryKey).

@fqazi fqazi force-pushed the alterPrimaryKeyFix branch from a696b72 to 40fe1c4 Compare February 25, 2025 13:49
@fqazi fqazi requested a review from a team as a code owner February 25, 2025 13:49
@fqazi fqazi requested review from golgeek and herkolategan and removed request for a team February 25, 2025 13:49
@fqazi
Copy link
Collaborator Author

fqazi commented Feb 25, 2025

@spilchen @yuzefovich Done, and thank you for tracking that down! Think we only need a foundations review for this one, since the backfiller changes are pretty trivial.

@fqazi fqazi force-pushed the alterPrimaryKeyFix branch 2 times, most recently from 2e48634 to f898328 Compare February 26, 2025 16:52
@fqazi fqazi force-pushed the alterPrimaryKeyFix branch 4 times, most recently from 766afaa to d0e6116 Compare March 11, 2025 13:12
@fqazi fqazi requested a review from a team as a code owner March 11, 2025 13:12
@fqazi fqazi requested review from andyyang890 and removed request for a team March 11, 2025 13:12
@fqazi
Copy link
Collaborator Author

fqazi commented Mar 11, 2025

@spilchen RFAL

@fqazi fqazi force-pushed the alterPrimaryKeyFix branch 2 times, most recently from 618d5bc to 0888a1e Compare March 19, 2025 14:22
Copy link
Contributor

@spilchen spilchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 127 of 127 files at r9, 126 of 126 files at r10, 13 of 13 files at r12, 17 of 17 files at r13, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andyyang890, @golgeek, @herkolategan, and @jeffswenson)


pkg/sql/schemachanger/scplan/internal/opgen/opgen_primary_index.go line 130 at r13 (raw file):

						}
						// Index was created as invisible, nothing needs to be done.
						if idx.IsNotVisible || idx.Invisibility == 1.0 {

We have already checked idx.IsNotVisible a few lines above. Was the intent here to check the visibility of the primary index? i.e. this.IsNotVisible || this.Invisibility == 1.0. If not, can we merge the if check.


pkg/sql/schemachanger/scpb/elements.proto line 346 at r13 (raw file):

  uint32 recreate_source_id = 4 [(gogoproto.customname) = "RecreateSourceIndexID", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/sem/catid.IndexID"];
  // If an index is being recreated, this is the target primary
  // index that will make it visible for usage.

Can we clarify that this is the ID of the new primary index? There is also SourceIndexID, which based on the discussion yesterday on slack (https://cockroachlabs.slack.com/archives/C04N0AS14CT/p1742314411019249), is the ID of the primary index too, but the old (?) one.

@fqazi fqazi force-pushed the alterPrimaryKeyFix branch 4 times, most recently from f4815c2 to 4d1aac8 Compare March 19, 2025 14:58
Copy link
Collaborator Author

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andyyang890, @golgeek, @herkolategan, @jeffswenson, and @spilchen)


pkg/sql/schemachanger/scpb/elements.proto line 346 at r13 (raw file):

Previously, spilchen wrote…

Can we clarify that this is the ID of the new primary index? There is also SourceIndexID, which based on the discussion yesterday on slack (https://cockroachlabs.slack.com/archives/C04N0AS14CT/p1742314411019249), is the ID of the primary index too, but the old (?) one.

Done.

I made it clear this is the final index that needs to be public:

// If an index is being recreated, this is the final primary
// index that will make it usable (i.e. the columns required
// to back this index will be public in this index).


pkg/sql/schemachanger/scplan/internal/opgen/opgen_primary_index.go line 130 at r13 (raw file):

Previously, spilchen wrote…

We have already checked idx.IsNotVisible a few lines above. Was the intent here to check the visibility of the primary index? i.e. this.IsNotVisible || this.Invisibility == 1.0. If not, can we merge the if check.

Done.

@fqazi fqazi requested a review from spilchen March 19, 2025 14:58
Copy link
Contributor

@spilchen spilchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 17 of 17 files at r14, 17 of 17 files at r15, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andyyang890, @golgeek, @herkolategan, and @jeffswenson)


pkg/sql/schemachanger/scexec/scmutationexec/index.go line 520 at r15 (raw file):

	// If the primary index is already visible, then nothing
	// needs to be done. This can happen if we are able to sequence
	// publishing the primary and secondary indexes together.g

nit: spelling

Suggestion:

// publishing the primary and secondary indexes together.

pkg/sql/schemachanger/scpb/elements.proto line 347 at r15 (raw file):

  // If an index is being recreated, this is the final primary
  // index that will make it usable (i.e. the columns required
  // to back this index be will public in this index).

nit: suggest slight rewording

Suggestion:

// to back this index will be public in this index)

@fqazi fqazi force-pushed the alterPrimaryKeyFix branch from 4d1aac8 to 3fdccae Compare March 19, 2025 17:42
Previously, when secondary indexes were recreated during a primary index
change, they were made public before the new primary index was fully
public. This posed a risk: queries could potentially access the
secondary indexes before they were fully ready. To address this, this
patch makes secondary index replacements invisible until the new primary
index backing them is ready.

Release note: None
@fqazi fqazi force-pushed the alterPrimaryKeyFix branch from 3fdccae to 6cf5326 Compare March 19, 2025 18:13
@fqazi
Copy link
Collaborator Author

fqazi commented Mar 19, 2025

@spilchen TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 19, 2025

@craig craig bot merged commit 606cc0d into cockroachdb:master Mar 19, 2025
23 of 24 checks passed
fqazi added a commit to fqazi/cockroach that referenced this pull request Jun 3, 2025
Previously, before version 25.2, ALTER PRIMARY KEY operations in the
schema changer workload could fail due to duplicate column errors when
the declarative schema changer recreated secondary indexes. This
occurred because the declarative schema changer did not properly create
secondary indexes that were not subsets of each other. This issue was
fixed in cockroachdb#141850. To account for this, the patch modifies the schema
changer workload to anticipate index recreation errors as a potential
issue when running in a mixed-version state.

Fixes: cockroachdb#147259
Release note: None
fqazi added a commit to fqazi/cockroach that referenced this pull request Jun 3, 2025
Previously, before version 25.2, ALTER PRIMARY KEY operations in the
schema changer workload could fail due to duplicate column errors when
the declarative schema changer recreated secondary indexes. This
occurred because the declarative schema changer did not properly create
secondary indexes that were not subsets of each other. This issue was
fixed in cockroachdb#141850. To account for this, the patch modifies the schema
changer workload to anticipate index recreation errors as a potential
issue when running in a mixed-version state.

Fixes: cockroachdb#147259
Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql/schemachanger: Secondary indexes and PK are inconsistent during ALTER PK schemachanger: properly clean up stored columns used in PK

4 participants