-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server,upgrades: bootstrap at binaryMinVersion and run upgrades to BinaryVersionOverride #99082
server,upgrades: bootstrap at binaryMinVersion and run upgrades to BinaryVersionOverride #99082
Conversation
97483fb
to
e7bd943
Compare
e7bd943
to
c4db729
Compare
For context, I need some version of this patch to be able to drop columns from the |
@@ -125,7 +125,7 @@ CREATE SEQUENCE system.role_id_seq START 100 MINVALUE 100 MAXVALUE 2147483647;` | |||
serviceLatencyComputeExpr = `(((statistics->'statistics':::STRING)->'svcLat':::STRING)->'mean':::STRING)::FLOAT8` | |||
cpuSqlNanosComputeExpr = `(((statistics->'execution_statistics':::STRING)->'cpuSQLNanos':::STRING)->'mean':::STRING)::FLOAT8` | |||
contentionTimeComputeExpr = `(((statistics->'execution_statistics':::STRING)->'contentionTime':::STRING)->'mean':::STRING)::FLOAT8` | |||
totalEstimatedExecutionTimeExpr = `((statistics->'statistics':::STRING)->'cnt':::STRING)::FLOAT8 * (((statistics->'statistics':::STRING)->'svcLat':::STRING)->>'mean':::STRING)::FLOAT8` | |||
totalEstimatedExecutionTimeExpr = `((statistics->'statistics':::STRING)->>'cnt':::STRING)::FLOAT8 * (((statistics->'statistics':::STRING)->'svcLat':::STRING)->>'mean':::STRING)::FLOAT8` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc: @ericharmeling this is a small diff that was caught now that we actually run upgrades during tests that override the binary version. This expression deferred slightly from the expression used in the migration that adds this column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for fixing this. I should've caught this when adding the migration ( 🤦 ), but it makes sense that the upgrade tests didn't at the time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent an hour trying to page this back in, including talking to Aditya offline. We wrote up https://cockroachlabs.slack.com/files/UHXRNP9D1/F0503KK3VFE/unit_testing_gap_with_upgrades for @cockroachdb/test-eng consideration. I'm going to recuse myself from the code reviews (it looks ok to me, but the audit of case2 tests seems worth doing).
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @smg260, and @srosenberg)
// ------ | ||
// If the test has not overridden the | ||
// `cluster.Settings.Version.BinaryMinSupportedVersion`, then the cluster will | ||
// be bootstrapped at `binaryMinSupportedVersion` (if this server is the one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we'd insist the test specify the bootstrap version if it specifies a binary version override, rather than just picking binaryMin. I get that this would mean changing every test right now but maybe leave a TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, added a TODO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall this sounds good to me -- tests that think they're testing against an older version should look more like that older version, including its bootstrapped state, not new state. I have a couple nits in how we decide what state to bootstrap (I'd demand a test pick instead of assuming min -- which can change -- is what they want), but those don't need to block this since as-is it is already much more correct than the previous state of mixed-version tests being mostly ineffective due to flawed setup
c4db729
to
5d99c57
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me, thanks for fixing this!
If I understand it, this behavior is not new and has existed for multiple releases. Have we seen differences in the schema of a cluster bootstrapped at a newer version vs one that was migrated from an older version (like what was fixed in this PR)?
If not, were we just "lucky" before, or are there other tests that would've caught such differences?
…naryVersionOverride Previously, tests that set BinaryVersionOverride would bootstrap their test clusters at this overridden value. As part of this bootstrap we would initialize the cluster with data as it appears on `binaryVersion` i.e. system tables would be created using the schemas defined on the current `binaryVersion`. Most tests that set BinaryVersionOverride would then manually call into `upgrade.Upgrade` to move the cluster version forward. Since we have already bootstrapped the initial data at `binaryVersion` these upgrades would not *really* run. We would just be testing their idempotency unless the test did some gymnastics to "undo" the bootstrap and then test the actual migration. So effectively, the `BinaryVersionOverride` made the server think it was running that version but none of the associated upgrade logic to reach that version was exercised. In 23.1 we baked in the initial data that is bootstrapped on `binaryMinSupportedVersion`. This development allows us to bootstrap the test cluster to `binaryMinSupportedVersion` and *actually* step through the upgrades until `BinaryVersionOverride` before running the test. The new behaviour of setting this override is described below: This value, when set, influences test cluster/server creation in two different ways: Case 1: ------ If the test has not overridden the `cluster.Settings.Version.BinaryMinSupportedVersion`, then the cluster will be bootstrapped at `binaryMinSupportedVersion` (if this server is the one bootstrapping the cluster). After all the servers in the test cluster have been started, `SET CLUSTER SETTING version = BinaryVersionOverride` will be run to step through the upgrades until the specified override. Case 2: ------ If the test has overridden the `cluster.Settings.Version.BinaryMinSupportedVersion` then it is not safe for us to bootstrap at `binaryMinSupportedVersion` as it might be less than the overridden minimum supported version. Furthermore, we do not have the initial cluster data (system tables etc.) to bootstrap at the overridden minimum supported version. In this case we bootstrap at `BinaryVersionOverride` and populate the cluster with initial data corresponding to the `binaryVersion`. In other words no upgrades are *really* run and the server only thinks that it is running at `BinaryVersionOverride`. Tests that fall in this category should be audited for correctness. The version that we bootstrap at is also used when advertising this server's binary version when sending out join requests. Informs: cockroachdb#97762 Release note: None
5d99c57
to
1d1ac7a
Compare
yup that is correct
as you pointed out online I think we were lucky before. Going forward however |
TFTR! bors r=dt,renatolabs |
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. Backport to branch 23.1.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Previously, tests that set BinaryVersionOverride would bootstrap
their test clusters at this overridden value. As part of this bootstrap
we would initialize the cluster with data as it appears on
binaryVersion
i.e. system tables would be created using the schemas defined on the current
binaryVersion
.Most tests that set BinaryVersionOverride would then manually call into
upgrade.Upgrade
to move the cluster version forward. Since we have already bootstrapped the initial
data at
binaryVersion
these upgrades would not really run. We would just be testingtheir idempotency unless the test did some gymnastics to "undo" the bootstrap and
then test the actual migration. So effectively, the
BinaryVersionOverride
madethe server think it was running that version but none of the associated upgrade
logic to reach that version was exercised.
In 23.1 we baked in the initial data that is bootstrapped on
binaryMinSupportedVersion
.This development allows us to bootstrap the test cluster to
binaryMinSupportedVersion
and actually step through the upgrades until
BinaryVersionOverride
before running thetest.
The new behaviour of setting this override is described below:
This value, when set, influences test cluster/server creation in two
different ways:
Informs: #97762
Release note: None