Conversation

@SangJunBak SangJunBak commented Nov 20, 2025

Motivation

I feel like these are pretty important instructions to have!

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@SangJunBak SangJunBak requested a review from a team as a code owner November 20, 2025 00:04
@SangJunBak SangJunBak requested review from jubrad and kay-kim November 20, 2025 00:04
Contributor

@jubrad jubrad left a comment

I'm still confused about this being in installation/_index.md rather than a specific upgrade overview file, but that doesn't need to be addressed in this PR.

Comment on lines 179 to 197
### Cancelling the Upgrade
To cancel an in-progress rollout and revert to the last completed rollout state, first retrieve the last rollout request ID from your Materialize CR. You can do this with the following command:

```shell
kubectl get materialize <instance-name> -n materialize-environment -o jsonpath='{.status.lastCompletedRolloutRequest}'
```

Next, get the previous `environmentdImageRef` from the `last-applied-configuration` annotation:
```shell
kubectl get materialize <instance-name> -n materialize-environment -o jsonpath='{.metadata.annotations.kubectl\.kubernetes\.io/last-applied-configuration}'
```
Finally, run the following command using the values obtained above:

```shell
kubectl patch materialize <instance-name> \
-n materialize-environment \
--type='merge' \
-p "{\"spec\": {\"requestRollout\": \"<lastCompletedRolloutRequest-value>\", \"environmentdImageRef\": \"<previous-environmentdImageRef>\" }}"
```
Contributor

In general I like the idea of having examples, but I think we also need to make sure we're providing adequate context on the interfaces we expose. People are going to deploy this through Terraform, Kustomize, Pulumi, and who knows what else.

I would keep what we have here, but also talk about this a bit more in the context of the Materialize spec: at what point in the rollout one might want to use this, and why. I'm also not sure about this process. We don't support downgrade, so we need to make sure we're not rolling back a completed upgrade.
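
To make the "not rolling back a completed upgrade" concern concrete, here is a rough sketch of a guard that could run before any cancel step. It assumes a rollout is still in progress exactly when `spec.requestRollout` differs from `status.lastCompletedRolloutRequest`; that interpretation is an assumption, not something confirmed in this thread, and the helper name `rollout_in_progress` is hypothetical.

```shell
# Hypothetical helper: succeeds (exit 0) if a rollout appears to be in progress
# for Materialize instance $1 in namespace $2.
# ASSUMPTION: "in progress" means spec.requestRollout differs from
# status.lastCompletedRolloutRequest.
rollout_in_progress() {
  requested=$(kubectl get materialize "$1" -n "$2" \
    -o jsonpath='{.spec.requestRollout}') || return 2
  completed=$(kubectl get materialize "$1" -n "$2" \
    -o jsonpath='{.status.lastCompletedRolloutRequest}') || return 2
  # Exit status 0 when the two IDs differ, 1 when they match.
  [ "$requested" != "$completed" ]
}
```

It could be called as `rollout_in_progress <instance-name> materialize-environment` and the cancel instructions skipped when it returns non-zero.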

Contributor Author

> I also am not sure about this process. We don't support downgrade so we need to make sure we're not rolling back a completed upgrade.

This makes sense to me! However, given that the rollout request goes back to the one for the last successful upgrade, wouldn't that keep it from initiating a new rollout?

Contributor Author

> I also am not sure about this process.

These are the instructions @doy-materialize gave me too. But what's the correct process here?

Contributor

i think i would probably just not try to restore environmentdImageRef or anything like that - there are a lot of different things that could have changed other than the version, and the specific changes don't really have anything to do with cancelling the rollout. i would just give instructions on how to restore the requestRollout field and leave it at that.
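
A minimal sketch of what instructions that restore only `requestRollout` might look like, reusing the field names and `kubectl` invocations from the snippet under review. The helper names `build_cancel_patch` and `cancel_rollout` are hypothetical, and this is an illustration rather than the final docs wording.

```shell
# Hypothetical helper: build a merge patch that restores only spec.requestRollout.
# It deliberately leaves environmentdImageRef and every other field untouched.
build_cancel_patch() {
  printf '{"spec": {"requestRollout": "%s"}}' "$1"
}

# Hypothetical helper: cancel an in-progress rollout for instance $1 in namespace $2
# by setting requestRollout back to the last completed rollout request ID.
cancel_rollout() {
  last_completed=$(kubectl get materialize "$1" -n "$2" \
    -o jsonpath='{.status.lastCompletedRolloutRequest}') || return 1
  kubectl patch materialize "$1" -n "$2" --type='merge' \
    -p "$(build_cancel_patch "$last_completed")"
}
```

For example: `cancel_rollout <instance-name> materialize-environment`.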

Contributor

I think if someone wanted to revert to the prior version the only way it would be safe is to first revert requestRollout then revert the environmentdImageRef, but it's probably fine to just show how to cancel without reverting this.

Contributor

not reverting environmentdImageRef wouldn't be unsafe - it won't try doing the upgrade again unless they explicitly trigger it

Contributor Author

The change is in 0283767. However, I think I added this instruction because if you just reset the requestRollout field in upgrade-materialize.yaml but keep the environmentdImageRef you tried to update to, the environment doesn't fully go back to its original state. Notably, the balancerd pods don't get cleaned up:

[Screenshot, 2025-11-21 at 4:39:36 PM]

But this might just be a bug w/ orchestratord

Contributor

oh, yeah, that's just a bug - i'm currently rewriting how balancerd gets deployed, i'll make sure to fix that as part of this
