Fail restore if warmup fails only during check-wal-only strategy #5621

Merged


michaelmdeng

What problem does this PR solve?

#5569 updates the volume-snapshot restore process to fail the entire restore if any warmup job fails. We use this to quickly check the viability of a restore and to terminate restore processing early if corruption is detected.

#5572 updates the volume-snapshot restore process to allow recovery from corruption in a single TiKV through manual cluster operations. We use this during a full restore in case we encounter corruption.

These features conflict with each other. If we want to perform a full restore and use single-TiKV recovery in the event of corruption, we cannot fail the restore during warmup; instead, we need to complete the warmup stage and progress to restarting the TiKVs. If we only want to check the viability of a restore, we are OK with failing the restore and not progressing to any further steps. Thus, we gate this failure behavior behind the check-wal-only strategy.

What is changed and how does it work?

Fail the restore on warmup failure only when the warmup strategy is check-wal-only. For any other warmup strategy, a warmup failure no longer fails the restore.
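
As a rough illustration of this gating (not the code from this PR), the decision could look something like the sketch below. The `WarmupStrategy` type, the `StrategyCheckWALOnly` and `StrategyOther` constants, the `errWarmupFailed` value, and the `shouldFailRestore` helper are all hypothetical names chosen for this example; only the `check-wal-only` strategy name comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// WarmupStrategy is a hypothetical stand-in for the restore's warmup
// strategy setting; it is not the operator's actual type.
type WarmupStrategy string

const (
	// StrategyCheckWALOnly mirrors the "check-wal-only" strategy named in the PR.
	StrategyCheckWALOnly WarmupStrategy = "check-wal-only"
	// StrategyOther is an illustrative placeholder for any other strategy.
	StrategyOther WarmupStrategy = "other"
)

// errWarmupFailed is a hypothetical error representing a failed warmup job.
var errWarmupFailed = errors.New("warmup job failed")

// shouldFailRestore reports whether a warmup error should fail the whole
// restore. Only the check-wal-only strategy treats a warmup failure as fatal;
// other strategies let the restore continue so that single-TiKV recovery
// remains possible later.
func shouldFailRestore(strategy WarmupStrategy, warmupErr error) bool {
	return warmupErr != nil && strategy == StrategyCheckWALOnly
}

func main() {
	for _, s := range []WarmupStrategy{StrategyCheckWALOnly, StrategyOther} {
		fmt.Printf("strategy=%s failRestore=%v\n", s, shouldFailRestore(s, errWarmupFailed))
	}
}
```

With this shape, a viability check fails fast on corruption, while a full restore proceeds past warmup to the TiKV restart step.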

Code changes

  • Has Go code change
  • Has CI related scripts change

Tests

  • Unit test
  • E2E test
  • Manual test
  • No code

Side effects

  • Breaking backward compatibility
  • Other side effects:

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

Release Notes

Please refer to Release Notes Language Style Guide before writing the release note.


@ti-chi-bot ti-chi-bot bot requested a review from howardlau1999 April 16, 2024 22:06
@sre-bot

sre-bot commented Apr 16, 2024

CLA assistant check
All committers have signed the CLA.


@YuJuncen YuJuncen left a comment


Rest LGTM

Two review comments on pkg/backup/restore/restore_manager_test.go (outdated, resolved)

ti-chi-bot bot commented Apr 24, 2024

@YuJuncen: adding LGTM is restricted to approvers and reviewers in OWNERS files.

In response to this:

Rest LGTM

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@BornChanger

/test-pull-e2e-kind-br

michaelmdeng and others added 3 commits April 24, 2024 12:59
Co-authored-by: 山岚 <36239017+YuJuncen@users.noreply.github.com>
Co-authored-by: 山岚 <36239017+YuJuncen@users.noreply.github.com>

ti-chi-bot bot commented Apr 25, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BornChanger, YuJuncen

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot removed the lgtm label Apr 25, 2024

ti-chi-bot bot commented Apr 25, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-04-25 10:08:11.994056981 +0000 UTC m=+254848.733959894: ☑️ agreed by BornChanger.
  • 2024-04-25 10:34:27.294165343 +0000 UTC m=+256424.034068255: ✖️🔁 reset by ti-chi-bot[bot].


ti-chi-bot bot commented Apr 25, 2024

New changes are detected. LGTM label has been removed.

@csuzhangxc csuzhangxc merged commit 5fc0f19 into pingcap:master Apr 25, 2024
5 of 6 checks passed
@csuzhangxc

/cherry-pick release-1.5

@ti-chi-bot

@csuzhangxc: new pull request created to branch release-1.5: #5636.

In response to this:

/cherry-pick release-1.5

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

csuzhangxc pushed a commit that referenced this pull request Apr 25, 2024
…5621) (#5636)

Co-authored-by: Michael Deng <michaelmdeng@gmail.com>
Co-authored-by: Michael Deng <33045922+michaelmdeng@users.noreply.github.com>
Co-authored-by: 山岚 <36239017+YuJuncen@users.noreply.github.com>