Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhances the decision to take full snapshot during startup to avoid missing of any full-snapshot. #574

Merged

Conversation

ishan16696
Copy link
Member

What this PR does / why we need it:
This PR enhances the condition of checking whether to take a full snapshot or not during startup of backup-restore to avoid missing of any full-snapshot within 24hrs of window.

Which issue(s) this PR fixes:
Fixes #570

Special notes for your reviewer:

Release note:

Enhances the decision to take full snapshot during startup of etcd-backup-restore to avoid missing of any full-snapshot.

@ishan16696 ishan16696 requested a review from a team as a code owner January 12, 2023 09:53
@gardener-robot gardener-robot added needs/review Needs review size/s Size of pull request is small (see gardener-robot robot/bots/size.py) labels Jan 12, 2023
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 12, 2023
@ishan16696
Copy link
Member Author

This PR is ready for first review. I will add unit tests.
Hence /do-not-merge

@ishan16696 ishan16696 added the reviewed/do-not-merge Has no approval for merging as it may break things, be of poor quality or have (ext.) dependencies label Jan 12, 2023
@ishan16696 ishan16696 force-pushed the enhance/startup-fullSnapCondition branch from 9c42397 to d480f23 Compare January 12, 2023 09:59
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 12, 2023
@gardener-robot gardener-robot added size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) and removed size/s Size of pull request is small (see gardener-robot robot/bots/size.py) labels Jan 13, 2023
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 13, 2023
@ishan16696
Copy link
Member Author

I have added the unit tests.

@ishan16696 ishan16696 removed the reviewed/do-not-merge Has no approval for merging as it may break things, be of poor quality or have (ext.) dependencies label Jan 13, 2023
@ishan16696 ishan16696 force-pushed the enhance/startup-fullSnapCondition branch from f98c0ab to ce638ea Compare January 13, 2023 12:41
@gardener-robot-ci-1 gardener-robot-ci-1 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 13, 2023
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 13, 2023
@ishan16696 ishan16696 force-pushed the enhance/startup-fullSnapCondition branch from 132191c to e843ff2 Compare January 13, 2023 17:19
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 13, 2023
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 13, 2023
Copy link
Collaborator

@shreyas-s-rao shreyas-s-rao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ishan16696 thanks for your PR. I feel the code still assumes that full snapshot is supposed to be taken every 24 hours, irrespective of the cron schedule, and I feel it needs better parsing of the cron schedule (when was the last scheduled cron timestamp, when is the next scheduled cron timestamp). You can probably use something like https://github.com/aptible/supercronic/tree/master/cronexpr to achieve this (which provides Next and NextN methods, but not Previous method so you might have to somehow derive that from existing methods).

Thanks for adding the tests though.

@@ -520,3 +520,17 @@ func ParsePeerURL(initialAdvertisePeerURLs, podName string) (string, error) {
domaiName := fmt.Sprintf("%s.%s.%s", tokens[1], tokens[2], "svc")
return fmt.Sprintf("%s://%s.%s:%s", tokens[0], podName, domaiName, tokens[3]), nil
}

// GetPrevDayScheduledSnapTime returns the previous day schedule snapshot time.
func GetPrevDayScheduledSnapTime(nextSnapSchedule time.Time) time.Time {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that there was a full snapshot that was taken during the previous day. What if user sets full snapshot schedule for once in 3 days or once a week? This wouldn't apply then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if user sets full snapshot schedule for once in 3 days or once a week?

user can't set the full snapshot schedule ... etcd-druid passed a/c to maintenance window right?
Apart from this even if that's the case when full snapshot schedule for once in 3 days and backup-restore get restart then we will take full snapshot as it crosses the 24 hr of limit, control won't reach GetPrevDayScheduledSnapTime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Shreyas is pointing to the name of this method which assumes that the schedule is once daily. Once you allow user to specify a schedule you cannot assume that it is once per day and then name of the method as per that assumption.

Copy link
Contributor

@unmarshall unmarshall Jan 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Therefore even the computation done in the method is based on that assumption which is incorrect. It would have been really nice if robfig/cron#224 was taken up for implementation. But it is not there as of now.

@@ -42,6 +42,10 @@ import (
"sigs.k8s.io/controller-runtime/pkg/client"
)

const (
recentFullSnapshotPeriodInHours = 23.5
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still hard-coded? Didn't we want to make this dynamic based on full snapshot cron schedule?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is already a schedule present, right. Is it possible to derive this from the schedule instead?

pkg/snapshot/snapshotter/snapshotter_test.go Outdated Show resolved Hide resolved
pkg/snapshot/snapshotter/snapshotter_test.go Outdated Show resolved Hide resolved
pkg/snapshot/snapshotter/snapshotter.go Outdated Show resolved Hide resolved
@gardener-robot gardener-robot added the needs/changes Needs (more) changes label Jan 16, 2023
@ishan16696
Copy link
Member Author

I feel the code still assumes that full snapshot is supposed to be taken every 24 hours,

yes, it does ... ok I will try to make it more dynamic.

@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 9, 2023
Copy link
Collaborator

@shreyas-s-rao shreyas-s-rao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ishan16696 thanks for making the calculation of previous scheduled full snapshot more generic. With this approach of using a timeWindow to determine both previous as well as next full snapshot, etcdbr now assumes that the full snapshot schedule only generates events at fixed (regular) intervals. But cron schedule still allows end-users to define schedules such as 0 9 * * 1-3, which translate to At 09:00 on every day-of-week from Monday through Wednesday - this essentially means that the timeWindow between the events 9am on Tuesday and 9am on Wednesday is not the same as the timeWindow between events 9am on Wednesday and 9am on the following Monday. Hence, etcdbr still does not support all kinds of cron schedules yet.

Can you please add this assumption to the end-user documentation of etcd-backup-restore, so that users are aware of this current limitation and will write their cron schedule accordingly? Thanks

pkg/snapshot/snapshotter/snapshotter.go Outdated Show resolved Hide resolved
Fixed unit tests.
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 9, 2023
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 9, 2023
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 10, 2023
@ishan16696 ishan16696 force-pushed the enhance/startup-fullSnapCondition branch from b417d4c to 4562256 Compare February 10, 2023 11:49
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Feb 10, 2023
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 10, 2023
@ishan16696 ishan16696 force-pushed the enhance/startup-fullSnapCondition branch from 4562256 to 4478894 Compare February 10, 2023 12:15
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Feb 10, 2023
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 10, 2023
@ishan16696 ishan16696 force-pushed the enhance/startup-fullSnapCondition branch from 423743f to fd18b36 Compare February 10, 2023 12:43
@gardener-robot-ci-1 gardener-robot-ci-1 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Feb 10, 2023
Copy link
Collaborator

@shreyas-s-rao shreyas-s-rao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@gardener-robot gardener-robot added reviewed/lgtm Has approval for merging and removed needs/second-opinion Needs second review by someone else labels Feb 10, 2023
@ishan16696 ishan16696 merged commit 2d47113 into gardener:master Feb 13, 2023
@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Feb 13, 2023
@ishan16696 ishan16696 deleted the enhance/startup-fullSnapCondition branch February 13, 2023 10:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) reviewed/lgtm Has approval for merging size/l Size of pull request is large (see gardener-robot robot/bots/size.py) status/closed Issue is closed (either delivered or triaged)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enhance condition to take full snapshot during startup.
9 participants