Add Nebula scheduled backup CRD #416

Merged
merged 11 commits into from
Jan 27, 2024
82 changes: 58 additions & 24 deletions apis/apps/v1alpha1/backupschedule_types.go
@@ -20,54 +20,88 @@ import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

var (
maxSuccessfulBackupJobsDef int32 = 3
maxSuccessfulFailedJobsDef int32 = 3
)

// +genclient
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:resource:shortName="bs"
// +kubebuilder:resource:shortName="nsb"
// +kubebuilder:printcolumn:name="Schedule",type=string,JSONPath=`.spec.schedule`,description="The current schedule set for the scheduled backup"
// +kubebuilder:printcolumn:name="Pause",type=string,JSONPath=`.spec.pause`,description="Whether or not the scheduled backup is paused"
// +kubebuilder:printcolumn:name="Last Triggered Backup",type=string,JSONPath=`.status.lastScheduledBackupTime`,description="The timestamp at which the last backup was triggered"
// +kubebuilder:printcolumn:name="Last Successful Backup",format=date-time,type=string,JSONPath=`.status.lastSuccessfulBackupTime`,description="The timestamp at which the last backup was successfully completed"
// +kubebuilder:printcolumn:name="Age",type=date,JSONPath=`.metadata.creationTimestamp`

type BackupSchedule struct {
type NebulaScheduledBackup struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`

Spec BackupScheduleSpec `json:"spec,omitempty"`
Status BackupScheduleStatus `json:"status,omitempty"`
Spec ScheduledBackupSpec `json:"spec,omitempty"`
Status ScheduledBackupStatus `json:"status,omitempty"`
}

// +kubebuilder:object:root=true
// BackupScheduleList contains a list of BackupSchedule.
type BackupScheduleList struct {
// NebulaScheduledBackupList contains a list of NebulaScheduledBackup.
type NebulaScheduledBackupList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`

Items []BackupSchedule `json:"items"`
Items []NebulaScheduledBackup `json:"items"`
}

// BackupScheduleSpec contains the specification for a backupSchedule of a nebula cluster backupSchedule.
type BackupScheduleSpec struct {
// ScheduledBackupSpec contains the specification for a NebulaScheduledBackup of a nebula cluster.
type ScheduledBackupSpec struct {
// Schedule specifies the cron string used for backup scheduling.
Schedule string `json:"schedule"`
// Pause means paused backupSchedule
Pause bool `json:"pause,omitempty"`
// MaxBackups is to specify how many backups we want to keep
// 0 is magic number to indicate un-limited backups.
// if MaxBackups and MaxReservedTime are set at the same time, MaxReservedTime is preferred
Pause *bool `json:"pause,omitempty"`
// MaxBackups specifies how many backups we want to keep in the remote storage bucket.
// 0 is the magic number to indicate unlimited backups.
// if both MaxBackups and MaxReservedTime are set at the same time, MaxReservedTime will be used
// and MaxBackups is ignored.
MaxBackups *int32 `json:"maxBackups,omitempty"`
// MaxReservedTime is to specify how long backups we want to keep.
MaxReservedTime *string `json:"maxReservedTime,omitempty"`
// BackupTemplate is the specification of the backup structure to get scheduled.
// MaxRetentionTime specifies how long we want the backups in the remote storage bucket to be kept for.
Comment on lines +63 to +66
Contributor

Shouldn't a backup be deleted as soon as either of these two parameters meets its condition, instead of letting users set both and then using only one as the basis for the decision?

Contributor Author

The current design is to use MaxReservedTime if both are specified, as this is simpler for the customer to understand and easier to implement. Otherwise we don't know whether the customer means "delete a backup if there are more than MaxBackups and the backup is older than MaxReservedTime" or "delete a backup if there are more than MaxBackups or if the backup is older than MaxReservedTime".

We can definitely think about which version we want to support and change the design to accommodate both parameters in a future version if there's customer demand for this. What do you think, @MegaByte875?

Contributor

Here's an example for a simple task: a log cleaning function. There are two settings: one for the maximum number of days to be retained, and one for the maximum storage to be kept. If a user configures both, how should we proceed?

I assume that the user wants to keep a certain number of days, but they also don't want to consume too much storage.

Returning to this issue, I think it's similar. By the way, if the user only wants to retain a certain number of days, why would they configure MaxBackups?

// +kubebuilder:validation:Pattern=`^([0-9]+(\.[0-9]+)?(s|m|h))+$`
MaxRetentionTime *string `json:"maxRetentionTime,omitempty"`
// BackupTemplate is the specification of the backup structure to schedule.
BackupTemplate BackupSpec `json:"backupTemplate"`
// LogBackupTemplate is the specification of the log backup structure to get scheduled.
// MaxSuccessfulNebulaBackupJobs specifies the maximum number of successful backup jobs to keep. Default 3.
MaxSuccessfulNebulaBackupJobs *int32 `json:"maxSuccessfulNebulaBackupJobs,omitempty"`
// MaxFailedNebulaBackupJobs specifies the maximum number of failed backup jobs to keep. Default 3
MaxFailedNebulaBackupJobs *int32 `json:"maxFailedNebulaBackupJobs,omitempty"`
Comment on lines +72 to +74
Contributor

Why are these fields necessary? I believe users aren't particularly concerned with how many NebulaBackups need to be retained. This is precisely the task for NebulaScheduled. It determines retention based on MaxBackups and MaxRetentionTime.

Additionally, I think the lifecycle of NebulaBackup should be responsible for determining when the corresponding storage is deleted. In other words, when NebulaBackup is deleted, the corresponding storage should also be deleted, maintaining a consistent lifecycle.

NebulaScheduled only operates on NebulaBackup, deleting relevant resources when NebulaBackup is removed.

Contributor Author

Yes, MaxBackups and MaxRetentionTime are for determining how many backups to keep in either the S3 bucket or the GCP bucket. MaxSuccessfulNebulaBackupJobs and MaxFailedNebulaBackupJobs are there to specify how many backup jobs the user wants to keep, as the user may want to keep the backup itself but not the actual job that created it. This also allows the user to keep all failed backup logs if needed. @MegaByte875, what do you think?

}

// ScheduledBackupStatus represents the current status of a nebula cluster NebulaScheduledBackup.
type ScheduledBackupStatus struct {
// CurrPauseStatus represents the current pause status of the nebula scheduled backup.
CurrPauseStatus *bool `json:"currPauseStatus,omitempty"`
// LastBackup represents the last backup. Used for scheduled incremental backups. Not supported for now.
//LastBackup string `json:"lastBackup,omitempty"`
// LastScheduledBackupTime represents the last time a backup job was successfully scheduled.
LastScheduledBackupTime *metav1.Time `json:"lastScheduledBackupTime,omitempty"`
// LastSuccessfulBackupTime represents the last time a backup was successfully created.
LastSuccessfulBackupTime *metav1.Time `json:"lastSuccessfulBackupTime,omitempty"`
// NumberOfSuccessfulBackups represents the total number of successful Nebula Backups run by this Nebula Scheduled Backup.
NumberOfSuccessfulBackups *int32 `json:"numberofSuccessfulBackups,omitempty"`
// NumberOfFailedBackups represents the total number of failed Nebula Backups run by this Nebula Scheduled Backup.
NumberOfFailedBackups *int32 `json:"numberofFailedBackups,omitempty"`
// MostRecentJobFailed represents if the most recent backup job failed.
MostRecentJobFailed *bool `json:"mostRecentJobFailed,omitempty"`
}

// BackupScheduleStatus represents the current status of a nebula cluster backupSchedule.
type BackupScheduleStatus struct {
// LastBackup represents the last backup.
LastBackup string `json:"lastBackup,omitempty"`
// LastBackupTime represents the last time the backup was successfully created.
LastBackupTime *metav1.Time `json:"lastBackupTime,omitempty"`
// Default sets default values for unset fields of the NebulaScheduledBackup spec.
func (nsb *NebulaScheduledBackup) Default() {
if nsb.Spec.MaxSuccessfulNebulaBackupJobs == nil {
nsb.Spec.MaxSuccessfulNebulaBackupJobs = &maxSuccessfulBackupJobsDef
}
if nsb.Spec.MaxFailedNebulaBackupJobs == nil {
nsb.Spec.MaxFailedNebulaBackupJobs = &maxSuccessfulFailedJobsDef
}
}

func init() {
SchemeBuilder.Register(&BackupSchedule{}, &BackupScheduleList{})
SchemeBuilder.Register(&NebulaScheduledBackup{}, &NebulaScheduledBackupList{})
}
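
One observation on the Default() method above, which was not raised in the review: it stores the address of a package-level variable, so every defaulted object points at the same int32, and a write through one object's pointer would be visible to all of them. A minimal sketch of an alternative that hands each object its own copy is shown here. The `spec` type, the `int32Ptr` helper, and the `defaultMaxJobs` constant are hypothetical names used only for illustration.

```go
package main

import "fmt"

// defaultMaxJobs mirrors the default of 3 used in the diff (hypothetical name).
const defaultMaxJobs int32 = 3

// spec is a hypothetical, trimmed-down stand-in for ScheduledBackupSpec.
type spec struct {
	MaxSuccessfulJobs *int32
	MaxFailedJobs     *int32
}

// int32Ptr returns a pointer to a fresh copy of v, so callers never share storage.
func int32Ptr(v int32) *int32 { return &v }

// applyDefaults fills unset fields with their own copy of the default value.
func (s *spec) applyDefaults() {
	if s.MaxSuccessfulJobs == nil {
		s.MaxSuccessfulJobs = int32Ptr(defaultMaxJobs)
	}
	if s.MaxFailedJobs == nil {
		s.MaxFailedJobs = int32Ptr(defaultMaxJobs)
	}
}

func main() {
	a, b := &spec{}, &spec{}
	a.applyDefaults()
	b.applyDefaults()

	// Changing a's limit does not affect b, because the pointers are distinct.
	*a.MaxSuccessfulJobs = 10
	fmt.Println(*a.MaxSuccessfulJobs, *b.MaxSuccessfulJobs) // prints "10 3"
}
```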
8 changes: 4 additions & 4 deletions apis/apps/v1alpha1/nebulabackup_types.go
@@ -34,6 +34,8 @@ type NebulaBackup struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`

Env []corev1.EnvVar `json:"env,omitempty"`

Spec BackupSpec `json:"spec,omitempty"`
Status BackupStatus `json:"status,omitempty"`
}
@@ -51,8 +53,6 @@ type NebulaBackupList struct {
type BackupConditionType string

const (
// BackupPending means the backup is pending, waiting for create backup job
BackupPending BackupConditionType = "Pending"
// BackupRunning means the backup is running.
BackupRunning BackupConditionType = "Running"
// BackupComplete means the backup has successfully executed and the
@@ -104,10 +104,10 @@ type BackupSpec struct {
type BackupStatus struct {
// TimeStarted is the time at which the backup was started.
// +nullable
TimeStarted metav1.Time `json:"timeStarted,omitempty"`
TimeStarted *metav1.Time `json:"timeStarted,omitempty"`
// TimeCompleted is the time at which the backup was completed.
// +nullable
TimeCompleted metav1.Time `json:"timeCompleted,omitempty"`
TimeCompleted *metav1.Time `json:"timeCompleted,omitempty"`
// Phase is a user readable state inferred from the underlying Backup conditions
Phase BackupConditionType `json:"phase,omitempty"`
// +nullable
3 changes: 3 additions & 0 deletions apis/apps/v1alpha1/nebularestore_types.go
@@ -97,6 +97,9 @@ type BRConfig struct {
BackupName string `json:"backupName"`
// Concurrency is used to control the number of concurrent file downloads during data restoration.
Concurrency int32 `json:"concurrency,omitempty"`
// StorageProviderType specifies the type of storage backups should be stored in (currently only s3 is supported).
// +kubebuilder:validation:Pattern=`^(s3)$`
StorageProviderType string `json:"storageProviderType,omitempty"`
// StorageProvider configures where and how backups should be stored.
StorageProvider `json:",inline"`
}