
Enable backups for existing cluster #182

Closed
micronax opened this issue Dec 17, 2018 · 12 comments · Fixed by #183 or #570

@micronax

Is there any way to enable backups for an existing cluster, or do I need to recreate the cluster with the backupSchedule specs?

At the very least, the README is unclear on this point.

@HBO2
Contributor

HBO2 commented Dec 18, 2018

@AMecea I am not sure, but I think the scheduled backup is still not working, right? I am using the latest operator and confirmed that the backup itself works, but scheduling in the cluster does not seem to work, because the cron trigger is doing nothing:

....
spec:
  backupSchedule: 0 0 */20 * * *
  backupScheduleJobsHistoryLimit: 1
  backupSecretName: my-cluster-backup-secret
  backupUri: s3://backup/
....

@AMecea
Contributor

AMecea commented Dec 18, 2018

Indeed, this seems to be a bug; this part is not tested enough. I will fix this issue ASAP. Thanks for reporting it!

@AMecea AMecea added the bug label Dec 18, 2018
@AMecea AMecea added this to the 0.2.x milestone Dec 18, 2018
@micronax
Author

OK, one more thing: how are backups supposed to work? I just followed all the instructions. A cluster-backup-bjob is created but does not run. Also, I can't see any output from any sidecar, etc.

I set up a 5-minute schedule for testing: */5 * * * *

@AMecea
Contributor

AMecea commented Dec 19, 2018

The schedule consists of six fields (so you should have something like 0 */5 * * * *); check the godoc of the cron library that we use.

And make sure that you are using the version v0.2.2. Please let me know if it works.
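To illustrate the six-field format: the 5-minute test schedule from above would be written with a leading seconds field, roughly like this in the cluster spec (a sketch based on the field names used elsewhere in this thread, not an official example):

spec:
  # six fields: seconds minutes hours day-of-month month day-of-week
  # fires at second 0 of every 5th minute
  backupSchedule: '0 */5 * * * *'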

@micronax
Author

OK, updated the schedule. Now waiting.

Do I need to upgrade the clusters as well? I upgraded the Helm chart, which triggered a successful recreation of the operator, but the cluster itself stayed untouched.

@AMecea
Contributor

AMecea commented Dec 19, 2018

Only the operator needs to be updated; the cluster is fine. You may need to update the cluster's backup schedule, though.

@micronax
Author

It seems like it is not yet working :(

cluster.yaml

apiVersion: mysql.presslabs.org/v1alpha1
kind: MysqlCluster
metadata:
  name: cluster
spec:
  replicas: 2
  secretName: cluster-secret

  backupSchedule: '0 */5 * * * *'
  backupSecretName: cluster-backup-secret
  backupUri: gs://*****/
  backupScheduleJobsHistoryLimit: 10

  volumeSpec:
    accessModes: [ "ReadWriteOnce" ]
    resources:
      requests:
        storage: 20Gi

cluster-backup.yaml

apiVersion: mysql.presslabs.org/v1alpha1
kind: MysqlBackup
metadata:
  name: cluster-backup

spec:
  clusterName: cluster
  backupUri: gs://*****/
  backupSecretName: cluster-backup-secret

The secrets are also created as described in the README.

Some more debugging output:

$ kubectl get mysqlbackup --namespace mysql
NAME             AGE
cluster-backup   8m

$ kubectl describe backup cluster-backup --namespace mysql
error: the server doesn't have a resource type "backup"

$ kubectl describe mysqlbackup --namespace mysql
Name:         cluster-backup
Namespace:    mysql
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"mysql.presslabs.org/v1alpha1","kind":"MysqlBackup","metadata":{"annotations":{},"name":"cluster-backup","namespace":"mysql"},"spec":{"ba...
API Version:  mysql.presslabs.org/v1alpha1
Kind:         MysqlBackup
Metadata:
  Creation Timestamp:  2018-12-19T15:15:33Z
  Generation:          1
  Resource Version:    4407591
  Self Link:           /apis/mysql.presslabs.org/v1alpha1/namespaces/mysql/mysqlbackups/cluster-backup
  UID:                 ee55435c-03a0-11e9-9623-901b0ed5b6c1
Spec:
  Backup Secret Name:  cluster-backup-secret
  Backup URL:          gs://*******/cluster-2018-12-19T15:15:33.xbackup.gz
  Backup Uri:          gs://*******/
  Cluster Name:        cluster
Status:
Events:
  Type    Reason              Age   From                    Message
  ----    ------              ----  ----                    -------
  Normal  JobSyncSuccessfull  9m    mysqlbackup-controller  *v1.Job mysql/cluster-backup-bjob created successfully

Misconfiguration or bug?

@AMecea
Contributor

AMecea commented Dec 19, 2018

The config is fine. Can you show me some logs from the operator container related to this? Do recurring backups work for a newly created cluster?

@AMecea
Contributor

AMecea commented Dec 19, 2018

Also, while testing on my end, I found a bug: the cleanup does not delete the right backups; it deletes the newly created backups instead of the old ones. So don't use backupScheduleJobsHistoryLimit: 1; use backupScheduleJobsHistoryLimit: null instead.
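In the cluster spec, the workaround would look something like this (a sketch using the field names from the examples earlier in this thread):

spec:
  # workaround: disable history-based cleanup so newly created
  # backups are not deleted by the buggy cleanup logic
  backupScheduleJobsHistoryLimit: null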

I will reopen this issue until this is solved.

@AMecea AMecea reopened this Dec 19, 2018
@micronax
Author

I just found out that after the upgrade the whole setup is broken:

Orchestrator keeps throwing error messages like:

[martini] Started GET /api/cluster/cluster.mysql for 10.42.3.59:35542
2018-12-20 15:03:09 ERROR Unable to determine cluster name. clusterHint=cluster.mysql
[martini] Completed 500 Internal Server Error in 8.181376ms
[martini] Started GET /api/discover/cluster-mysql-0.cluster-mysql-nodes.mysql/3306 for 10.42.3.59:35542
[martini] Completed 500 Internal Server Error in 14.348477ms
[martini] Started GET /api/discover/cluster-mysql-1.cluster-mysql-nodes.mysql/3306 for 10.42.3.59:35542
2018-12-20 15:03:09 ERROR x509: certificate is valid for MySQL_Server_5.7.23-24_Auto_Generated_Server_Certificate, not cluster-mysql-0.cluster-mysql-nodes.mysql
2018-12-20 15:03:09 ERROR x509: certificate is valid for MySQL_Server_5.7.23-24_Auto_Generated_Server_Certificate, not cluster-mysql-1.cluster-mysql-nodes.mysql
[martini] Completed 500 Internal Server Error in 18.537356ms

Also, the cluster lost its connection to the operator/orchestrator with:

[Note] Access denied for user 'orchestrator'@'10.42.4.139' (using password: YES)

while I never changed any passwords or secrets...

@AMecea
Contributor

AMecea commented Dec 20, 2018

I know about this issue: you have to use --reuse-values (in the helm command) to keep the orchestrator credentials unchanged when doing an upgrade.

To fix this, you can scale the cluster to 0 replicas and then back to normal. This will re-update the orchestrator credentials.
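Concretely, the workaround might look like the following (the release name, chart name, and patch commands are assumptions pieced together from this thread, not verified commands):

$ helm upgrade mysql-operator presslabs/mysql-operator --reuse-values
$ kubectl patch mysqlcluster cluster --namespace mysql --type merge -p '{"spec":{"replicas":0}}'
$ kubectl patch mysqlcluster cluster --namespace mysql --type merge -p '{"spec":{"replicas":2}}'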

Please let me know if this works or if you encounter any problems.

@micronax
Author

OK, thank you. I already re-installed the mysql-operator. But now I get the following error during cluster init 🙈

--initialize specified but the data directory has files in it. Aborting.

I deleted the whole namespace, including volumes, secrets and config maps.

@calind calind modified the milestones: 0.2.x, 0.2.3 Jan 7, 2019
AMecea added a commit that referenced this issue Jan 7, 2019
AMecea added a commit that referenced this issue Jan 17, 2019
@AMecea AMecea closed this as completed in fb64694 Jan 17, 2019
usernameisnull pushed a commit to usernameisnull/mysql-operator that referenced this issue Mar 3, 2024
reset our envars at each start of a new build

it seems jenkins keeps values of envars in the case of a restarted build
so we could get a result like:
Status: changed, regression, unsuccessful, failure, changed, unsuccessful, unstable
see e.g., status of the build trunk bitpoke#182 which restarted from build bitpoke#181

Change-Id: I17b125de5306835b0cdad406c671b30317b75960