Replicas never catch up after pod is recreated #588

Open
ynnt opened this issue Aug 13, 2020 · 4 comments

ynnt commented Aug 13, 2020

Hello!
I have an issue with the MySQL cluster not catching up after a failover happens.

Cluster spec:

  spec:
    image: percona:5.7.29-centos
    podSpec:
      metricsExporterResources: {}
      mysqlOperatorSidecarResources: {}
      resources: {}
    replicas: 2
    secretName: my-cluster-secret
    volumeSpec:
      persistentVolumeClaim:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi

After I delete the mysql-0 pod, mysql-1 starts complaining that it can't sync with it:

//my-cluster-mysql-0.mysql.default:3306' - retry-time: 1  retries: 162, Error_code: 2005
2020-08-13T16:37:48.525821Z 939 [ERROR] Slave I/O for channel '': error connecting to master 'sys_replication@//my-cluster-mysql-0.mysql.default:3306' - retry-time: 1  retries: 163, Error_code: 2005
2020-08-13T16:37:49.526148Z 939 [ERROR] Slave I/O for channel '': error connecting to master 'sys_replication@//my-cluster-mysql-0.mysql.default:3306' - retry-time: 1  retries: 164, Error_code: 2005

This continues even after mysql-0 is recreated and successfully starts a replication connection to mysql-1.

I wonder if using a pod name like my-cluster-mysql-0.mysql.default is the issue, since its IP address changes after the pod is recreated.
Would it be possible to experiment with a Service per MySQL pod for the replication connection?
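
Something like this is what I have in mind: one Service per pod, selecting a single StatefulSet pod via the statefulset.kubernetes.io/pod-name label that Kubernetes sets automatically. The Service name and namespace below are just for illustration:

    # Hypothetical per-pod Service for my-cluster-mysql-0 in the default namespace.
    apiVersion: v1
    kind: Service
    metadata:
      name: my-cluster-mysql-0-repl   # illustrative name, not from the operator
      namespace: default
    spec:
      selector:
        # set automatically by Kubernetes on every StatefulSet pod
        statefulset.kubernetes.io/pod-name: my-cluster-mysql-0
      ports:
      - name: mysql
        port: 3306
        targetPort: 3306

The Service's ClusterIP stays stable across pod recreations, so replication peers would not depend on the pod's own IP.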

ynnt changed the title from Replicas never catch up to Replicas never catch up after pod is recreated on Aug 13, 2020
AMecea (Contributor) commented Aug 28, 2020

Hi @ynnt, when mysql-0 is stopped, mysql-1 is promoted to master and is detached from the old master.

Note that the host (sys_replication@//my-cluster-mysql-0.mysql.default:3306) contains a // in the name; that means it is detached. This is how Orchestrator operates.

My question is: does mysql-0 become a replica of mysql-1? If yes, then you can ignore all of those errors. Please connect to Orchestrator and check there that everything is fine: https://www.presslabs.com/docs/mysql-operator/orchestrator/
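
To verify from the MySQL side as well, you can run this on mysql-0; these are standard MySQL 5.7 statements, nothing operator-specific:

    -- Master_Host should point at my-cluster-mysql-1.mysql.default, and both
    -- Slave_IO_Running and Slave_SQL_Running should be Yes after the failover.
    SHOW SLAVE STATUS\G

    -- Replicas are typically kept read-only; a quick sanity check.
    SELECT @@global.read_only, @@global.super_read_only;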

ynnt (Author) commented Sep 11, 2020

Hey!
It can't become a replica because it tries to connect using the old mysql-1 IP address. Orchestrator reports the replication problem.

show slave status; shows this:

    Slave_IO_State: Connecting to master
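
If anyone needs a stopgap, manually re-pointing the stuck replica at the current master might work. This is only a sketch, assuming GTID-based replication and the sys_replication user from the log above; the password is a placeholder:

    -- Manual recovery sketch, NOT a documented mysql-operator procedure.
    STOP SLAVE;
    CHANGE MASTER TO
      MASTER_HOST = 'my-cluster-mysql-1.mysql.default',  -- current master after the failover
      MASTER_PORT = 3306,
      MASTER_USER = 'sys_replication',                   -- user seen in the error log
      MASTER_PASSWORD = '<replication-password>',        -- placeholder, not a real value
      MASTER_AUTO_POSITION = 1;                          -- assumes GTID replication
    START SLAVE;
    SHOW SLAVE STATUS\G

I have not confirmed whether the operator or Orchestrator would override this afterwards.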

ynnt (Author) commented Sep 11, 2020

Also, is there any way to start MySQL with hostNetwork?
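
For reference, on a plain Pod this would be the standard field below; I don't know whether the operator's MysqlCluster podSpec exposes it:

    # Standard Kubernetes Pod-level setting; whether mysql-operator's
    # MysqlCluster podSpec forwards this field is unverified.
    apiVersion: v1
    kind: Pod
    metadata:
      name: mysql-hostnet-example        # hypothetical standalone pod
    spec:
      hostNetwork: true                  # share the node's network namespace
      dnsPolicy: ClusterFirstWithHostNet # keep cluster DNS working with hostNetwork
      containers:
      - name: mysql
        image: percona:5.7.29-centos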

munjalpatel commented

@ynnt @AMecea I am also running into the same issue. @ynnt, did you ever manage to resolve this?
