ProxySQL marks node as OFFLINE_HARD, adds it to the offline HG, and never recovers it even though the node shows as ONLINE #3865
Comments
Hello, it's been a month now since I opened this issue. Do you need more information? Is this the wrong place for this type of thing?
I am also facing the same issue with ProxySQL version 2.4.1-1-g1ea371d with Group Replication!
We see the same during disaster recovery drills. A node fails the ProxySQL monitor check and is put into OFFLINE_HARD, from which it never recovers. I see no reason for a proxy to never retry a node which we manually inserted into "mysql_servers" (and loaded to runtime, of course). The only relevant entries in the proxysql logs are:
I assume this is the log here: Well, there are a lot more relevant entries...
@renecannao Anything that would explain why an offline server is never ever retried?
@kasabov : what are the steps to reproduce this? |
We're still working out the exact steps to reproduce this. It happens with both ProxySQL 2.4.5 and 2.5.5. This is from another case reproduced today; the only relevant entries in the 'monitor' database are lots of these in 'mysql_server_group_replication_log':
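For anyone who wants to check the same thing, a query along these lines pulls the recent Group Replication monitor checks (table and column names as in the ProxySQL 2.x 'monitor' schema; adjust the LIMIT as needed):

-- run against the ProxySQL admin interface (default port 6032)
SELECT hostname, port, time_start_us, viable_candidate, read_only, error
FROM monitor.mysql_server_group_replication_log
ORDER BY time_start_us DESC
LIMIT 20;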
I'm not saying that the server was never retried; I'm asking why it stops being retried. At some point the server is online again, but I never see it in the "runtime_mysql_servers" table anymore. Am I wrong to assume that it should have an entry there at all times (with whatever status)? The workaround is manually reloading mysql_servers to runtime.
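For anyone else hitting this, the manual workaround amounts to re-pushing the configured servers to runtime from the admin interface; a minimal sketch:

-- ProxySQL admin interface (default 127.0.0.1:6032)
SELECT hostgroup_id, hostname, port, status
FROM runtime_mysql_servers;        -- confirm the server is missing or stuck
LOAD MYSQL SERVERS TO RUNTIME;     -- rebuild runtime state from mysql_servers
SAVE MYSQL SERVERS TO DISK;        -- optional: persist the configuration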
I am having exactly the same issue: all my nodes were put in the offline hostgroup and never came back. After restarting my proxysql pods, everything worked again.
You all still running 2.3.0?
Closing this issue, it is absolutely outdated. If you are facing a similar issue, please follow the "New issue" template and provide all the required detailed information.
Hello,
we have a 3 node percona group replication cluster which we are accessing through proxysql.
One node is writer and reader; two nodes are reader-only.
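A setup like this is normally wired into ProxySQL through a single row in mysql_group_replication_hostgroups; a minimal sketch, with illustrative hostgroup IDs rather than the ones from this report:

-- ProxySQL admin interface; all hostgroup IDs below are placeholders
INSERT INTO mysql_group_replication_hostgroups
  (writer_hostgroup, backup_writer_hostgroup, reader_hostgroup,
   offline_hostgroup, active, max_writers, writer_is_also_reader)
VALUES (1, 2, 3, 4, 1, 1, 1);
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;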
The moment we run FLUSH TABLES against the underlying DB, proxysql marks the node it was executed on as OFFLINE_HARD, moves it into the offline hostgroup and, what makes things worse, never recovers that host.
The host shows as ONLINE and viable_candidate is true, but it remains in the offline HG. The entry marked OFFLINE_HARD in the reader HG eventually gets removed, so just the one in the offline hostgroup remains.
Removing it from mysql_servers and running LOAD TO RUNTIME does not fix the issue; the node gets re-added to runtime with the hostgroup still set to the offline HG. Restarting proxysql or the DB node seems to be the only thing that helps.
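Concretely, the removal attempt described above amounts to something like this (hostname is a placeholder):

-- ProxySQL admin interface; hostname below is a placeholder
DELETE FROM mysql_servers WHERE hostname = '1.2.3.2';
LOAD MYSQL SERVERS TO RUNTIME;
-- the node nonetheless reappears in runtime_mysql_servers, in the offline HG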
runtime_mysql_servers content as reported by proxysql:
+--------------+---------------+------+-----------+--------------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname | port | gtid_port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+---------------+------+-----------+--------------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 30 | proxysql.log | 3307 | 0 | OFFLINE_HARD | 1 | 0 | 1000 | 0 | 1 | 0 | |
| 40 | 1.2.3.1 | 3307 | 0 | ONLINE | 1 | 0 | 1000 | 0 | 1 | 0 | |
| 10 | 1.2.3.3 | 3307 | 0 | ONLINE | 1 | 0 | 1000 | 0 | 1 | 0 | |
| 30 | 1.2.3.3 | 3307 | 0 | ONLINE | 1 | 0 | 1000 | 0 | 1 | 0 | |
| 30 | 1.2.3.2 | 3307 | 0 | ONLINE | 1 | 0 | 1000 | 0 | 1 | 0 | |
+--------------+---------------+------+-----------+--------------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
proxysql version is proxysql-2.3.0-1.x86_64 (rpm obtained here: https://github.com/sysown/proxysql/releases/tag/v2.3.0)
same issue was observed with proxysql 2.2.0
OS is AlmaLinux release 8.5 (Arctic Sphynx)
percona version is 8.0.25-15
We can reliably reproduce this issue by running FLUSH TABLES on any of the 3 DB nodes.
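A minimal reproduction sketch, assuming direct access to one of the DB nodes and to the ProxySQL admin interface:

-- 1. against any of the three Group Replication members (regular MySQL client):
FLUSH TABLES;
-- 2. then, on the ProxySQL admin interface, the affected node appears in the
--    offline hostgroup and never leaves it:
SELECT hostgroup_id, hostname, port, status FROM runtime_mysql_servers;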
proxysql.cnf.gz
proxysql.log.gz