[discuss]: when a node in the etcd cluster fails, no error log is output #3937
Comments
The error log is correct. If it reports that the connection was refused, APISIX failed to get any data from etcd at that moment.
@Yiyiyimu @spacewander If we can get data from another etcd node, do we need to print
If it reports an error at 18:26, it did not get data from another etcd node at that time.
The error log keeps being printed for as long as one node in the etcd cluster (3 nodes) is down, even though APISIX can get data from the other etcd nodes and works correctly during that time.
Interesting. @Firstsawyou
OK, let me investigate.
@spacewander @Firstsawyou
It does. Why do you want to configure a bad node inside APISIX? Starting APISIX in an unhealthy state is not a good idea. Suppose one of your failed nodes has a wrong auth configuration: that could not be detected if we simply skipped the node.
Because we want to be able to start and use APISIX normally in a production environment in case of the
Maybe the
The As its name indicates, it does the init job. We need this operation to ensure the data in etcd is initialized correctly and to avoid unexpected responses. If we skip it for some nodes, there is no way to ensure they are correctly initialized. As for "normally start and use apisix in a production environment in case of the You can use 3 virtual hosts for etcd and ensure they map to healthy nodes. If that is not enough, you can introduce a retry when starting APISIX.
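The retry-on-start idea in the comment above could be sketched roughly as follows. This is only an illustration, not part of APISIX: `etcd_is_healthy` and `wait_for_etcd` are hypothetical helper names, the attempt count and delay are arbitrary, and the probe assumes each etcd member serves its standard `/health` HTTP endpoint on the client URL.

```python
import json
import time
import urllib.request
from typing import Callable, Sequence


def etcd_is_healthy(endpoint: str, timeout: float = 2.0) -> bool:
    """Probe one etcd member's /health endpoint; any failure counts as unhealthy."""
    try:
        with urllib.request.urlopen(f"{endpoint}/health", timeout=timeout) as resp:
            # etcd reports {"health": "true"}; accept a JSON boolean as well.
            return json.load(resp).get("health") in ("true", True)
    except (OSError, ValueError, AttributeError):
        return False


def wait_for_etcd(
    endpoints: Sequence[str],
    probe: Callable[[str], bool] = etcd_is_healthy,
    attempts: int = 5,          # assumed value, tune for your deployment
    delay: float = 1.0,         # assumed base delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once every configured endpoint is healthy, retrying with a growing delay."""
    for attempt in range(1, attempts + 1):
        unhealthy = [e for e in endpoints if not probe(e)]
        if not unhealthy:
            return True
        if attempt < attempts:
            sleep(delay * attempt)  # linear backoff between attempts
    return False
```

A startup wrapper could call `wait_for_etcd` with the same endpoints that APISIX is configured with and run `apisix start` only when it returns `True`, so that a node that is briefly down delays the start instead of failing it.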
Why? etcd is self-replicated.
People might configure the wrong node. Don't be surprised, it has happened before.
OK, got it ...
Thx, got it. We should pay more attention to the HA of etcd.
Issue description
In an etcd cluster (3 nodes), when one of the nodes fails, the following error message is printed in error.log:
But this does not affect the normal operation of APISIX. Such error log entries can mislead people into thinking that etcd is unavailable. Can we avoid writing error-level log entries when a single node in the etcd cluster fails?
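To make the request concrete, here is a minimal sketch of the behavior being asked for: try each etcd endpoint in turn, log a warning for nodes that fail when another node answers, and log an error only when every node fails. It is not how APISIX is implemented; `read_with_failover`, the `fetch` callable, and the logger name are hypothetical and exist only for illustration.

```python
import logging
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")
log = logging.getLogger("etcd_failover")  # hypothetical logger name


def read_with_failover(endpoints: Sequence[str], fetch: Callable[[str], T]) -> T:
    """Return the first successful fetch; escalate to error level only if all endpoints fail."""
    failures = []
    for endpoint in endpoints:
        try:
            value = fetch(endpoint)
        except OSError as exc:  # covers connection refused, timeouts, resets
            failures.append((endpoint, exc))
            continue
        # Another node answered, so earlier failures are degraded state, not an outage.
        for bad, exc in failures:
            log.warning("etcd node %s unreachable (%s); served by %s", bad, exc, endpoint)
        return value
    for bad, exc in failures:
        log.error("etcd node %s unreachable (%s)", bad, exc)
    raise ConnectionError("all etcd endpoints failed")
```

The trade-off raised in the comments still applies: downgrading a per-node failure to a warning makes a permanently misconfigured node easier to overlook.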