Fixed rabbitmq cluster_status parsing when node list takes multiple lines. #290
I was wondering why, on some nodes of my rabbitmq cluster, chef would try to reconfigure the cluster_node_type at every chef-run, breaking all existing connections to the broker.
The actual reason was a subtle issue in parsing the output of rabbitmqctl cluster_status when the node list spans multiple lines. Read on.

How to reproduce
Have a cluster with enough nodes that the list of nodes or running_nodes in the rabbitmqctl cluster_status output takes more than one line, like this:

In the above example, the 'rabbit@staging3-failover2' entry is listed on a different line than the other two nodes.

The following is the parsed version of the above cluster_status, taken from the chef debug log:
Note the extra space left of 'rabbit@staging3-failover2', after the comma.

The same chef debug log displays the following list of disc nodes:
Note the space in the " rabbit@staging3-failover2" string; this extra space prevents the rabbit@staging3-failover2 node from being detected as a disc node (see below for the current implementation of the current_cluster_node_type method from the cluster provider).

With a node_name value of "rabbit@staging3-failover2" and the above parsed list of disc nodes (containing an invalid " rabbit@staging3-failover2" entry with a leading space), the node_name will never be identified as a disc node, and the var_cluster_node_type returned will be an empty string. The consequence is that the node type is reconfigured through change_cluster_node_type at every chef-run, resulting in gratuitous restarts of rabbitmq connections.

The fix
This PR fixes the issue in the cluster provider by making the cluster_status method ignore spaces that follow a newline when parsing the rabbitmqctl cluster_status output.

No automated test, but I believe that the PR should be quite safe.
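
For illustration, here is a minimal Ruby sketch of the problem and of the idea behind the fix. It is not the cookbook's actual code: the sample string, the node names other than rabbit@staging3-failover2, and the helper names (disc_nodes_naive, disc_nodes_fixed, node_type) are all hypothetical, and the sample only imitates a cluster_status node list that wraps onto a second line.

```ruby
# Hypothetical sample: a disc node list that wraps onto a second line.
# The indentation after the newline is what causes the trouble.
RAW_STATUS = "[{nodes,[{disc,['rabbit@node1','rabbit@node2',\n" \
             "                'rabbit@staging3-failover2']}]}]"

# Pull the comma-separated disc node list out of a flattened status string.
def extract_disc_nodes(flattened)
  list = flattened[/\{disc,\[(.*?)\]\}/, 1].to_s
  list.split(',').map { |name| name.delete("'") }
end

# Naive flattening: drops the newline but keeps the indentation that follows it,
# so the wrapped entry keeps its leading spaces.
def disc_nodes_naive(raw)
  extract_disc_nodes(raw.delete("\n"))
end

# Fixed flattening: drops the newline together with any whitespace after it.
def disc_nodes_fixed(raw)
  extract_disc_nodes(raw.gsub(/\n\s*/, ''))
end

# Simplified stand-in for the node type lookup: an empty string means
# "not recognised", which is what triggers the reconfiguration.
def node_type(node_name, disc_nodes)
  disc_nodes.include?(node_name) ? 'disc' : ''
end

node_name = 'rabbit@staging3-failover2'
puts node_type(node_name, disc_nodes_naive(RAW_STATUS)).inspect
puts node_type(node_name, disc_nodes_fixed(RAW_STATUS)).inspect
```

With the naive flattening the wrapped entry keeps its leading spaces, so the lookup returns an empty string; once whitespace following the newline is ignored, the same node is recognised as a disc node.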