Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed rabbitmq cluster_status parsing when node list takes multiple lines. #290

Merged

Conversation

jperville
Copy link
Contributor

I was wondering why on some nodes of my rabbitmq cluster, chef would try to reconfigure the cluster_node_type at every chef-run, resulting in breaking all existing connections to the broker.

The actual reason was a subtle issue while parsing the output of rabbitmqctl cluster_status when the node list fits on multiple lines. Read on.

How to reproduce

Have a cluster with enough nodes that the list of nodes or running_nodes in the rabbitmqctl cluster_status output takes more than one line, like this:

root@staging3-failover2:~# rabbitmqctl cluster_status
Cluster status of node 'rabbit@staging3-failover2' ...
[{nodes,[{disc,['rabbit@staging3-backends','rabbit@staging3-failover1',
                'rabbit@staging3-failover2']}]},
 {running_nodes,['rabbit@staging3-backends','rabbit@staging3-failover1',
                 'rabbit@staging3-failover2']},
 {cluster_name,<<"staging3">>},
 {partitions,[]}]

In the above example, the 'rabbit@staging3-failover2' entry is listed on a different line than the other two nodes.

The following is the parsed version of the above cluster_status, taken from the chef debug log:

[2015-07-27T09:32:41+00:00] DEBUG: [rabbitmq_cluster] rabbitmqctl cluster_status : [{nodes,[{disc,['rabbit@staging3-backends','rabbit@staging3-failover1', 'rabbit@staging3-failover2']}]}, {running_nodes,['rabbit@staging3-backends','rabbit@staging3-failover1', 'rabbit@staging3-failover2']}, {cluster_name,<<"staging3">>}, {partitions,[]}]

Note the extra space left of 'rabbit@staging3-failover2', after the comma.

The same chef debug log displays the following list of disc nodes:

[2015-07-27T09:32:41+00:00] DEBUG: [rabbitmq_cluster] disc_nodes : ["rabbit@staging3-backends", "rabbit@staging3-failover1", " rabbit@staging3-failover2"]

Note the space in the " rabbit@staging3-failover2" string ; this extra space prevents the rabbit@staging3-failover2 node to be detected as a disc node (see below for the current implementation of the current_cluster_node_type method from the cluster provider).

# Get cluster_node_type of current node
def current_cluster_node_type(node_name, cluster_status)
  var_cluster_node_type = ''
  if disc_nodes(cluster_status).include?(node_name) # <--- the extra space breaks this test!
    var_cluster_node_type = 'disc'
  elsif ram_nodes(cluster_status).include?(node_name)
    var_cluster_node_type = 'ram'
  end
  Chef::Log.debug("[rabbitmq_cluster] current cluster node type : #{var_cluster_node_type}")
  var_cluster_node_type
end

With a node_name value of "rabbit@staging3-failover2" and the above parsed list of disc nodes (containing an invalid " rabbit@staging3-failover2" entry with leading space), the node_name will never be identified as being a disc node and the var_cluster_node_type returned will be an empty string. The consequence is that the change_cluster_node_type will be reconfigured at every chef-run, resulting in gratuitous restart of rabbitmq connection.

The fix

This PR fixes the issue in the cluster provider by making the cluster_status method ignore spaces that follow a newline when parsing the rabbitmqctl cluster_status output.

No automated test but I believe that the PR should be quite safe.

@kbogtob
Copy link

kbogtob commented Jul 29, 2015

Seems okay for me. +1

jjasghar pushed a commit that referenced this pull request Aug 14, 2015
…status

Fixed rabbitmq cluster_status parsing when node list takes multiple lines.
@jjasghar jjasghar merged commit c236683 into rabbitmq:master Aug 14, 2015
@jperville jperville deleted the fix-parsing-multiline-cluster-status branch August 17, 2015 08:56
@jperville jperville restored the fix-parsing-multiline-cluster-status branch August 17, 2015 08:56
@jperville
Copy link
Contributor Author

Thank you for the merge @jjasghar. Waiting for new official release impatiently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants