Support --all or specific node to manage cluster and nodes #797
Conversation
Thanks for the nice work, @liangxin1300!
# enable_quorum_fencing=0, enable_quorum_lockspace=0 for dlm config option
if utils.is_dlm_configured(node_list[0]) and not utils.is_quorate(node_list[0]):
    logger.debug("Quorum is lost; Set enable_quorum_fencing=0 and enable_quorum_lockspace=0 for dlm")
    utils.set_dlm_option(peer=node_list[0], enable_quorum_fencing=0, enable_quorum_lockspace=0)
IIUC, both enable_quorum_fencing and enable_quorum_lockspace need to be disabled for an inquorate dlm to gracefully stop? Could there be any risks?
Yes, both.
In my opinion there isn't any risk, because this action is only triggered by the command "crm cluster stop".
Before this patch, the cluster (via dlm_controld) refused to do any further action until quorum was regained.
After this patch, the cluster will simply stop; there is no other behavior change.
The biggest concern is that, given this cluster partition is inquorate, there might be a quorate partition still standing... For example, this partition has been split off and is inquorate, but somehow hasn't been fenced yet. If we are shutting down this cluster partition, and these settings allow the inquorate partition to acquire access to the lockspace and corrupt data, that would be a disaster. We should never sacrifice data integrity, even for a graceful shutdown...
enable_quorum_lockspace is disabled here so that dlm lockspace-related operations can keep going when cluster quorum is lost.
your comment makes sense to me.
Thinking about it on the more safe/paranoid side: even if the user has confirmed to proceed, allowing this simultaneously on multiple nodes is more like opening Pandora's box... Right after that, during stop, this cluster partition might split apart into even more partitions.
If we go for it, we could ask the user once, but we'd better run this specific procedure in a serialized way:
set_dlm_option -> stop dlm/pacemaker
on only one node at a time, proceeding to the next node only after it has succeeded on the current one.
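As an illustration only, a minimal sketch of that serialized flow, reusing the helper names from the diff above; the import path and the stop_service signature with remote_addr are assumptions, not necessarily crmsh's real API:

```python
from crmsh import utils  # assumed import path, helper names as in the diff above

def serialized_stop(node_list):
    """Illustrative sketch: stop one node at a time, relaxing the dlm
    quorum options on that node just before its services are stopped."""
    for node in node_list:
        if utils.is_dlm_configured(node) and not utils.is_quorate(node):
            utils.set_dlm_option(peer=node,
                                 enable_quorum_fencing=0,
                                 enable_quorum_lockspace=0)
        # Only move on to the next node once this stop has succeeded;
        # an exception here aborts the whole loop.
        utils.stop_service("pacemaker.service", remote_addr=node)
```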
My previous point concerns the last standing single node situation. Given that the node is already inquorate, the proposed behavior for crm cluster stop (i.e. crmsh -> set_dlm_option -> stop dlm/pacemaker) is just a "fencing" operation to protect data integrity, though not necessarily STONITH at the node level with a reboot.
Well, for the situation with multiple inquorate nodes, i.e. multiple partitions, the code here does have a problem for the '--all' case, simply because set_dlm_option only applies to one node. Not sure whether it is simple enough to address in this PR, or whether to open an issue and clarify it in another PR.
set_dlm_option is implemented with "dlm_tool set_config", which only runs on a single node at a time.
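For illustration, a helper along those lines could look like the sketch below; the ssh transport and the exact shape of the command string are assumptions, not the actual crmsh implementation:

```python
import subprocess

def set_dlm_option(peer=None, **options):
    """Sketch: push dlm runtime options via "dlm_tool set_config",
    which only affects the node it runs on."""
    settings = " ".join("{}={}".format(k, v) for k, v in options.items())
    cmd = 'dlm_tool set_config "{}"'.format(settings)
    if peer:
        # Run on the given peer node; locally otherwise.
        cmd = "ssh {} '{}'".format(peer, cmd)
    subprocess.run(cmd, shell=True, check=True)

# Example: relax quorum checks on one node before stopping it.
# set_dlm_option(peer="node1", enable_quorum_fencing=0, enable_quorum_lockspace=0)
```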
set_dlm_option is implemented with "dlm_tool set_config", which only runs on a single node at a time.

Fine.
Given the above, when all nodes are inquorate, what are the suggested internal graceful-shutdown steps for "stop --all"? My reading of the code is that it only changes the current local node and does not repeat the same on the other nodes.
The situation is even more fun: in theory, in a big cluster some nodes may be quorate and some not. Agreed, it is a transient corner case. What are the suggested internal steps for "stop --all"? @zhaohem
Maybe crmsh should never operate on a cluster in a transient state at all by default, and instead ask the user to answer Y/n, or require --force?
for node in node_list[:]:
    if utils.service_is_active("pacemaker.service", remote_addr=node):
        logger.info("Cluster services already started on {}".format(node))
        node_list.remove(node)
pacemaker.service being active doesn't necessarily mean corosync-qdevice.service is active as well, right? Should corosync-qdevice be checked for the full set of nodes as well?
When using bootstrap to set up a cluster with qdevice, the qdevice service will be enabled, so after a reboot corosync-qdevice will be started.
@zzhou1 What do you think?
"node_list.remove(node)" can be wrong in the normal stop/start use case, e.g. when pacemaker.service is up but qdevice.service is not at that moment. Not necessarily so in the reboot case.
Improve this in #898
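A hedged sketch of that idea (illustrative only, not the actual change in #898): only skip a node when every service that is expected to run there is in fact active. Import path and signatures are assumed from the diff above.

```python
import logging

from crmsh import utils  # assumed import path, names as in the diff above

logger = logging.getLogger(__name__)

def drop_already_started(node_list):
    """Remove from node_list only those nodes where both pacemaker and,
    if configured, corosync-qdevice are already active."""
    for node in node_list[:]:
        pacemaker_up = utils.service_is_active("pacemaker.service", remote_addr=node)
        qdevice_up = (not utils.is_qdevice_configured()
                      or utils.service_is_active("corosync-qdevice.service", remote_addr=node))
        if pacemaker_up and qdevice_up:
            logger.info("Cluster services already started on %s", node)
            node_list.remove(node)
```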
bootstrap.start_pacemaker(node_list)
if utils.is_qdevice_configured():
    utils.start_service("corosync-qdevice", node_list=node_list)
Although there are no dependencies between corosync-qdevice and pacemaker, it'd be better to start corosync-qdevice before pacemaker.
bootstrap.start_pacemaker here starts pacemaker.service; according to the unit dependencies, corosync.service will be started first, so starting qdevice afterwards makes sense. Otherwise, starting qdevice without corosync running would fail :)
Given that corosync-qdevice.service has these defined as well:
Wants=corosync.service
After=corosync.service
I'd expect a plain start of corosync-qdevice.service to resolve the dependency as well.
I see the point of moving the corosync-qdevice.service start before pacemaker. There is no harm in having multiple systemctl start calls, in theory.
Done
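For reference, a sketch of the reordered start path as discussed; the import paths and function signatures are assumed from the diff above, not verified against the final commit:

```python
from crmsh import bootstrap, utils  # assumed import paths

def start_cluster_services(node_list):
    """Start corosync-qdevice before pacemaker. Because the qdevice unit
    declares Wants=corosync.service and After=corosync.service, starting
    it first also brings corosync up; pacemaker is then started once the
    qdevice vote can already be provided."""
    if utils.is_qdevice_configured():
        utils.start_service("corosync-qdevice", node_list=node_list)
    bootstrap.start_pacemaker(node_list)
```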
if not node_list:
    return

bootstrap.wait_for_cluster("Waiting for {} online".format(' '.join(node_list)), node_list)
Is there a special purpose in inserting this into the stop procedure and waiting for the listed nodes to be online?
Yes.
- Just after crm cluster start --all, the status of all nodes is UNCLEAN. If crm cluster stop --all is executed at that moment, I found the stop process hangs, and on some nodes the output of crm_mon shows all nodes' status as pending.
- If all nodes are already Online, the wait_for_cluster function checks that first, and the "Waiting for .... online" line stays quiet (see the sketch below).
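As an illustration of that check-first behaviour, a rough sketch; check_online is a hypothetical callable, and the polling loop is an assumption rather than crmsh's real wait_for_cluster implementation:

```python
import time

def wait_until_online(node_list, check_online, timeout=60):
    """Return immediately and silently if every node is already online;
    otherwise print the waiting message and poll until they are, or
    until the timeout expires."""
    if all(check_online(node) for node in node_list):
        return
    print("Waiting for {} online".format(" ".join(node_list)))
    deadline = time.time() + timeout
    while time.time() < deadline:
        if all(check_online(node) for node in node_list):
            return
        time.sleep(2)
    raise TimeoutError("Timed out waiting for {}".format(" ".join(node_list)))
```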
Good point.
I can understand the wait experience in bootstrap. However, this makes me think it's probably not necessary to force the user to wait in crm cluster start, and the same goes for crm cluster stop. I would rather remove that kind of wait from both of them.
It makes sense to let crm cluster stop abort directly if the criteria are not met. That is reasonable error handling, rather than having the sysadmin actually wait in front of the screen.
For those scripts that do want to wait, before we implement a '--wait' option, the following example steps could help:
crm cluster start
crm wait_for_startup
crm cluster stop
Well, my idea is debatable as a different flavor of user experience. It is not a critical one, I think.
action = context.get_command_name()
utils.cluster_run_cmd("systemctl {} pacemaker.service".format(action), node_list)
if utils.is_qdevice_configured():
    utils.cluster_run_cmd("systemctl {} corosync-qdevice.service".format(action), node_list)
Technically, maybe it should be possible to disable corosync-qdevice.service even if qdevice is not configured?
corosync-qdevice.service will not be started if it is not configured, right? So I think checking whether it is configured before doing the action does no harm.
I guess it depends on how is_qdevice_configured() does the check. Of course one could remove qdevice configuration from corosync.conf before stopping the running corosync-qdevice.service...
Good point, I think I would naturally expect crm cluster stop to stop corosync-qdevice.service even if corosync.conf has no qdevice.
PR to improve this: #895
It checks whether corosync-qdevice.service is available, not whether it is configured.
Stopping qdevice shouldn't have this issue, since I check whether the service is active and only then stop it.
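A minimal sketch of that stop-side behaviour, assuming a per-node stop_service helper; the import path and the remote_addr signature are assumptions rather than the actual code in #895:

```python
from crmsh import utils  # assumed import path

def stop_qdevice_if_running(node_list):
    """Stop corosync-qdevice based on whether the service is actually
    active on each node, not on whether corosync.conf still carries a
    qdevice section."""
    for node in node_list:
        if utils.service_is_active("corosync-qdevice.service", remote_addr=node):
            utils.stop_service("corosync-qdevice.service", remote_addr=node)
```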
node_list = parse_option_for_nodes(context, *args)
for node in node_list[:]:
    if utils.is_standby(node):
        logger.info("Node %s already standby", node)
What if the node is in standby but with a different lifetime from the currently specified?
For both the "reboot" and "forever" lifetimes, the status from crm_mon will be standby, and using crm node online will revert the status in both cases.
I mean, for example if an user intends to put a node into "forever standby" which is already in "reboot standby"...
I mean, for example if an user intends to put a node into "forever standby" which is already in "reboot standby"...

Then this action will be rejected with "already standby" until the node is brought online first.
Do we need to support this change, i.e. changing the lifetime from "reboot" to "forever"?
The current code in production supports that, but in my view it might cause confusion/conflicts, like:
- crm node standby reboot
- crm node standby
- crm node online
Then from crm_mon we will see this node still in standby status.
To make these actions clearer, I think we should reject the above action and say "already standby".
I wouldn't check if it's already in any kind of standby :-). I'd just set what the user wants, as long as the "online" command can bring it out of any cases/combinations of standby.
cib_str = xmlutil.xml_tostring(cib)
for node in node_list:
    node_id = utils.get_nodeid_from_name(node)
    cib_str = re.sub(constants.STANDBY_NV_RE.format(node_id=node_id, value="on"), r'\1value="off"\2', cib_str)
Similarly, a more reliable way would be to do XPath queries on the CIB and modify/delete the matching elements.
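For illustration, a sketch of that XPath-based approach with lxml; the element layout assumed here is the usual CIB structure (nodes/node/instance_attributes/nvpair), not something taken from this PR:

```python
from lxml import etree

def clear_standby(cib, node_id):
    """Flip the standby nvpair for one node directly in the CIB element
    tree instead of rewriting the serialized XML with a regex."""
    xpath = '//nodes/node[@id="{}"]//nvpair[@name="standby"]'.format(node_id)
    for nvpair in cib.xpath(xpath):
        nvpair.set("value", "off")

# Usage sketch: cib is an lxml element of the CIB, e.g.
# cib = etree.fromstring(cib_xml_string)
# clear_standby(cib, "1")
```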
…ific node
* crm cluster start/stop/restart [--all | <node>... ]
* crm cluster enable/disable [--all | <node>... ]
* crm node standby/online [--all | <node>... ]
When dlm is configured and quorum is lost, set the enable_quorum_fencing=0 and enable_quorum_lockspace=0 options before stopping cluster services
…ps/step_implementation.py
Motivation
Changes
- to avoid dlm hanging
- To avoid a race condition with the --all option, merge all standby/online values into one CIB replace session