
Tuesday, December 30, 2014

HP-UX 11i Availability and Clustering - What are the Differences Between cmruncl, cmruncl -n, and cmrunnode Commands?




Information

HP-UX 11i, ServiceGuard, all versions.
There is often confusion about the commands that can start the cluster and how to use them: cmruncl, cmruncl with the -n option, and cmrunnode.
What are the differences between them, when is each one used, and what are the different results of using each of these commands?

Details

Both cmruncl and cmrunnode start the cluster daemon on one or more nodes. Which command to use depends on the current status of the nodes, and the end result depends on how each command is used and on how many nodes it is run on. If the total number of nodes defined in the cluster is M, then the following holds true:
A. # cmrunnode -v
This command sends a request across the network and keeps trying for 10 minutes (the default value of this configurable timeout) to locate the cluster to which the node belongs and to determine its status by analyzing signals from the other member nodes:
·         If the cluster is found to be running on one or more of the other configured nodes, the request is interpreted as a request for permission to join the cluster; if all is OK, the node is given permission and joins the cluster.
·         If ALL nodes in the cluster are executing the cmrunnode command at the same time, it is concluded that the cluster must be down on all nodes, which means the cluster is not running anywhere (otherwise it would have to be up on at least one member node, and that node could not be executing cmrunnode to join a cluster). In that case, since all nodes are attempting to "join the cluster," a cluster is formed from scratch on all nodes.
·         If N nodes, where 0 < N < M, are found to be executing the cmrunnode command at a given point in time, then unless the cluster is already running on a set of K other nodes, where 0 < K <= M - N, each of the N nodes will give up 10 minutes after it started its cmrunnode. Another way to see this is that a cluster cannot be started from scratch (that is, when the cluster is not running on any subset of its M member nodes) by running the cmrunnode command on N nodes where N < M. The only way the cluster can be started from scratch using the cmrunnode command is, as described in the previous point, when the cluster is down everywhere and all M nodes execute cmrunnode at the same time.
This design is intended to safeguard against "split-brain syndrome," where the same cluster runs simultaneously on two subsets of the M nodes. In that situation the same packages are started on both subsets, or "both clusters," so more than one node can be writing to the same file system at the same time, leading to data corruption.
Since the number one priority in high-availability system design is data integrity, this behavior is in line with that philosophy. Take for example an 8-node cluster that is running, where 7 of the nodes are rebooted around the same time and reach the ServiceGuard startup script (which is enabled to execute cmrunnode automatically) within 10 minutes of each other.
The reboot of the 7 nodes causes the cluster to re-form on the 8th node, where it continues to run, and all (or some) of the packages fail over to that node. If the 7 nodes can reach the 8th node over the network, they will rejoin the cluster one by one and all is well. However, suppose that right after the 7 nodes reboot, and before they reach the startup script, node8 becomes isolated from the network. This may cause some packages to go down (if they are monitoring the network), but it is possible that some packages do not monitor any subnet (for example, because they do not depend on one). In this case, when the 7 nodes execute cmrunnode, none of them can find node8, so no joining takes place. The 7 nodes only find each other, and since they cannot find node8 they cannot form a cluster among themselves, according to the rules above. This is exactly the desired behavior: if they were allowed to give up on node8 after some period of time and form a cluster among themselves, the result would be a "split brain," with the cluster running on node8 and, separately, on the other 7 nodes. Packages that do not depend on the network would then run both on node8 and on one of nodes 1 through 7, and any writes to file systems mounted simultaneously on both nodes running the same package would corrupt the data.
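As a hedged illustration of the normal join case described above (exact output and status fields vary by ServiceGuard version), the sequence on a node that is currently down would look like this:
# cmviewcl -v        (confirm the cluster is already up on one or more other member nodes)
# cmrunnode -v       (request permission to join; the node becomes a cluster member if all is OK)
# cmviewcl -v        (the local node should now be reported as a running cluster member)
The 10-minute window mentioned above is the cluster's configurable auto-start timeout (typically the AUTO_START_TIMEOUT parameter in the cluster configuration file; verify the parameter name against the documentation for your ServiceGuard version).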
B. # cmruncl
This command can be executed on any of the M nodes in the cluster. Once executed on one of them, it goes out on the network and attempts to start the cluster daemon on each of the M nodes so as to form a cluster on all of them. If the cluster fails to start on even one of the M nodes, for any of the many possible reasons (for example, the cluster is already running on one or more of the nodes, or a node is not reachable on the network), it fails to start at all.
This command has one purpose only: to start the cluster from scratch on all M nodes. It either succeeds or fails; there is no middle ground.
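A minimal sketch of this case, assuming the cluster is down on all M nodes (output varies by version):
# cmviewcl           (should report the cluster as down before proceeding)
# cmruncl -v         (attempts to start the cluster daemon on all M configured nodes; if any node cannot be started, the whole startup fails)
# cmviewcl -v        (confirms that all M nodes are now running in the cluster)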
C. # cmruncl -n <node_1_name> -n <node_2_name> ... -n <node_N_name>
where 0 < N < M
This command is used when the cluster is down on all M nodes and, for some reason, the cluster should be started on only a subset of N of the M nodes. The command prompts with a confirmation question to make sure the user is aware of what they are doing. If answered affirmatively, the action is taken; if not, the command is cancelled.
If it happens that the cluster is already running on another subset of K of the M nodes, where 0 < K <= M - N and no member of K is a member of N, and the user answers the question affirmatively, the action is still taken. This results in "split-brain syndrome," with two instances of the cluster running simultaneously on the subset N and the subset K. Packages are then started on both instances, which can lead to data corruption.
The user is strongly cautioned to ensure that the cluster is NOT running on any such subset K before answering the question affirmatively, to avoid such problems.
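A hedged example of starting the cluster on only two nodes (the names node1 and node2 are illustrative):
# cmviewcl           (run this check from every node, or otherwise verify that the cluster is not running on any other subset)
# cmruncl -v -n node1 -n node2
(answer the confirmation question affirmatively only after verifying that the cluster is down on all other nodes)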

 
