STONITH is short for "Shoot The Other Node In The Head". It's a technique for fencing in computer clusters.
Fencing is the isolation of a failed node so that it does not disrupt the rest of the cluster. As its name suggests, STONITH fences a failed node by resetting it or powering it down.
Uncoordinated access by several nodes can have catastrophic results, for example when both nodes try to write to a shared storage resource at the same time. STONITH provides effective, if rather drastic, protection against these problems.
If you try to commit the crm configuration, or configure the cluster directly, while STONITH is not configured, you will get errors about STONITH.
You can also check for these errors before committing, with the following command:
crm_verify -L -V
Work log:
root@candy:~# crm_verify -L -V
crm_verify[2963]: 2014/11/27_17:55:42 ERROR: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
crm_verify[2963]: 2014/11/27_17:55:42 ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
crm_verify[2963]: 2014/11/27_17:55:42 ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
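The check above can be scripted, for example so that a setup script stops while STONITH is still unconfigured. The following is a minimal sketch and not part of the original procedure: the function name stonith_errors is hypothetical, and it only matches the message text shown in the work log above, which may differ between Pacemaker versions.

```shell
# Hypothetical helper: read crm_verify output on stdin and print only the
# ERROR lines that mention STONITH. Returns non-zero when none are found.
stonith_errors() {
    grep 'ERROR:' | grep 'STONITH'
}

# Sample input copied from the work log, so the sketch runs without a cluster.
sample='crm_verify[2963]: 2014/11/27_17:55:42 ERROR: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
Errors found during check: config not valid'

if printf '%s\n' "$sample" | stonith_errors >/dev/null; then
    echo "STONITH is not configured yet"
fi

# On a live node the same filter could be fed from the real command:
#   crm_verify -L -V 2>&1 | stonith_errors
```

The helper takes its input on stdin so that it can be tried against saved output before it is pointed at a running cluster.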
The errors mean the following: in order to guarantee the safety of your data [8], STONITH [9] is enabled by default in Pacemaker. However, Pacemaker also detects when no STONITH configuration has been supplied and reports this as a problem, since the cluster would not be able to make progress if a situation requiring node fencing arose.
You have two options here. I recommend configuring STONITH rather than disabling it, but the choice is yours. Both options are presented below.
To disable STONITH, we set the stonith-enabled cluster option to false:
crm configure property stonith-enabled=false
crm_verify -L -V
Work log:
root@candy:~# crm configure property stonith-enabled=false
root@candy:~# crm_verify -L -V
root@candy:~# <- NO OUTPUT IS OK
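To configure STONITH (recommended), we do the following. Running the commands on one node is enough, because the configuration is replicated to the other node. This is why repeating the primitive and clone commands on eave in the work log below fails with "id is already in use".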
crm configure property stonith-enabled=true
crm configure property stonith-action=poweroff
crm configure rsc_defaults resource-stickiness=100
crm configure property no-quorum-policy=ignore
crm configure primitive stonith_rg stonith:external/ssh params hostlist="eave candy"
crm configure clone fencing_rg stonith_rg
crm_mon -1
Work log:
root@candy:~# crm configure property stonith-enabled=true
root@candy:~# crm configure property stonith-action=poweroff
root@candy:~# crm configure rsc_defaults resource-stickiness=100
root@candy:~# crm configure property no-quorum-policy=ignore
root@candy:~# crm configure primitive stonith_rg stonith:external/ssh params hostlist="eave candy"
ERROR: stonith_rg: parameter candy does not exist
Do you still want to commit? yes
root@candy:~# crm configure clone fencing_rg stonith_rg
root@candy:~# crm_mon -1
============
Last updated: Thu Nov 27 18:20:54 2014
Last change: Thu Nov 27 18:20:51 2014 via cibadmin on candy
Stack: Heartbeat
Current DC: eave (ddfe44df-4fc3-4cbf-afc1-d4846465a920) - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, unknown expected votes
2 Resources configured.
============
Node eave (ddfe44df-4fc3-4cbf-afc1-d4846465a920): pending
Node candy (ea078cb1-708c-4ae4-b7b7-5d4322422b3e): pending

root@eave:~# crm configure property stonith-enabled=true
root@eave:~# crm configure property stonith-action=poweroff
root@eave:~# crm configure rsc_defaults resource-stickiness=100
root@eave:~# crm configure property no-quorum-policy=ignore
root@eave:~# crm configure primitive stonith_rg stonith:external/ssh params hostlist="eave candy"
ERROR: stonith_rg: id is already in use
root@eave:~# crm configure clone fencing_rg stonith_rg
ERROR: fencing_rg: id is already in use
root@eave:~# crm_mon -1
============
Last updated: Thu Nov 27 18:23:24 2014
Last change: Thu Nov 27 18:20:51 2014 via cibadmin on candy
Stack: Heartbeat
Current DC: eave (ddfe44df-4fc3-4cbf-afc1-d4846465a920) - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, unknown expected votes
2 Resources configured.
============
Online: [ eave candy ]
 Clone Set: fencing_rg [stonith_rg]
     Started: [ candy eave ]
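The "parameter candy does not exist" error in the log above is consistent with the shell removing the double quotes before crm sees them, so that candy arrives as a separate word. This is an interpretation, not something the log itself states. The sketch below uses a hypothetical show_args helper to show what a command receives with each quoting style. Whether wrapping the parameter in single quotes is enough for your version of the crm shell is an assumption you should verify.

```shell
# Hypothetical helper: print each argument it receives in brackets, one per line.
show_args() {
    printf '[%s]\n' "$@"
}

# As typed in the work log: the shell strips the double quotes, so the
# command receives the argument hostlist=eave candy without any quotes.
show_args params hostlist="eave candy"

# Wrapped in single quotes: the inner double quotes reach the command intact.
show_args params 'hostlist="eave candy"'
```

After committing, it is worth checking with crm configure show that hostlist contains both node names.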
Then restart the cluster on both nodes and check the status:
service heartbeat restart
crm_mon --one-shot
Work log:
root@candy:~# service heartbeat restart
Stopping High-Availability services: Done.
Starting High-Availability services: Done.
root@candy:~# crm_mon --one-shot
============
Last updated: Thu Nov 27 16:28:29 2014
Last change: Thu Nov 27 12:40:45 2014 via crmd on eave
Stack: Heartbeat
Current DC: eave (ddfe44df-4fc3-4cbf-afc1-d4846465a920) - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, unknown expected votes
0 Resources configured.
============
Online: [ eave candy ]

root@eave:~# pico /etc/heartbeat/haresources
root@eave:~# service heartbeat restart
Stopping High-Availability services: Done.
Starting High-Availability services: Done.
root@eave:~#
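The status check can also be automated, for example to confirm that both nodes are back online after a restart. This is a rough sketch under assumptions: online_nodes is a hypothetical helper, and it relies on the "Online: [ ... ]" line format shown in the crm_mon output above, which may change between Pacemaker versions.

```shell
# Hypothetical helper: read crm_mon one-shot output on stdin and print the
# node names listed on the "Online: [ ... ]" line, one per line.
online_nodes() {
    sed -n 's/^Online: \[ *\(.*[^ ]\) *\]$/\1/p' | tr ' ' '\n'
}

# Sample input copied from the work log, so the sketch runs without a cluster.
sample='2 Nodes configured, unknown expected votes
0 Resources configured.
============
Online: [ eave candy ]'

printf '%s\n' "$sample" | online_nodes

# On a live node:
#   crm_mon --one-shot | online_nodes
```

While the nodes are still shown as pending there is no Online line, so the helper prints nothing.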