The clusterXL_monitor_process script monitors the existence of given processes and causes a cluster fail-over if any of these processes dies. For this script to work, you must write the names of the monitored processes in the $FWDIR/conf/cpha_proc_list file, one name per line (comments and spaces are not allowed in this file).
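Because the file format is strict, it can help to check a candidate list before you copy it into place. The following is a minimal sketch, not part of the Check Point tooling: the function name validate_proc_list is hypothetical, the process names fwd and cpd are only sample entries, and the 15-character limit is taken from the comment inside the script shown in the example further down.

```shell
#!/bin/sh
# Hypothetical helper (not shipped by Check Point): checks a candidate
# process list for the restrictions described above.
validate_proc_list() {
    file=$1
    rc=0
    tab=$(printf '\t')
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            \#*)
                echo "comment not allowed: $line"; rc=1 ;;
            *" "*|*"$tab"*)
                echo "whitespace not allowed: $line"; rc=1 ;;
        esac
        # The script's own comment limits pnote names to 15 characters.
        if [ "${#line}" -gt 15 ]; then
            echo "name longer than 15 characters: $line"; rc=1
        fi
    done < "$file"
    return $rc
}

# Demo on a temporary file; fwd and cpd are sample names only.
tmp=$(mktemp)
printf '%s\n' fwd cpd > "$tmp"
validate_proc_list "$tmp" && echo "list looks valid"
rm -f "$tmp"
```

If the check passes, copy the file to $FWDIR/conf/cpha_proc_list on each cluster member.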
This shell script registers pnotes (named after the processes you specified in the $FWDIR/conf/cpha_proc_list file) and gracefully changes the state of the given cluster member to Down (by reporting the state of that pnote as problem), or gracefully reverts the state of the given cluster member to Up (by reporting the state of that pnote as ok).
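The register and report steps can be sketched as a dry run. The cphaprob arguments are copied from the script in the example further down. The CPHAPROB variable, the two wrapper functions, and the sample name fwd are assumptions of this sketch. By default the commands are only printed, not executed.

```shell
#!/bin/sh
# CPHAPROB is an assumption of this sketch, not a Check Point variable.
# Default is a dry run that only prints the command line.
# On a cluster member you could set CPHAPROB=$FWDIR/bin/cphaprob instead.
CPHAPROB=${CPHAPROB:-"echo cphaprob"}

# Register a pnote named after the process, initially in state "ok".
pnote_register() {
    $CPHAPROB -d "$1" -t 0 -s ok -p register
}

# Report the pnote state: "problem" (member goes Down) or "ok" (member reverts to Up).
pnote_report() {
    $CPHAPROB -d "$1" -s "$2" report
}

pnote_register fwd
pnote_report fwd problem
pnote_report fwd ok
```

Keeping the binary behind one variable lets you read the exact command lines before running them against a live cluster member.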
The clusterXL_monitor_process script is located in the $FWDIR/bin/ directory on your cluster members.
For more information, see sk92904.
Important - You must do these changes on all cluster members.
Example:
#!/bin/sh
# This script monitors the existence of processes in the system.
# The process names should be written in the
# $FWDIR/conf/cpha_proc_list file one every line.
# USAGE :
# cpha_monitor_process X silent
# where X is the number of seconds between process probings.
# if silent is set to 1, no messages will appear on the console.
# We initially register a pnote for each of the monitored processes
# (process name must be up to 15 characters) in the problem notification mechanism.
# when we detect that a process is missing we report the pnote to be in "problem" state.
# when the process is up again - we report the pnote is OK.

if [ "$2" -le 1 ]
then
    silent=$2
else
    silent=0
fi

if [ -f $FWDIR/conf/cpha_proc_list ]
then
    procfile=$FWDIR/conf/cpha_proc_list
else
    echo "No process file in $FWDIR/conf/cpha_proc_list "
    exit 0
fi

arch=`uname -s`

for process in `cat $procfile`
do
    $FWDIR/bin/cphaprob -d $process -t 0 -s ok -p register > /dev/null 2>&1
done

while [ 1 ]
do
    result=1
    for process in `cat $procfile`
    do
        ps -ef | grep $process | grep -v grep > /dev/null 2>&1
        status=$?
        if [ $status = 0 ]
        then
            if [ $silent = 0 ]
            then
                echo " $process is alive"
            fi
            # echo "3, $FWDIR/bin/cphaprob -d $process -s ok report"
            $FWDIR/bin/cphaprob -d $process -s ok report
        else
            if [ $silent = 0 ]
            then
                echo " $process is down"
            fi
            $FWDIR/bin/cphaprob -d $process -s problem report
            result=0
        fi
    done
    if [ $result = 0 ]
    then
        if [ $silent = 0 ]
        then
            echo " One of the monitored processes is down!"
        fi
    else
        if [ $silent = 0 ]
        then
            echo " All monitored processes are up "
        fi
    fi
    if [ "$silent" = 0 ]
    then
        echo "sleeping"
    fi
    sleep $1
done
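Note that the script detects a process with ps -ef | grep $process, which matches any line containing the name as a substring. A name such as cpd would therefore also match a hypothetical process called cpd_helper. The following sketch shows one possible stricter check that compares whole command names. The function is_running is hypothetical, and the sketch assumes that ps on your system supports the -o comm= output format.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if one input line equals the name exactly.
# $1 = process name; stdin = one command name per line.
is_running() {
    grep -qxF -- "$1"
}

# Sample input instead of live ps output: "cpd" is not in the list,
# although a substring match would find it inside "cpd_helper".
if printf '%s\n' fwd cpd_helper | is_running cpd; then
    echo "cpd is alive"
else
    echo "cpd is down"
fi

# Against the live process table (assumes ps supports -o comm=):
# ps -e -o comm= | is_running fwd
```

This is only an illustration of the matching behavior. The script located in $FWDIR/bin/ uses the substring match shown in the example.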