You can use the clusterXL_monitor_process script to monitor whether the specified user space processes are running, and to cause a cluster fail-over if these processes are not running. For this script to work, you must write the correct case-sensitive names of the monitored processes in the $FWDIR/conf/cpha_proc_list file, with each process name on a separate line. This file does not support comments or spaces.
The location of this script on your Cluster Members is: $FWDIR/bin/clusterXL_monitor_process
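Because the list file accepts neither comments nor spaces, it can help to check a draft before you copy it into place. The following is a minimal sketch, not part of ClusterXL: the function name validate_proc_list is hypothetical, and the 15-character limit is taken from the comment header of the script itself.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with ClusterXL): checks a draft process
# list before it is copied to $FWDIR/conf/cpha_proc_list.
# $1: path to the draft file, one process name per line.
# Returns 0 if every line looks usable, 1 otherwise.
validate_proc_list() {
    file=$1
    rc=0
    lineno=0
    # The "|| [ -n "$line" ]" part keeps a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        lineno=$((lineno + 1))
        case $line in
            *"#"*) echo "line $lineno: comments are not supported"; rc=1 ;;
            *" "*) echo "line $lineno: spaces are not supported"; rc=1 ;;
        esac
        # Limit taken from the script's own comment header (assumption:
        # it applies to every name in the list).
        if [ "${#line}" -gt 15 ]; then
            echo "line $lineno: name is longer than 15 characters"
            rc=1
        fi
    done < "$file"
    return $rc
}
```

For example, validate_proc_list ./cpha_proc_list prints one message per offending line and returns a non-zero status if it finds any.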
This shell script does the following:
1. Registers Critical Devices (in the state ok) named after the processes you specified in the $FWDIR/conf/cpha_proc_list file.
2. If the script detects that a specified process is not running, it reports the state of the corresponding Critical Device as problem. This gracefully changes the state of the Cluster Member to Down. If the script detects that the specified process runs again, it changes the status of the corresponding Critical Device to ok again.
For more information, see sk92904.
Important - You must do these changes on all Cluster Members.
Example:
#!/bin/sh
#
# This script monitors the existence of processes in the system. The process names should be written
# in the $FWDIR/conf/cpha_proc_list file, one per line.
#
# USAGE :
# cpha_monitor_process X silent
# where X is the number of seconds between process probings.
# if silent is set to 1, no messages will appear on the console.
#
#
# We initially register a pnote for each of the monitored processes
# (process name must be up to 15 characters) in the problem notification mechanism.
# when we detect that a process is missing we report the pnote to be in "problem" state.
# when the process is up again - we report the pnote is OK.

if [ "$2" -le 1 ]
then
    silent=$2
else
    silent=0
fi

if [ -f $FWDIR/conf/cpha_proc_list ]
then
    procfile=$FWDIR/conf/cpha_proc_list
else
    echo "No process file in $FWDIR/conf/cpha_proc_list "
    exit 0
fi

arch=`uname -s`

for process in `cat $procfile`
do
    $FWDIR/bin/cphaconf set_pnote -d $process -t 0 -s ok -p register > /dev/null 2>&1
done

while [ 1 ]
do
    result=1

    for process in `cat $procfile`
    do
        ps -ef | grep $process | grep -v grep > /dev/null 2>&1

        status=$?

        if [ $status = 0 ]
        then
            if [ $silent = 0 ]
            then
                echo " $process is alive"
            fi
            # echo "3, $FWDIR/bin/cphaconf set_pnote -d $process -s ok report"
            $FWDIR/bin/cphaconf set_pnote -d $process -s ok report
        else
            if [ $silent = 0 ]
            then
                echo " $process is down"
            fi

            $FWDIR/bin/cphaconf set_pnote -d $process -s problem report
            result=0
        fi

    done

    if [ $result = 0 ]
    then
        if [ $silent = 0 ]
        then
            echo " One of the monitored processes is down!"
        fi
    else
        if [ $silent = 0 ]
        then
            echo " All monitored processes are up "
        fi
    fi

    if [ "$silent" = 0 ]
    then
        echo "sleeping"
    fi

    sleep $1
done
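The detection step of the script can be tried in isolation, without changing the state of any Critical Device. The following is a minimal sketch under stated assumptions: the function dry_run_check and its optional snapshot argument are hypothetical, it reuses the same ps and grep pipeline as the script, and it never calls cphaconf.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not shipped with ClusterXL).
# $1: file with one process name per line
# $2: optional file with saved "ps -ef" output; if omitted, live ps -ef is used
dry_run_check() {
    list=$1
    snapshot=${2:-}
    for process in $(cat "$list"); do
        # Same matching logic as the script: any ps line that contains the
        # name counts, except lines that belong to grep itself.
        if [ -n "$snapshot" ]; then
            cat "$snapshot"
        else
            ps -ef
        fi | grep "$process" | grep -v grep > /dev/null 2>&1
        if [ $? = 0 ]; then
            echo "$process is alive"
        else
            echo "$process is down"
        fi
    done
}
```

Because the match is a plain grep on the ps output, a name also matches any longer command line that contains it. A dry run with dry_run_check ./cpha_proc_list shows which entries would be reported as alive on the current Cluster Member.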