Appendix C - The clusterXL_monitor_process Script

The clusterXL_monitor_process script monitors whether the specified processes are running and causes a cluster fail-over if any of them dies. For this script to work, you must write the names of the monitored processes in the $FWDIR/conf/cpha_proc_list file, one name per line. Comments and spaces are not allowed in this file.
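The format rules for the process list can be checked before the monitor is started. The following is a minimal sketch and not part of the Check Point script: the function name check_proc_list is hypothetical, and the 15-character limit is an assumption taken from the comment in the example script, which says that a process name must be up to 15 characters.

```shell
#!/bin/sh
# Hypothetical helper (not shipped by Check Point): checks that a file
# follows the rules stated for $FWDIR/conf/cpha_proc_list.
# Returns 0 if no rule is violated, 1 otherwise.
check_proc_list() {
    file=$1
    [ -f "$file" ] || { echo "No such file: $file" >&2; return 1; }
    rc=0
    lineno=0
    # Read line by line; the "|| [ -n ... ]" part keeps a last line
    # that has no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        lineno=$((lineno + 1))
        case $line in
            *"#"*)
                echo "line $lineno: comments are not allowed" >&2
                rc=1 ;;
            *[[:space:]]*)
                echo "line $lineno: spaces are not allowed" >&2
                rc=1 ;;
        esac
        # Assumption: limit taken from the comment in the example script.
        if [ "${#line}" -gt 15 ]; then
            echo "line $lineno: name is longer than 15 characters" >&2
            rc=1
        fi
    done < "$file"
    return $rc
}
```

On a cluster member, the check could be run as check_proc_list "$FWDIR/conf/cpha_proc_list" before the monitoring script is started.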

This shell script registers a pnote for each process listed in the $FWDIR/conf/cpha_proc_list file, named after that process. When a monitored process is missing, the script reports the state of its pnote as problem, which gracefully changes the state of the given cluster member to Down. When the process is running again, the script reports the state of the pnote as ok, which gracefully reverts the state of the given cluster member to Up.
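The pnote lifecycle for a single process can be sketched as follows. The cphaprob arguments are the ones used by the script itself. The wrapper functions run, pnote_register and pnote_report, the DRY_RUN variable, and the use of fwd as the monitored process name are assumptions made for illustration only. In dry-run mode the sketch only prints the commands, so it can be tried on a machine without ClusterXL.

```shell
#!/bin/sh
# Sketch of the pnote lifecycle for one monitored process.
# DRY_RUN=1 (the default here) only prints the commands.
# Set DRY_RUN=0 on a cluster member to actually run them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: print the command in dry-run mode, run it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Register a pnote named after the process ($1), initially in the ok state.
pnote_register() {
    run "$FWDIR/bin/cphaprob" -d "$1" -t 0 -s ok -p register
}

# Report the state ($2: "problem" or "ok") of the pnote named $1.
pnote_report() {
    run "$FWDIR/bin/cphaprob" -d "$1" -s "$2" report
}

pnote_register fwd          # done once, before the monitoring loop
pnote_report fwd problem    # process is missing: member goes Down
pnote_report fwd ok         # process is back: member returns to Up
```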

The clusterXL_monitor_process script is located in the $FWDIR/bin/ directory on your cluster members.

For more information, see sk92904.

Important - You must make these changes on all cluster members.

Example:

#!/bin/sh
# This script monitors the existence of processes in the system.
# The process names should be written in the
# $FWDIR/conf/cpha_proc_list file one every line.
# USAGE :
# cpha_monitor_process X silent
# where X is the number of seconds between process probings.
# if silent is set to 1, no messages will appear on the console.
# We initially register a pnote for each of the monitored processes
# (process name must be up to 15 characters) in the problem notification mechanism.
# when we detect that a process is missing we report the pnote to be in "problem" state.
# when the process is up again - we report the pnote is OK.

if [ "$2" -le 1 ]
then
    silent=$2
else
    silent=0
fi

if [ -f $FWDIR/conf/cpha_proc_list ]
then
    procfile=$FWDIR/conf/cpha_proc_list
else
    echo "No process file in $FWDIR/conf/cpha_proc_list "
    exit 0
fi

arch=`uname -s`

for process in `cat $procfile`
do
    $FWDIR/bin/cphaprob -d $process -t 0 -s ok -p register > /dev/null 2>&1
done

while [ 1 ]
do
    result=1
    for process in `cat $procfile`
    do
        ps -ef | grep $process | grep -v grep > /dev/null 2>&1
        status=$?
        if [ $status = 0 ]
        then
            if [ $silent = 0 ]
            then
                echo " $process is alive"
            fi
            # echo "3, $FWDIR/bin/cphaprob -d $process -s ok report"
            $FWDIR/bin/cphaprob -d $process -s ok report
        else
            if [ $silent = 0 ]
            then
                echo " $process is down"
            fi
            $FWDIR/bin/cphaprob -d $process -s problem report
            result=0
        fi
    done

    if [ $result = 0 ]
    then
        if [ $silent = 0 ]
        then
            echo " One of the monitored processes is down!"
        fi
    else
        if [ $silent = 0 ]
        then
            echo " All monitored processes are up "
        fi
    fi

    if [ "$silent" = 0 ]
    then
        echo "sleeping"
    fi

    sleep $1
done
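The probe in the script, ps -ef piped through grep, succeeds whenever the name appears anywhere in a line of the ps output, including inside the arguments or the name of a different process. Where that is a concern, a stricter probe can compare against command names only. The following is a sketch, not the behavior of the shipped script: the function names list_process_names and is_running are hypothetical, and it assumes that the local ps supports the POSIX -e -o comm= options and prints bare command names, which should be verified on the target system.

```shell
#!/bin/sh
# Print the command name of every running process, one per line.
# Assumption: the local ps supports the POSIX "-e -o comm=" options.
list_process_names() {
    ps -e -o comm=
}

# Return 0 if a process whose command name is exactly $1 is running.
# -x matches whole lines only and -F treats the name as a fixed string,
# so a name such as "fw" does not match a process called "fwd".
is_running() {
    list_process_names | grep -qxF -- "$1"
}
```

Inside the monitoring loop, the test could then read if is_running "$process", with the ok report in the then branch and the problem report in the else branch.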