The clusterXL_monitor_process Script

You can use the clusterXL_monitor_process script to monitor whether the specified user space processes are running, and to cause a cluster fail-over if any of these processes stops running. For this script to work, you must write the exact, case-sensitive names of the monitored processes in the $FWDIR/conf/cpha_proc_list file, one process name per line. This file does not support comments or spaces.
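
Because the file accepts neither comments nor spaces, it can help to check it before you start the monitor. The following is a minimal sketch, not part of the product: check_proc_list is a hypothetical helper name, and the 15-character limit is taken from the comment inside the script itself.

```shell
#!/bin/sh
# Hypothetical helper: sanity-check a process list file before the monitor reads it.
# Prints one message per offending line and returns 1 if any line is rejected.
check_proc_list() {
    file=$1
    rc=0
    lineno=0
    tab=$(printf '\t')
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        lineno=$((lineno + 1))
        case $line in
            '')
                echo "line $lineno: empty line"; rc=1 ;;
            *'#'*)
                echo "line $lineno: comments are not supported"; rc=1 ;;
            *" "*|*"$tab"*)
                echo "line $lineno: contains whitespace"; rc=1 ;;
            *)
                # Limit assumed from the script comment "up to 15 characters".
                if [ "${#line}" -gt 15 ]; then
                    echo "line $lineno: longer than 15 characters"; rc=1
                fi ;;
        esac
    done < "$file"
    return $rc
}
```

For example, check_proc_list "$FWDIR/conf/cpha_proc_list" prints nothing and returns 0 when every line is a plain process name.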

The script is located on your Cluster Members at: $FWDIR/bin/clusterXL_monitor_process

This shell script does the following:

  1. Registers Critical Devices (with the status ok), named after the processes you specified in the $FWDIR/conf/cpha_proc_list file.
  2. While the script detects that a specified process runs, it does not change the status of the corresponding Critical Device.
  3. If the script detects that a specified process no longer runs, it reports the state of the corresponding Critical Device as problem. This gracefully changes the state of the Cluster Member to Down. If the script detects that the process runs again, it changes the status of the corresponding Critical Device back to ok.
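
The mapping in steps 2 and 3 can be sketched as a small dry-run function. This is an illustration only: report_state and the CPHACONF variable are hypothetical names introduced here, and only the set_pnote arguments are copied from the script. By default the sketch just prints the command it would run.

```shell
#!/bin/sh
# Sketch only: map a "process found?" exit status to the reported Critical Device state.
# CPHACONF is a hypothetical override; by default the command is printed, not executed.
CPHACONF=${CPHACONF:-"echo cphaconf"}

report_state() {
    name=$1     # Critical Device name (same as the process name)
    found=$2    # exit status of the process check: 0 means the process runs
    if [ "$found" -eq 0 ]; then
        state=ok
    else
        state=problem
    fi
    # Left unquoted on purpose so that "echo cphaconf" splits into two words.
    $CPHACONF set_pnote -d "$name" -s "$state" report
}
```

To run the real command on a Cluster Member instead of the dry run, one would presumably set CPHACONF=$FWDIR/bin/cphaconf before calling report_state.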

For more information, see sk92904.

Important - You must make these changes on all Cluster Members.

Example:

#!/bin/sh
#
# This script monitors the existence of processes in the system. The process names should be written
# in the $FWDIR/conf/cpha_proc_list file, one per line.
#
# USAGE :
# clusterXL_monitor_process X silent
# where X is the number of seconds between process probings.
# If silent is set to 1, no messages will appear on the console.
#
#
# We initially register a pnote for each of the monitored processes
# (process name must be up to 15 characters) in the problem notification mechanism.
# When we detect that a process is missing, we report the pnote to be in "problem" state.
# When the process is up again, we report the pnote is OK.

if [ "$2" -le 1 ]
then
    silent=$2
else
    silent=0
fi

if [ -f $FWDIR/conf/cpha_proc_list ]
then
    procfile=$FWDIR/conf/cpha_proc_list
else
    echo "No process file in $FWDIR/conf/cpha_proc_list "
    exit 0
fi

arch=`uname -s`

# Register one Critical Device (pnote) per monitored process, initially in state ok
for process in `cat $procfile`
do
    $FWDIR/bin/cphaconf set_pnote -d $process -t 0 -s ok -p register > /dev/null 2>&1
done

while [ 1 ]
do

    result=1

    for process in `cat $procfile`
    do
        # Exit status 0 means at least one line of the ps output contains the name
        ps -ef | grep $process | grep -v grep > /dev/null 2>&1

        status=$?

        if [ $status = 0 ]
        then
            if [ $silent = 0 ]
            then
                echo " $process is alive"
            fi
            # echo "3, $FWDIR/bin/cphaconf set_pnote -d $process -s ok report"
            $FWDIR/bin/cphaconf set_pnote -d $process -s ok report
        else
            if [ $silent = 0 ]
            then
                echo " $process is down"
            fi

            $FWDIR/bin/cphaconf set_pnote -d $process -s problem report
            result=0
        fi

    done

    if [ $result = 0 ]
    then
        if [ $silent = 0 ]
        then
            echo " One of the monitored processes is down!"
        fi
    else
        if [ $silent = 0 ]
        then
            echo " All monitored processes are up "
        fi
    fi

    if [ "$silent" = 0 ]
    then
        echo "sleeping"
    fi

    sleep $1

done
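
The ps -ef | grep $process test in the script matches the name anywhere in a ps output line, so a name such as fw would also match fwd or a command-line argument that contains it. A stricter check, shown here only as a sketch, compares the whole command name. The is_running function is a hypothetical helper, not part of the shipped script. On Linux, the command name reported by ps is typically truncated to 15 characters.

```shell
#!/bin/sh
# Hypothetical helper, not part of the shipped script.
# Returns 0 if a process whose command name is exactly "$1" exists, non-zero otherwise.
is_running() {
    # "ps -e -o comm=" prints one command name per line without a header.
    # "grep -x" requires the whole line to match; "-F" treats the name literally.
    ps -e -o comm= | grep -qxF -- "$1"
}
```

With this helper, is_running fwd succeeds only if a process named exactly fwd exists, and its exit status could replace the status=$? value taken from the grep pipeline.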