The clusterXL_monitor_ips Script

Description

You can use the clusterXL_monitor_ips script to ping a list of predefined IP addresses and change the state of the Cluster Member to DOWN or UP based on the replies to these pings. For this script to work, you must write the IP addresses in the $FWDIR/conf/cpha_hosts file - each IP address on a separate line. This file does not support comments or spaces.
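Because the file accepts nothing but one address per line, it can help to check it before you start the script. The following is a minimal sketch of such a check, not part of the Check Point script. The function name check_hosts_file is hypothetical, and the addresses are placeholders from the documentation range.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with ClusterXL): returns 0 when the file
# has no empty lines and no line contains whitespace or a '#' character.
check_hosts_file() {
    file=$1
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }
    # grep exits 0 when it finds an offending line; -n prints the line number.
    # The pattern matches an empty line, any whitespace, or a '#'.
    if grep -n -E '^$|[[:space:]]|#' "$file" >&2; then
        return 1
    fi
    return 0
}

# Example run against a throwaway file with placeholder addresses.
tmp=$(mktemp)
printf '192.0.2.1\n192.0.2.2\n' > "$tmp"
check_hosts_file "$tmp" && echo "hosts file looks valid"
rm -f "$tmp"
```

On a Cluster Member, you would call the function with $FWDIR/conf/cpha_hosts as its argument. Offending lines are printed to standard error with their line numbers.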

The location of this script on your Cluster Members is:

$FWDIR/bin/clusterXL_monitor_ips

Script Workflow

  1. Registers a Critical Device called "host_monitor" with the status "ok".

  2. Starts to send pings to the list of predefined IP addresses in the $FWDIR/conf/cpha_hosts file.

  3. As long as the script receives responses to all its pings, it does not change the status of that Critical Device.

  4. If at least one ping gets no response, the script reports the state of that Critical Device as "problem".

    This gracefully changes the state of the Cluster Member to DOWN.

    When the script receives responses to all its pings again, it changes the status of that Critical Device back to "ok".
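The decision rule in steps 3 and 4 can be sketched in isolation: one unanswered ping in a round is enough to report "problem", and only a round with no failures reports "ok". The sketch below is illustrative only. The names aggregate_state and fake_probe are hypothetical, and the stub probe stands in for the real ping so that the logic can run anywhere.

```shell
#!/bin/sh
# Hypothetical sketch: combine per-host probe results into a single state.
# The first argument is any command that exits 0 when the given host answers.
aggregate_state() {
    probe_cmd=$1
    shift
    state=ok
    for host in "$@"; do
        # One failed probe marks the whole round as "problem".
        # The loop continues so that every host is still probed.
        if ! "$probe_cmd" "$host" >/dev/null 2>&1; then
            state=problem
        fi
    done
    echo "$state"
}

# Stub probe for illustration only: pretends 192.0.2.2 is unreachable.
fake_probe() {
    [ "$1" != "192.0.2.2" ]
}

aggregate_state fake_probe 192.0.2.1 192.0.2.3            # prints "ok"
aggregate_state fake_probe 192.0.2.1 192.0.2.2 192.0.2.3  # prints "problem"
```

In the actual script, the resulting state is passed to the cphaconf set_pnote command for the "host_monitor" Critical Device.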

For more information, see sk35780.

Important - You must make these changes on all Cluster Members.

Example

#!/bin/sh
#
# The script tries to ping the hosts written in the file $FWDIR/conf/cpha_hosts. The names (must be resolvable) or the IPs of the hosts must be written on separate lines.
# The file must not contain anything else.
# We ping the given hosts every number of seconds given as parameter to the script.
# USAGE:
# cpha_monitor_ips X silent
# where X is the number of seconds between loops over the IPs.
# if silent is set to 1, no messages will appear on the console
#
# We initially register a pnote named "host_monitor" in the problem notification mechanism
# when we detect that a host is not responding we report the pnote to be in "problem" state.
# when ping succeeds again - we report the pnote is OK.
 
silent=0
 
if [ -n "$2" ]; then
        if [ $2 -le 1 ]; then
                silent=$2
        fi
fi
hostfile=$FWDIR/conf/cpha_hosts
arch=`uname -s`
if [ $arch = "Linux" ]
then
        #system is linux
        ping="ping -c 1 -w 1"
else
        ping="ping"
fi
$FWDIR/bin/cphaconf set_pnote -d host_monitor -t 0 -s ok register
TRUE=1
while [ "$TRUE" ]
do
        result=1
        for hosts in `cat $hostfile`
        do
                if [ $silent = 0 ]
                then
                        echo "pinging $hosts using command $ping $hosts"
                fi
                if [ $arch = "Linux" ]
                then
                        $ping $hosts > /dev/null 2>&1
                else
                        $ping $hosts $1 > /dev/null 2>&1
                fi
                status=$?
                if [ $status = 0 ]
                then
                        if [ $silent = 0 ]
                        then
                                echo " $hosts is alive"
                        fi
                else
                        if [ $silent = 0 ]
                        then
                                echo " $hosts is not responding "
                        fi
                        result=0
                fi
        done
        if [ $silent = 0 ]
        then
                echo "done pinging"
        fi
        if [ $result = 0 ]
        then
                if [ $silent = 0 ]
                then
                        echo " Cluster member should be down!"
                fi
                $FWDIR/bin/cphaconf set_pnote -d host_monitor -s problem report
        else
                if [ $silent = 0 ]
                then
                        echo " Cluster member seems fine!"
                fi
                $FWDIR/bin/cphaconf set_pnote -d host_monitor -s ok report
        fi
        if [ "$silent" = 0 ]
        then
                echo "sleeping for $1 seconds"
        fi
        sleep $1
done
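
The script passes its first argument directly to sleep without checking it. If you adapt the script, one possible safeguard is to validate the interval before entering the loop. The following is a minimal sketch under that assumption. The function name valid_interval is hypothetical and is not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only for a positive whole number of seconds.
valid_interval() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
    esac
    [ "$1" -gt 0 ]
}

# Try a few candidate values, including an empty one.
for candidate in 5 0 abc ""; do
    if valid_interval "$candidate"; then
        echo "'$candidate' accepted"
    else
        echo "'$candidate' rejected"
    fi
done
```

In an adapted script, a rejected value could print the usage text and exit before the "host_monitor" Critical Device is registered.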