|
June 16, 2003 - Using the ping Command to Develop an Inexpensive UNIX Monitoring Solution
|
If
you manage a multi-system UNIX environment
in which system availability (uptime) is
closely linked to the success of your
organization, you will need to deploy some
type of automated solution to monitor each
system's availability, and possibly the
availability of other mission-critical
equipment. If
funds are unavailable to purchase an
externally developed UNIX monitoring
solution, there are various native UNIX
commands that can be used to handle the
job. One of the more popular
solutions is to embed the ping command
within a shell script, and then schedule
the shell script to run at fixed intervals
using cron.
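As an illustration of the scheduling step, a crontab entry along the following lines would run a monitoring script every ten minutes. The script path /usr/local/bin/hostmon.sh is hypothetical. The minutes are listed explicitly because not every version of cron accepts a step shorthand such as */10.

```
# minute             hour day month weekday  command
0,10,20,30,40,50     *    *   *     *        /usr/local/bin/hostmon.sh >/dev/null 2>&1
```

The entry would be added with "crontab -e" under the account that should run the checks.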
If you've been exposed to UNIX networking
and/or TCP/IP, you most likely have already
used the ping command. If you
haven't, ping is used to test the
availability of another system (host) on
the network. ping sends a
special data packet (an ICMP echo request) from the local host to
the target host, and the target host sends an
"I'm alive" packet (an echo reply) back if it is available. The
basic syntax for ping is:
|
ping host
|
where host can
be either a hostname or an IP address.
Your site-specific requirements coupled
with your shell scripting skills will
determine the level of complexity of your
system monitoring script. A
relatively simple script may loop through
a list of hosts stored in an external
file, pinging each host one at a time, and
then performing one or more actions if the
host does not respond to the ping packet.
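The loop described above might be sketched as follows. This is a minimal sketch rather than a finished monitor: the function names check_host, report_down, and check_all_hosts are hypothetical, the action taken is only a placeholder message, and "-c 3" assumes a ping that accepts a count option (this varies between UNIX versions).

```shell
#!/bin/sh
# Minimal sketch: ping every host listed in a file, one host per line.
# All function names are hypothetical. "-c 3" assumes a ping that accepts
# a count option; adjust it for your version of UNIX.

# Return 0 if the host answers, non-zero otherwise.
check_host() {
    ping -c 3 "$1" >/dev/null 2>&1
}

# Placeholder action for a host that does not respond.
report_down() {
    echo "WARNING: $1 did not respond to ping"
}

# Treat the first field of each line in file $1 as a host name.
check_all_hosts() {
    while read -r host rest
    do
        # Skip blank lines.
        [ -z "$host" ] && continue
        # </dev/null keeps ping from consuming the rest of the hosts file.
        check_host "$host" </dev/null || report_down "$host"
    done < "$1"
}
```

With a hosts file named, say, hosts.list in the current directory, the check would be started with: check_all_hosts ./hosts.list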
|
HOW TO KNOW WHEN A HOST DOES NOT RESPOND...
The shell script can determine whether or
not the host responded by checking the
ping command's return code. If you
are unfamiliar with return codes, a UNIX
command by convention returns 0 if
it was successful and
a non-zero number if it failed
for whatever reason. A command's
return code is stored in $?, and has to be
checked immediately after the command is
executed, because the next command that runs overwrites it:
|
ping ${TARGETHOST} 3
if [ $? -ne 0 ]
then
    # action 1
    # action 2
    # action n
fi
|
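In this snippet of code, the 3 after the
target host is meant to indicate how many packets
should be sent to it. This is platform-dependent. Depending on
your version of UNIX, you may need to
specify an option (e.g. "-c" for
count) and then a number to accomplish this, and some versions
interpret a bare trailing number differently (for example, as a
timeout), so check the ping man page on your system.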
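One way to cope with these differences is to keep the count option in a variable and to save the return code before anything else runs. The sketch below assumes hypothetical names (PING, PING_COUNT_OPT, PING_COUNT, ping_rc) and defaults to "-c", which is only an assumption about your ping.

```shell
#!/bin/sh
# Sketch only: PING, PING_COUNT_OPT, PING_COUNT and ping_rc are hypothetical
# names. Set PING_COUNT_OPT to whatever your version of ping expects.
PING=${PING:-ping}
PING_COUNT_OPT=${PING_COUNT_OPT:--c}
PING_COUNT=${PING_COUNT:-3}

# Ping host $1 and hand ping's own return code back to the caller.
ping_rc() {
    "$PING" "$PING_COUNT_OPT" "$PING_COUNT" "$1" >/dev/null 2>&1
    rc=$?                              # save it now; the echo below resets $?
    echo "ping $1 returned $rc" >&2    # diagnostic message on standard error
    return "$rc"
}
```

A caller can then test the function directly, for example: if ping_rc "$TARGETHOST"; then echo up; else echo down; fi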
|
If you wanted to give your monitoring script
a little more intelligence, you could
include a second field in the external hosts
file to control when and what actions are
taken for an unavailable host. For
example, your file may look like this:
|
unixserverA 0
unixserverB 0
unixserverC 0
networkdeviceA 0
networkdeviceB 0
nonunixserverA 0
nonunixserverB 0
|
The second field
could contain a number from 0 to 2 (or
higher if you'd like), and the action(s) taken
by the monitoring script would depend
on this number.
For example, 0 could indicate that the
target host was available the last time it
was checked. 1 could indicate that the
target host was unavailable the last time it was
checked and an email was sent. 2 could
indicate that the target host was
unavailable the last two times it was
checked, that an email and a text pager message
have already been sent out, and that no further action should
be taken until the host becomes available
again.
Once the host responds to the ping packet,
this number can be reset to 0 and another
text pager message indicating the host's
availability status can be sent out.
Implementing this level of intelligence may
eliminate some late night pages due to
short-lived network interruptions.
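The 0/1/2 scheme could be sketched along the following lines. Everything here is an assumption made for illustration: the function names are hypothetical, send_email and send_page only print what a real script would send, "-c 3" may not suit your ping, and sending the recovery page only when a page had already gone out (state 2) is just one reading of the description above.

```shell
#!/bin/sh
# Sketch of the 0/1/2 escalation. All names are hypothetical, and the
# notification functions only echo what a real script would send.

send_email() { echo "EMAIL: $1"; }
send_page()  { echo "PAGE: $1"; }
host_is_up() { ping -c 3 "$1" >/dev/null 2>&1; }   # "-c" is an assumption

# Set NEW_STATE for host $1, whose state at the previous check was $2.
update_state() {
    if host_is_up "$1" </dev/null
    then
        # Host answered: announce recovery only if a page had gone out.
        case $2 in
            2) send_page "$1 is available again" ;;
        esac
        NEW_STATE=0
    else
        case $2 in
            0) send_email "$1 is not responding"; NEW_STATE=1 ;;
            1) send_page "$1 is still not responding"; NEW_STATE=2 ;;
            *) NEW_STATE=2 ;;          # already escalated, stay quiet
        esac
    fi
}

# Check every "host state" line in file $1 and rewrite the file with
# the updated states.
process_hosts() {
    tmp="$1.new"
    : > "$tmp"
    while read -r host state
    do
        [ -z "$host" ] && continue
        update_state "$host" "${state:-0}"
        echo "$host $NEW_STATE" >> "$tmp"
    done < "$1"
    mv "$tmp" "$1"
}
```

Run from cron as, for example, process_hosts /path/to/hosts.list (a hypothetical path), each pass would move a host one step along the scale, so a host that misses a single check triggers an email but not a page.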
This is just one example of how a native
UNIX command embedded in a shell script can
be used to build an inexpensive UNIX
monitoring solution from scratch. This
may be sufficient for your environment, or
it may require some additional
features/intelligence to meet your needs. Either way, it's much
less expensive than purchasing an externally
developed solution, and can be tweaked to be
just as good or better.
|
|
Learn more...
If you are new to the UNIX or Linux
operating system and would like to learn
more about other frequently-used operating
system commands, you may want to consider
registering for LiveFire Labs' UNIX
and Linux Operating System Fundamentals
online training course.
If you already have a solid grasp of the
fundamentals but would like to learn more
about the Korn shell and basic and
advanced shell scripting, taking our Korn
Shell Scripting course will be
beneficial to you.
Our innovative hands-on training model
allows you to learn
UNIX by completing hands-on
exercises on real servers in our Internet
Lab.