LiveFire Labs' UNIX Tip, Trick, or Shell Script of the Week
A Brief Introduction to awk
awk is a programmable pattern-matching and text-processing tool
that works equally well with text and numbers. It derives its name
from the first letters of the last names of its three authors (Alfred V.
Aho, Peter J. Weinberger, and Brian W. Kernighan).
When used on the command line, the syntax for awk is:
awk 'pattern { action }' [filename]
A few things you should notice about the command line syntax are:
- since awk can take its input from its standard input,
providing the name of a file is optional
- single quotes are used to protect the pattern and action from the
shell
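Both points can be seen in a minimal sketch like the one below. The
two input lines are made up for illustration; they are piped to awk's
standard input, so no filename is given.

```shell
# Input arrives on standard input, so the filename argument is omitted.
# The single quotes stop the shell from expanding $2 before awk sees it.
printf 'unix training\nlearn unix\n' | awk '/learn/ { print $2 }'
# prints: unix
```

With double quotes instead, the shell would try to substitute its own
$2 (usually empty) before awk ever ran.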
Although awk can be invoked with a script file (e.g. awk -f
script-file filename), this overview will only demonstrate its command
line usage.
The two key components of the command line format are the pattern
(most often a regular expression enclosed in slashes) and the action.
awk searches its input for lines that match the specified pattern.
When it finds a match, it performs the specified action, which could
be writing the entire line (record) or individual fields* from the
line to standard output.
If a pattern is not specified, the action will be performed on every
line. If the action is absent, then the default action of printing
the line to standard output will be performed.
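The two defaults described above can be sketched as follows. The
variable name sample and its three lines are assumptions made for this
illustration only.

```shell
# Made-up sample data, held in a shell variable for illustration.
sample='unix training
learn unix
unix class'

# Pattern only: the default action prints each matching line.
printf '%s\n' "$sample" | awk '/learn/'
# prints: learn unix

# Action only: with no pattern, the action runs on every line.
printf '%s\n' "$sample" | awk '{ print $1 }'
# prints: unix, learn, unix (one per line)
```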
Our first example will demonstrate how awk can be used to select and
manipulate specific records from a file, named unixfile, that contains
the following data:
unix training
learn unix
unix class
learning unix
unix course
The following awk statement will select records from this file that
contain the string "learn" (which also matches "learning"), and then
will print the two fields of each matching record in reverse order
with a space between them:
# awk '/learn/ { print $2 " " $1 }' unixfile
unix learn
unix learning
Similar to shell script arguments, each field in a record can be
referenced using the dollar sign followed by the number indicating its
position in the record. $0 (dollar sign and zero) references the
entire record, and is often used as an argument for various awk
functions.
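As a small illustration of $0 used as a function argument, the sketch
below passes it to awk's built-in length function, which returns the
number of characters in a string. The two input lines are made up to
resemble the earlier example.

```shell
# Print the length of each whole record ($0), followed by the record itself.
printf 'learn unix\nlearning unix\n' | awk '{ print length($0) ": " $0 }'
# prints:
# 10: learn unix
# 13: learning unix
```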
This week's second example will illustrate how awk can extract various
pieces of information from lines of data that are piped to awk's
standard input. In this example, output from the df (disk free)
utility will be parsed:
# df -k | grep -v used | awk '{ print $6 "\t" $5 }'
/ 27%
/usr 79%
/boot 17%
/proc 0%
/dev/fd 0%
/etc/mnttab 0%
/var 3%
/var/run 1%
/tmp 1%
/opt 2%
/export/home 1%
This statement extracts the filesystem mount point and the percentage
of space in use (df's capacity column) for every record awk receives
from df. The grep command is used to discard df's column headers, and
the "\t" in the print statement inserts a tab between the two fields.
You may have noticed that a pattern was not specified, which resulted
in the action being performed on every record.
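As an alternative to grep, awk itself can skip the header line. A
pattern does not have to be a regular expression: a comparison such as
NR > 1 (NR is awk's built-in count of records read so far) works too.
The sketch below uses a hypothetical helper, df_sample, that prints
made-up df -k output so the result is reproducible; on a real system
you would pipe df -k in its place.

```shell
# Hypothetical helper: prints canned, made-up "df -k" style output.
df_sample() {
cat <<'EOF'
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/dsk/c0t0d0s0    1000000  270000  730000    27%    /
/dev/dsk/c0t0d0s6    2000000 1580000  420000    79%    /usr
EOF
}

# NR > 1 is true for every record except the first (the header line).
df_sample | awk 'NR > 1 { print $6 "\t" $5 }'
```

This prints the mount point and capacity of the two sample
filesystems, separated by a tab, without a separate grep process.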
This command, combined with an existing file containing your system's
mount points and a space usage threshold for each filesystem, could be
used to build a shell script that monitors your system's filesystem
space usage and takes a pre-defined action when a threshold is
reached. Scheduling the script to run at fixed intervals using the
cron utility would provide you with an automated filesystem space
usage monitoring solution.
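One possible shape for such a script is sketched below. Everything
specific in it is an assumption: the function name check_usage, the
thresholds file format (mount point, then maximum percent used), and
the canned df-style lines that stand in for real df -k output. It also
assumes a POSIX awk (on some older systems this is installed as nawk).

```shell
# Hypothetical thresholds file: mount point, then maximum percent used.
thresholds=$(mktemp)
printf '/ 90\n/usr 75\n' > "$thresholds"

# check_usage FILE: read df-style lines on stdin and warn about any
# filesystem whose capacity column has reached its limit in FILE.
check_usage() {
    awk 'NR == FNR { limit[$1] = $2; next }      # first file: load limits
         FNR > 1 && ($6 in limit) {              # stdin: skip header line
             used = $5
             sub(/%/, "", used)                  # strip the percent sign
             if (used + 0 >= limit[$6] + 0)
                 print "WARNING: " $6 " is at " $5
         }' "$1" -
}

# Made-up input; on a real system use: df -k | check_usage "$thresholds"
printf '%s\n' \
    'Filesystem kbytes used avail capacity Mounted on' \
    '/dev/a 1000000 270000 730000 27% /' \
    '/dev/b 2000000 1580000 420000 79% /usr' |
    check_usage "$thresholds"
# prints: WARNING: /usr is at 79%

rm -f "$thresholds"
```

A cron entry could then run the script at fixed intervals and mail the
output, or replace the print with whatever pre-defined action suits
your site.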
* By default, record fields are delimited by one or more consecutive
spaces or tabs.
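When the data uses a different delimiter, awk's -F option sets the
field separator. The sketch below splits a made-up, passwd-style line
on colons; the line itself is illustrative only.

```shell
# -F: tells awk to split each record on colons instead of whitespace.
printf 'root:x:0:0:Super-User:/:/sbin/sh\n' | awk -F: '{ print $1 " " $7 }'
# prints: root /sbin/sh
```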