A Brief Introduction to awk

LiveFire Labs' UNIX Tip, Trick, or Shell Script of the Week

awk is a programmable pattern-matching and text-processing tool that works equally well with text and numbers. Its name comes from the first letters of the last names of its three authors (Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan).

When used on the command line, the syntax for awk is:

awk 'pattern { action }' [filename]

A few things you should notice about the command line syntax are:

- because awk can read from its standard input, providing the name of a file is optional

- single quotes are used to protect the pattern and action from the shell
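Both input modes can be sketched as follows. This is a minimal illustration that assumes a POSIX shell; the sample data and the temporary file are made up for the demo and are not part of the original tip.

```shell
# Hypothetical sample data, written to a temporary file for the demo.
tmpfile=$(mktemp)
printf 'learn unix\nunix class\n' > "$tmpfile"

# Input file named on the command line.
from_file=$(awk '/class/ { print $1 }' "$tmpfile")

# No filename given, so awk reads its standard input instead.
from_stdin=$(awk '/class/ { print $1 }' < "$tmpfile")

echo "$from_file"
echo "$from_stdin"
rm -f "$tmpfile"
```

Both invocations should print the same line, since only the source of the input differs. The single quotes keep the shell from expanding $1 before awk sees it.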

Although awk can be invoked with a script file (e.g. awk -f script-file filename), this overview will only demonstrate its command line usage.

The two key components of the command line format are the pattern (in these examples, a regular expression enclosed in slashes) and the action. awk searches its input for lines that match the specified pattern. When it finds a match, it performs the specified action, which could be writing the entire line (record) or individual fields* from the line to standard output.

If a pattern is not specified, the action will be performed on every line. If the action is absent, then the default action of printing the line to standard output will be performed.
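The two defaults can be seen side by side in a short sketch. The two-line input here is hypothetical and is fed to awk through a pipe.

```shell
# Hypothetical two-line input used only for this demo.
data='unix training
learn unix'

# Pattern only: the default action prints each matching line in full.
pattern_only=$(printf '%s\n' "$data" | awk '/learn/')

# Action only: with no pattern, the action runs on every line.
action_only=$(printf '%s\n' "$data" | awk '{ print $1 }')

echo "$pattern_only"
echo "$action_only"
```

The first command prints only the line containing "learn", while the second prints the first field of both lines.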

Our first example will demonstrate how awk can be used to select and manipulate specific records from a file, named unixfile, that contains the following data:

unix training
learn unix
unix class
learning unix
unix course

The following awk statement selects the records in this file that contain the string "learn", and prints the two fields of each matching record in reverse order, separated by a space:

# awk '/learn/ { print $2 " " $1 }' unixfile
unix learn
unix learning

Similar to shell script arguments, each field in a record can be referenced using the dollar sign followed by the number indicating its position in the record. $0 (dollar sign and zero) references the entire record, and is often used as an argument for various awk functions.
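As a small illustration of $0 being passed to a function, the sketch below uses awk's built-in length function on a hypothetical one-line input.

```shell
# Print the length of the whole record ($0), followed by the record itself.
result=$(printf 'learn unix\n' | awk '{ print length($0) ": " $0 }')
echo "$result"
```

Here length($0) counts every character in the record, including the space between the two fields.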

This week's second example will illustrate how awk can extract various pieces of information from lines of data that are piped to awk's standard input. In this example, output from the df (disk free) utility will be parsed:

# df -k | grep -v used | awk '{ print $6 "\t" $5 }'
/     27%
/usr     79%
/boot     17%
/proc     0%
/dev/fd     0%
/etc/mnttab     0%
/var     3%
/var/run     1%
/tmp     1%
/opt     2%
/export/home     1%

This statement extracts the filesystem mount point and the percentage of space used (by filesystem) from every record awk receives from df. The grep command discards df's column header line, which contains the word "used", and the "\t" in the print statement inserts a tab between the two fields. You may have noticed that a pattern was not specified, which resulted in the action being performed on every record.

This command, combined with an existing file containing your system's mount points and a space usage threshold for each filesystem, could be used to build a shell script that monitors your system's filesystem space usage and takes a pre-defined action when a threshold is reached. Scheduling the script to run at fixed intervals using the cron utility would provide you with an automated filesystem space usage monitoring solution.
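A rough sketch of the threshold check is shown below. It makes several assumptions that are not in the original tip: it uses a single limit for all filesystems rather than a per-filesystem threshold file, it skips the header with NR > 1 instead of grep, and it runs on made-up sample data in df -k layout so the result is reproducible. The device names and the limit value are hypothetical.

```shell
# Hypothetical df -k style output; a real script would pipe `df -k` in instead.
sample='Filesystem kbytes used avail capacity Mounted on
/dev/dsk/c0t0d0s0 1000000 270000 730000 27% /
/dev/dsk/c0t0d0s6 2000000 1580000 420000 79% /usr'

# limit is an assumed threshold (percent used).
# NR > 1 skips the header line; $5 + 0 converts a value like "79%" to the number 79.
alerts=$(printf '%s\n' "$sample" |
    awk -v limit=75 'NR > 1 && $5 + 0 >= limit { print $6 " is at " $5 }')

echo "$alerts"
```

With this sample data only /usr crosses the assumed 75 percent limit. In a cron-driven script, the echo could be replaced by whatever action suits your site, such as sending mail.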

* By default, record fields are delimited by one or more consecutive spaces or tabs.
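The default splitting can be checked with the built-in NF variable, which holds the number of fields in the current record. In this sketch the input line is hypothetical and deliberately separates its two words with a mix of spaces and a tab.

```shell
# A run of spaces and tabs between two words still counts as one delimiter.
result=$(printf 'learn \t   unix\n' | awk '{ print NF ": " $2 }')
echo "$result"
```

awk reports two fields, and $2 holds the second word with no surrounding whitespace.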
