Tuesday, November 9, 2010

Introduction and Examples of the awk Command

Introduction to awk

by mike on November 5, 2010

Awk is a pattern-scanning and text-processing utility. It can extract information from text files to build reports, convert files from one format to another, create simple databases and perform mathematical operations on data.  The name “awk” comes from the surnames of its authors: Aho, Weinberger and Kernighan.  Nawk is a newer version of awk, and gawk is the GNU version.  Often awk is a symbolic link to gawk, as you can see on this CentOS machine.

awk --version
GNU Awk 3.1.5
Copyright (C) 1989, 1991-2005 Free Software Foundation.

ls -l /bin/awk
lrwxrwxrwx 1 root root 4 May 22 07:43 /bin/awk -> gawk

Awk can take its data from standard input (stdin, which is the keyboard unless it is redirected), from files, or from the output of another process.  When you run awk, it scans the file or input line by line, from the first line to the last, looking for lines that match the criteria you have entered.
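
As a rough illustration of those three input sources, the sketch below feeds the same data to awk in each way. The sample data and the variable names (sample_file, from_file, from_pipe, from_stdin) are made up for this example and are not part of the original article.

```shell
# Hypothetical sample data written to a temporary file.
sample_file=$(mktemp)
printf 'alpha one\nbeta two\napache three\n' > "$sample_file"

# 1. Input from a file named on the command line.
from_file=$(awk '/apache/' "$sample_file")

# 2. Input from the output of another process, through a pipe.
from_pipe=$(cat "$sample_file" | awk '/apache/')

# 3. Input from standard input ("-"), redirected from the file here
#    instead of being typed at the keyboard.
from_stdin=$(awk '/apache/' - < "$sample_file")

printf '%s\n' "$from_file" "$from_pipe" "$from_stdin"
rm -f "$sample_file"
```

All three commands should select the same single line, since only the source of the data changes and not the awk program.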

Using awk

awk [ -F<char> ] {pgm} | { -f <pgm_file> } [ <vars> ] [ - | <data_file> ]

char:        field-separator character
pgm:         command-line program
pgm_file:    file containing an awk program
vars:        variable assignments of the form name=value
data_file:   input data file
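
The -f and variable parts of that synopsis are not demonstrated elsewhere in this post, so here is a minimal sketch of both. The file names (show_pid.awk, data.txt), the variable name user and the sample data are all hypothetical, chosen only for illustration.

```shell
# Hypothetical program file and data file in a temporary directory.
workdir=$(mktemp -d)

cat > "$workdir/show_pid.awk" <<'EOF'
# Print the second field of lines whose first field equals the variable "user".
$1 == user { print $2 }
EOF

printf 'root 1\napache 2206\napache 2207\n' > "$workdir/data.txt"

# -f reads the awk program from a file; user=apache is a variable
# assignment that awk applies before it starts reading data.txt.
pids=$(awk -f "$workdir/show_pid.awk" user=apache "$workdir/data.txt")

printf '%s\n' "$pids"
rm -rf "$workdir"
```

With this sample data the command prints 2206 and 2207, one per line. Keeping the program in a file is mostly a convenience once it grows beyond a one-liner.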

The -F option sets the field separator character.  By default fields are separated by whitespace (any run of blank spaces or tabs), but you can specify another character.  For example, if your fields are separated by “:” you can use:

awk -F:
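
A complete command might look like the sketch below. The two records are invented sample data in the style of /etc/passwd, and the variable names records and logins are placeholders for this example.

```shell
# Hypothetical colon-separated records in the style of /etc/passwd.
records='root:x:0:0:root:/root:/bin/bash
apache:x:48:48:Apache:/var/www:/sbin/nologin'

# -F: makes ":" the field separator, so $1 is the login name
# and $7 is the login shell.
logins=$(printf '%s\n' "$records" | awk -F: '{print $1, $7}')

printf '%s\n' "$logins"
```

For this data the output is "root /bin/bash" followed by "apache /sbin/nologin". Without -F: each whole record would be treated as a single field, because it contains no whitespace.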

When you send instructions to awk they may consist of patterns, actions or a combination of those.

Input from Files

Awk will review a file and search for a text string that you indicate.  In this example, a text file is created with the ps command and then awk searches for the specific text string “apache” in the created file.  The format here is:

awk 'pattern' filename

ps aux > processes
awk '/apache/' processes
apache    2206  0.0  1.0 207376  3972 ?        S    22:20   0:00 /usr/sbin/httpd
apache    2207  0.0  1.0 207376  3972 ?        S    22:20   0:00 /usr/sbin/httpd
apache    2208  0.0  1.0 207376  3972 ?        S    22:20   0:00 /usr/sbin/httpd

In this example the format uses an action instead of a pattern.  The output is different: with no pattern to filter on, the action runs on every line, so you get the first field, “$1”, of every line in the file.  The fields are separated by whitespace.
awk 'action' filename

awk '{print $1}' processes

You can fix that output problem by using both a pattern and an action in the format: the pattern selects the lines and the action is applied only to them.
awk 'pattern {action}' filename

awk '/apache/{print $1}' processes
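
To compare the three forms side by side, the sketch below runs them against a small stand-in for the processes file. The sample lines, including the user mike, are invented for illustration, and the last command is a stricter variant that is not in the original examples.

```shell
# Hypothetical stand-in for the "processes" file created with ps aux.
processes=$(mktemp)
cat > "$processes" <<'EOF'
USER       PID %CPU %MEM COMMAND
root         1  0.0  0.1 init
apache    2206  0.0  1.0 /usr/sbin/httpd
apache    2207  0.0  1.0 /usr/sbin/httpd
mike      2301  0.0  0.2 vi apache.conf
EOF

# Pattern only: whole lines that contain "apache" anywhere.
pattern_only=$(awk '/apache/' "$processes")

# Action only: the first field of every line, header included.
action_only=$(awk '{print $1}' "$processes")

# Pattern and action: the first field of matching lines only.
both=$(awk '/apache/{print $1}' "$processes")

# Stricter variant: select lines whose first field is exactly "apache"
# and print the second field (the PID).
exact_user=$(awk '$1 == "apache" {print $2}' "$processes")

printf '%s\n' "$both"
rm -f "$processes"
```

With this data the combined form prints apache, apache and mike, because /apache/ matches the text anywhere on the line, including the command column. Comparing against $1, as in the last command, is one way to limit the match to the user name.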
