Get started with GAWK: AWK language fundamentals

Begin learning AWK with the open source GAWK implementation

Michael Stutz, Author, Freelance Developer
Michael Stutz is author of The Linux Cookbook, which he also designed and typeset using only open source software. His research interests include digital publishing and the future of the book. He has used various UNIX operating systems for 20 years. You can reach him at stutz@dsl.org.

Summary:  Discover the basic concepts of the AWK text-processing and pattern-scanning language. This tutorial gets you started programming in AWK: You'll learn how AWK reads and sorts its input data, run AWK programs, manipulate data, and perform complex pattern matching. When you're finished, you'll also understand GNU AWK (GAWK).

Date:  19 Sep 2006
Level:  Intermediate

Get ready to GAWK

Learn about the AWK programming language and the differences between implementations, and prepare your GAWK installation so that you can begin programming.

The AWK language

AWK is the name of the programming language itself, written in 1977. Its name is an acronym for the surnames of its three principal authors: Drs. A. Aho, P. Weinberger, and B. Kernighan.

Because AWK is a text-processing and pattern-matching language, it's often called a data-driven language -- the program statements describe the input data to match and process rather than a sequence of program steps, as is the case with many languages. An AWK program searches its input for records that match its patterns and performs the specified actions on each matching record until it reaches the end of input. AWK programs are excellent for work on databases and tabular data, such as pulling out columns from multiple data sets, making reports, or analyzing data. AWK is also useful for writing short, one-off programs to perform some feat of text hackery for which another language might be overkill. In addition, AWK is often used on the command line or in pipelines as a power tool.
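
As a minimal sketch of this pattern-and-action model, the following one-liner reads a small table, matches the records whose second field exceeds a threshold, and prints the first column of each match. The sample data and the threshold of 10 are illustrative assumptions, not part of the tutorial.

```shell
# Hypothetical sample data: a name and a quantity per record.
data='apples 12
pears 7
plums 31'

# Pattern: second field greater than 10. Action: print the first field.
result=$(printf '%s\n' "$data" | awk '$2 > 10 { print $1 }')
printf '%s\n' "$result"
```

Each record is tested against the pattern in turn. Only the apples and plums records satisfy it, so only those two names are printed.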

Like Perl -- which it inspired -- AWK is an interpreted language, so AWK programs are generally not compiled. Instead, the program scripts are passed to the AWK interpreter at run time.

Systems programmers will feel immediately at home with AWK's C-like syntax. Many of its features, including control statements and formatted-output facilities such as printf and sprintf, appear virtually identical to their C counterparts. However, some differences do exist.
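
The resemblance to C can be seen in a short sketch like the one below, which uses a for loop, an if/else statement, printf, and sprintf. The variable names (kind, label) are made up for illustration.

```shell
# A BEGIN block runs before any input is read, so no input file is needed.
result=$(awk 'BEGIN {
    # C-style for loop and if/else statement
    for (i = 1; i <= 3; i++) {
        if (i % 2 == 0) {
            kind = "even"
        } else {
            kind = "odd"
        }
        # printf works much as in C; sprintf returns the formatted string
        label = sprintf("%02d", i)
        printf "%s is %s\n", label, kind
    }
}')
printf '%s\n' "$result"
```

The sketch also shows some of the differences: variables are not declared, a statement that ends a line needs no terminating semicolon, and printf is written here without parentheses.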

Versions of AWK

The AWK language was updated and more or less replaced in the mid-1980s with an enhanced version called NAWK (New AWK). The old AWK interpreter still exists on many systems, but it's often installed as the oawk (Old AWK) command, while the NAWK interpreter is installed as the main awk command as well as being available as nawk. Dr. Kernighan still maintains NAWK; like GAWK, it is open source and freely available (see Resources).

GAWK is the GNU Project's open source implementation of the AWK interpreter. While the early GAWK releases were replacements for the old AWK, GAWK has since been updated to include the features of NAWK.

In this tutorial, AWK refers to the language in general, while features specific to the GAWK or NAWK implementations are attributed to those implementations by name. You'll find links to GAWK, NAWK, and other important AWK sites in the Resources section.

GAWK features and benefits

GAWK has the following unique features and benefits:

  • It is available for all major UNIX platforms as well as other operating systems, including Mac OS X and Microsoft® Windows®.
  • It is Portable Operating System Interface (POSIX) compliant and contains all features from the 1992 POSIX standard.
  • It has no predefined memory limits.
  • Helpful new built-in functions and variables are available.
  • It contains special regexp operators.
  • Record separators can contain regexp operators.
  • Special file support is available to access standard UNIX streams.
  • Lint checking is available.
  • It uses extended regular expressions by default.
  • It allows unlimited line lengths and continuations with the backslash character (\).
  • It has better, more informative error messages.
  • It includes TCP/IP networking functions.
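
One item from the list above, record separators that contain regexp operators, can be sketched as follows. This assumes that awk is GAWK, as the tutorial does; an implementation that uses only the first character of RS would split the input differently. The input string is made up for illustration.

```shell
# RS is a regular expression here: a record ends at a run of one or
# more semicolons or at a newline.
result=$(printf 'alpha;beta;;gamma\n' |
    awk 'BEGIN { RS = ";+|\n" } { print NR ": " $0 }')
printf '%s\n' "$result"
```

Because the doubled semicolon matches `;+` as a single separator, no empty record appears between beta and gamma.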

Check your version

After you have installed GAWK, first determine where your local copy has been placed. Many systems use GAWK as their primary AWK, with /usr/bin/awk installed as a symbolic link to /usr/bin/gawk, so that awk is the name of the command for the GAWK interpreter. This tutorial assumes such an installation. On systems where another flavor of AWK is already installed or takes precedence, you might have to call GAWK as gawk.

You'll know that you have everything installed correctly if you type awk and get the GNU usage screen, as shown in Listing 1. Other flavors of AWK print a different usage message or nothing at all.


Listing 1. GAWK installed as awk
    $ awk
    Usage: gawk [POSIX or GNU style options] -f progfile [--] file ...
    Usage: gawk [POSIX or GNU style options] [--] 'program' file ...
    POSIX options:                  GNU long options:
        -f progfile                 --file=progfile
        -F fs                       --field-separator=fs
        -v var=val                  --assign=var=val
        -m[fr] val
        -W compat                   --compat
        -W copyleft                 --copyleft
        -W copyright                --copyright
        -W dump-variables[=file]    --dump-variables[=file]
        -W gen-po                   --gen-po
        -W help                     --help
        -W lint[=fatal]             --lint[=fatal]
        -W lint-old                 --lint-old
        -W non-decimal-data         --non-decimal-data
        -W profile[=file]           --profile[=file]
        -W posix                    --posix
        -W re-interval              --re-interval
        -W source=program-text      --source=program-text
        -W traditional              --traditional
        -W usage                    --usage
        -W version                  --version

    To report bugs, see node `Bugs' in `gawk.info', which is
    section `Reporting Problems and Bugs' in the printed version.

As the usage screen shows, GAWK accepts the GNU-standard --version option for displaying its version. The output, which includes a notice from the Free Software Foundation concerning the licensing of GAWK and its lack of warranty, should look like Listing 2.


Listing 2. Displaying the GAWK version
    $ gawk --version
    GNU Awk 3.1.5
    Copyright (C) 1989, 1991-2005 Free Software Foundation.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
    $
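
The checks in this section can be combined into a short script. This is only a sketch: it assumes a POSIX shell, and the variable names (awk_path, flavor) are made up for illustration. It reports which awk is found first on the search path, shows the link target if that path is a symbolic link, and looks for the "GNU Awk" banner that Listing 2 shows on the first line of the version output.

```shell
# Locate the awk found first on PATH (empty if none is installed).
awk_path=$(command -v awk || true)
printf 'awk resolves to: %s\n' "${awk_path:-not found}"

# If that path is a symbolic link, show where it points.
if [ -n "$awk_path" ] && [ -L "$awk_path" ]; then
    ls -l "$awk_path"
fi

# GAWK identifies itself as "GNU Awk" on the first line of --version;
# other implementations may print something else or reject the option.
if awk --version </dev/null 2>/dev/null | head -n 1 | grep -q 'GNU Awk'; then
    flavor=gawk
else
    flavor=other
fi
printf 'flavor: %s\n' "$flavor"
```

If the script reports a flavor of other, try calling the interpreter as gawk instead. Note that ls -l shows only the first hop of a symbolic link, so on systems that chain several links together the final target may differ from the one displayed.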

Now that you have a working GAWK installation and you know how to call it, you're ready to begin programming. The next section describes basic AWK programming concepts.
