v1.0.0 / 01 may 03 / greg goebel / public domain
* The UN*X operating system provides a flexible set of simple tools to allow you to perform a wide variety of system-management, text-processing, and general-purpose tasks. These simple tools can be used in very powerful ways by tying them together programmatically, using "shell scripts" or "shell programs".
The UN*X "shell" itself is a user-interface program that accepts commands from the user and executes them. It can also accept the same commands written as a list in a file, along with various other statements that the shell can interpret to provide input, output, decision-making, looping, variable storage, option specification, and so on. This file is a shell program.
Shell programming is, like any other programming language, useful for some things but not for others. Shell programs are excellent for system-management tasks but not for general-purpose programming of any sophistication. Though generally simple to write, they are also tricky to debug and slow in operation.
There are three main versions of the UN*X shell: the original "Bourne shell (sh)", the "C shell (csh)" that followed it, and the "Korn shell (ksh)" that is in predominant use.
This document focuses on the Bourne shell. The C shell is more powerful but has various limitations, and while the Korn shell is clean and more powerful than the other two shells, it is a superset of the Bourne shell: anything that runs on a Bourne shell runs on a Korn shell. Since the Bourne shell's capabilities are probably more than most people require, there's no reason to elaborate much beyond them in an introductory document, and the rest of the discussion will assume use of the Bourne shell unless otherwise stated.
* The first thing to do in understanding shell programs is to understand the elementary system commands that can be used in them. A list of fundamental UN*X system commands follows:

   ls         # Give a simple listing of files.
   ll         # Give a listing of files with file details.
   cp         # Copy files.
   mv         # Move or rename files.
   rm         # Remove files.
   rm -r      # Remove entire directory subtree.
   cd         # Change directories.
   pwd        # Print working directory.
   cat        # Lists a file or files sequentially.
   more       # Displays a file a screenful at a time.
   pg         # Variant on "more".
   mkdir      # Make a directory.
   rmdir      # Remove a directory.
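* The shell also allows files to be defined in terms of "wildcard characters" that define a range of files. The "*" wildcard character substitutes for any string of characters, so:

   rm *.txt

This removes the files in the current directory whose names end in ".txt". The "?" wildcard character substitutes for any single character, so:

   rm book?.txt

This removes files such as "book1.txt" and "bookA.txt". The two wildcards can be combined:

   rm *book?.txt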
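Since "rm" is unforgiving, it can be worth handing a wildcard pattern to "echo" first to see what it matches. The following is only a sketch: the scratch directory and the file names in it are invented for the illustration.

```shell
# Work in a scratch directory so nothing real is touched.
here=`pwd`
demo=${TMPDIR:-/tmp}/wilddemo.$$
mkdir "$demo"
cd "$demo"
touch book1.txt book2.txt book10.txt mybook3.txt notes.txt

# "echo" shows what each pattern expands to without removing anything.
star=`echo *.txt`          # every name ending in ".txt"
quest=`echo book?.txt`     # "book", exactly one character, ".txt"
both=`echo *book?.txt`     # anything, then "book", one character, ".txt"

echo "$star"
echo "$quest"
echo "$both"

# Clean up and return to where we started.
cd "$here"
rm -r "$demo"
```

Note that "book10.txt" is not matched by "book?.txt", since "?" stands for exactly one character.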
* Another shell capability is "input and output redirection". The shell accepts input by default from what is called "standard input", and generates output by default to what is called "standard output". These are normally defined as the keyboard and display, respectively, or what is referred to as the "console" in UN*X terms.
However, you can "redirect" standard input or output to a file or another program if needed. Consider the "sort" command. This command sorts a list of words into alphabetic order, so if you enter:

   sort
   PORKY
   ELMER
   FOGHORN
   DAFFY
   WILE
   BUGS
   <CTL-D>

the "sort" command gives back:

   BUGS
   DAFFY
   ELMER
   FOGHORN
   PORKY
   WILE
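To have "sort" take its input from a file instead of the keyboard, redirect standard input with "<":

   sort < names.txt

To store the result in a file, redirect standard output with ">":

   sort < names.txt > output.txt

To add the result to the end of an existing file instead of overwriting it, use ">>":

   sort < names.txt >> output.txt

To feed the output of one command to the input of another, use a "pipe", "|". Here "tee" displays the sorted list and also stores it in a file:

   sort < names.txt | tee output.txt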
By the way, "sort" has some handy additional options:

   sort -u    # Eliminate redundant lines in output.
   sort -r    # Sort in reverse order.
   sort -n    # Sort numbers.
   sort +1    # Skip first field in sorting.
ls xyzzy 2> /dev/null
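The redirection operators can be tried out safely in a scratch directory. This is a minimal sketch; the directory name and the contents of "names.txt" are assumptions made up for the example.

```shell
demo=${TMPDIR:-/tmp}/redirdemo.$$
mkdir "$demo"

# Create a small input file, one name per line.
printf 'PORKY\nELMER\nBUGS\n' > "$demo/names.txt"

# ">" creates the output file, or overwrites it if it exists.
sort < "$demo/names.txt" > "$demo/output.txt"

# ">>" appends, so the sorted list now appears twice.
sort < "$demo/names.txt" >> "$demo/output.txt"

# "2>" redirects standard error only, so the complaint about the
# missing file "xyzzy" is discarded instead of displayed.
ls "$demo/xyzzy" 2> /dev/null || echo "ls failed, but quietly"

lines=`wc -l < "$demo/output.txt"`
echo "output.txt has" $lines "lines"

rm -r "$demo"
```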
* The shell allows you to execute multiple commands sequentially on one line by chaining them with a ";": rm *.txt ; ls
sort < bigfile.txt > output.txt &
You instruct the shell that the file contains commands by marking it as "executable" with the "chmod" command. Each file under UN*X has a set of "permission" bits, listed by an "ll" as: rwxrwxrwx
You can use "chmod" to set these permissions by specifying them as an octal code. For example: chmod 644 myfile.txt
chmod +x mypgm
For example, suppose you want to be able to inspect the contents of a set of archive files stored in the directory "/users/group/archives". You could create a file named "ckarc" and store the following command string in it: ls /users/group/archives | pg
The following sections describe these features in a quick outline fashion. Please remember that you can't really make much effective use of most of the features until you've learned about all of them, so if you get confused just keep on going and then come back for a second pass.
* The first useful command to know about in building shell programs is "echo", which allows you to perform output from your shell program: echo "This is a test!"
The shell allows you to store values in variables. All you have to do to declare a variable is assign a value to it: shvar="This is a test!"
echo $shvar
ll $lastdir
shvar=""
allfiles=*
echo $allfiles
Another subtlety is in modifying the values of shell variables. Suppose you have a file name in a shell variable named "myfile" and want to rename that file to the same name, but with "2" tacked on to the end. You might think to try: mv $myfile $myfile2
mv $myfile ${myfile}2
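The difference between the two forms is easy to see without moving any files. In this sketch the file name "report" is just a placeholder:

```shell
myfile="report"

# Without braces the shell looks for a variable named "myfile2",
# which has never been set, so the result is an empty string.
wrong="$myfile2"

# Braces mark where the variable name ends.
right="${myfile}2"

echo "Without braces: '$wrong'"
echo "With braces:    '$right'"
```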
If you want to call other shell programs from a shell program and have them use the same shell variables as the calling program, you have to "export" them as follows:

   shvar="This is a test!"
   export shvar
   echo "Calling program two."
   shpgm2
   echo "Done!"
echo $shvar
The next step is to consider shell command substitution. Like any programming language, the shell does exactly what you tell it to do and so you have to be very specific when you tell it to do something.
As an example, consider the "fgrep" command, which searches a file for a string. Suppose you want to search a file named "source.txt" for the string "Coyote". You could do this with: fgrep Coyote source.txt
fgrep Wile E. Coyote source.txt
fgrep "Wile E. Coyote" source.txt
For example, if you executed: echo "$shvar"
echo '$shvar'
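The effect of each kind of quoting can be checked directly. The sketch below uses a small shell function, "count_args", as a helper; both the function and its name are assumptions introduced for the example, and shell functions are not otherwise covered in this document.

```shell
shvar="Wile E. Coyote"

double="$shvar"     # double quotes: the variable is expanded
single='$shvar'     # single quotes: the text is taken literally

echo "$double"
echo "$single"

# Hypothetical helper: report how many arguments it was given.
count_args() {
  echo $#
}

unquoted=`count_args $shvar`      # split at the spaces
quoted=`count_args "$shvar"`      # kept together as one argument
echo "Unquoted: $unquoted arguments; quoted: $quoted argument"
```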
* Having considered "double-quoting" and "single-quoting", let's now consider "back-quoting". This is a little tricky to explain. As a useful tool, consider the "expr" command, which allows you to do simple math from the command line: expr 2 + 4
expr 3 \* 7
echo $shcmd
echo "$shcmd"
echo '$shcmd'
echo `$shcmd`
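Put together, back-quoting might be used as in the following sketch, which captures the output of "expr" in shell variables. The variable names are arbitrary.

```shell
# Backquotes run the enclosed command and substitute its output.
sum=`expr 2 + 4`
product=`expr 3 \* 7`

# A command stored in a variable is only text until it is back-quoted.
shcmd="expr 2 + 4"
result=`$shcmd`

echo "sum=$sum product=$product result=$result"
```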
* In general, shell programs operate in a "batch" mode, that is, without interaction from the user, and so most of their parameters are obtained on the command line.
Each argument on the command line can be seen inside the shell program as a shell variable of the form "$1", "$2", "$3", and so on, with "$1" corresponding to the first argument, "$2" the second, "$3" the third, and so on.
There is also a "special" argument variable, "$0", that gives the name of the shell program itself. Other special variables include "$#", which gives the number of arguments supplied, and "$*", which gives a string with all the arguments supplied.
The argument variables are in the range "$1" to "$9", so what happens if you have more than 9 arguments? No problem, you can use the "shift" command to move the arguments down through the argument list. That is, when you execute "shift" then the second argument becomes "$1", the third argument becomes "$2", and so on, and if you do a "shift" again the third argument becomes "$1"; and so on. You can also add a count to cause a multiple shift: shift 3
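To experiment with argument variables without writing a separate program, the "set --" command can load them by hand. That command is not covered elsewhere in this document, and the argument values below are invented, so treat this as a sketch:

```shell
# "set --" loads the argument variables, standing in for a command line.
set -- alpha bravo charlie delta

count=$#          # number of arguments: 4
first=$1          # "alpha"

shift             # drop the first argument; "bravo" becomes $1
after_one=$1

shift 2           # drop two more; "delta" becomes $1
after_three=$1
left=$#           # one argument remains

echo "$count $first $after_one $after_three $left"
```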
* Shell programs can perform conditional tests on their arguments and variables and execute different commands based on the results. For example:

   if [ "$1" = "hyena" ]
   then
     echo "Sorry, hyenas not allowed."
     exit
   elif [ "$1" = "jackal" ]
   then
     echo "Jackals not welcome."
     exit
   else
     echo "Welcome to Bongo Congo."
   fi
   echo "Do you have anything to declare?"
There are a wide variety of such test conditions:

   [ "$shvar" = "fox" ]    String comparison, true if match.
   [ "$shvar" != "fox" ]   String comparison, true if no match.
   [ "$shvar" = "" ]       True if null variable.
   [ "$shvar" != "" ]      True if not null variable.

   [ "$nval" -eq 0 ]       Integer test; true if equal to 0.
   [ "$nval" -ge 0 ]       Integer test; true if greater than or equal to 0.
   [ "$nval" -gt 0 ]       Integer test; true if greater than 0.
   [ "$nval" -le 0 ]       Integer test; true if less than or equal to 0.
   [ "$nval" -lt 0 ]       Integer test; true if less than 0.
   [ "$nval" -ne 0 ]       Integer test; true if not equal to 0.

   [ -d tmp ]              True if "tmp" is a directory.
   [ -f tmp ]              True if "tmp" is an ordinary file.
   [ -r tmp ]              True if "tmp" can be read.
   [ -s tmp ]              True if "tmp" is nonzero length.
   [ -w tmp ]              True if "tmp" can be written.
   [ -x tmp ]              True if "tmp" is executable.
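The file tests combine naturally with "if" and "elif". The sketch below wraps them in a shell function named "describe"; the function, the scratch directory, and the file names are all assumptions made up for the illustration.

```shell
demo=${TMPDIR:-/tmp}/testdemo.$$
mkdir "$demo"
: > "$demo/empty.txt"              # zero-length file
echo "data" > "$demo/full.txt"     # file with some content

# Hypothetical helper: report what kind of thing a path is.
describe() {
  if [ -d "$1" ]
  then
    echo "directory"
  elif [ -s "$1" ]
  then
    echo "non-empty file"
  elif [ -f "$1" ]
  then
    echo "empty file"
  else
    echo "missing"
  fi
}

r1=`describe "$demo"`
r2=`describe "$demo/full.txt"`
r3=`describe "$demo/empty.txt"`
r4=`describe "$demo/nothing"`
echo "$r1 / $r2 / $r3 / $r4"

rm -r "$demo"
```

The order of the tests matters here: "-s" is also true for a directory, so the directory test has to come first.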
   case "$1"
   in
     "gorilla") echo "Sorry, gorillas not allowed."
                exit;;
     "hyena")   echo "Hyenas not welcome."
                exit;;
     *)         echo "Welcome to Bongo Congo.";;
   esac
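* The fundamental loop construct in the shell is based on the "for" command. For example:

   for nvar in 1 2 3 4 5
   do
     echo $nvar
   done

The list can also come from a wildcard. This lists the files in the current directory:

   for file in *
   do
     echo $file
   done

If the "in" list is left out, the loop cycles through the command-line arguments:

   for file
   do
     echo $file
   done

The "break" command exits a loop early:

   for file
   do
     if [ "$file" = punchout ]
     then
       break
     else
       echo $file
     fi
   done

The ":" command does nothing, which is handy when one branch of a test needs no action:

   then
     :
   else

There is also a "while" loop, which repeats as long as its test is true:

   n=10
   while [ "$n" -ne 0 ]
   do
     echo $n
     n=`expr $n - 1`
   done

The "until" loop is the converse, repeating as long as its test is false:

   n=10
   until [ "$n" -eq 0 ]
   do
   ...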
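As a further illustration, a "while" loop and an "until" loop can each keep a running result in a shell variable. This is a sketch only; the variable names are arbitrary.

```shell
# Add the integers from 1 to 5 with a "while" loop.
n=1
total=0
while [ "$n" -le 5 ]
do
  total=`expr $total + $n`
  n=`expr $n + 1`
done
echo "Total: $total"

# Build a string with an "until" loop, which runs while its test is false.
steps=""
k=1
until [ "$k" -gt 3 ]
do
  steps="$steps$k "
  k=`expr $k + 1`
done
echo "Count: ${steps}done"
```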
* There are other useful features available for writing shell programs. For example, you can comment your programs by preceding the comments with a "#":

   # This is an example shell program.
   cat /users/group/grouplog.txt | pg   # Read group log file.

This will prevent confusion if you find copies of the same file that don't have the same comments, or try to modify the program later. Shell programs can be obscure, even by the standards of programming languages, and it is useful to provide a few hints.
* You can read standard input into a shell program using the "read" command. For example:

   echo "What is your name?"
   read myname
   echo $myname

* If you have a command too long to fit on one line, you can use the line continuation character "\" to put it on more than one line:

   echo "This is a test of \
   the line continuation character."
. mycmds
* If you want to trace the execution of a shell program, you can use the "-x" option with the shell: sh -x mypgm *
* One last comment on shell programs before proceeding: What happens if you have a shell program that just performs, say: cd /users/coyote
The reason is that the shell creates a new shell, or "subshell", to run the shell program, and when the shell program is finished, the subshell vanishes -- along with any changes made in that subshell's environment. It is easier, at least in this simple case, to define a command alias in your UN*X "login" shell rather than struggle with the problem in shell programs.
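The vanishing "cd" can be demonstrated in a few lines. The sketch below uses parentheses, which run the enclosed commands in a subshell; that notation is not covered elsewhere in this document. It also shows the "." command as another way around the problem, using a throwaway command file whose name, "gohome", is made up for the example.

```shell
start=`pwd`

# The "cd" happens in a subshell and is forgotten when the subshell exits.
( cd / )
after_subshell=`pwd`

# The "." command runs a file's commands in the current shell instead,
# so this "cd" sticks.
gohome=${TMPDIR:-/tmp}/gohome.$$
echo "cd /" > "$gohome"
. "$gohome"
after_dot=`pwd`
rm "$gohome"

echo "Started in:      $start"
echo "After subshell:  $after_subshell"
echo "After '.':       $after_dot"

# Go back to where we started.
cd "$start"
```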
* Before we go on to practical shell programs, let's consider a few more useful tools.
The "paste" utility takes a list of text files and concatenates them on a line-by-line basis. For example: paste names.txt phone.txt > data.txt
* The "head" and "tail" utilities list the first 10 or last 10 lines in a file respectively. You can specify the number of lines to be listed if you like:

   head -5 source.txt    # List first 5 lines.
   tail -5 source.txt    # List last 5 lines.
   tail +5 source.txt    # List all lines from line 5 onward.
tr '[A-Z]' '[a-z]' < file1.txt > file2.txt
tr '[a-z]' '[A-Z]' < file1.txt > file2.txt
tr -d '*'
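Since "tr" reads standard input, it is easy to try on a line of text piped from "echo". The sample strings here are invented for the sketch:

```shell
upper=`echo "Wile E. Coyote" | tr '[a-z]' '[A-Z]'`   # to uppercase
lower=`echo "Wile E. Coyote" | tr '[A-Z]' '[a-z]'`   # to lowercase
clean=`echo "**ACME** rocket" | tr -d '*'`           # delete asterisks

echo "$upper"
echo "$lower"
echo "$clean"
```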
* The "uniq" utility removes duplicate consecutive lines from a file. It has the syntax: uniq source.txt output.txt
* The "wc (word count)" utility tallies up the characters, words, and lines of text in a text file. You can also invoke it with the following options:

   wc -c    # Character count only.
   wc -w    # Word count only.
   wc -l    # Line count only.
find / -name findtest.txt -print
There are a wide variety of selection criteria. If you just want to print out the directories in a search from your current directory, you can do so with: find . -type d -print
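One of the things that makes "find" extremely useful is that not only can you perform searches, you can perform an action when a search has a match, using the "-exec" option. For example, if you want to get the headers of all the files on a match into a single file, you could do so as: find . -name log.txt -exec head {} \; >> ./log
Here "{}" stands for the name of each matching file and "\;" marks the end of the command to be executed. The redirection applies to "find" as a whole and so goes after it.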
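A small directory tree makes it easier to see what "-exec" does. This is a sketch under assumptions: the tree, the file contents, and the output file name "summary" are all invented, and "head -n 1" is used to take just the first line of each match.

```shell
demo=${TMPDIR:-/tmp}/finddemo.$$
mkdir "$demo" "$demo/a" "$demo/b"
echo "first log"  > "$demo/a/log.txt"
echo "second log" > "$demo/b/log.txt"
echo "not a log"  > "$demo/b/other.txt"

# "{}" is replaced by each matching path; "\;" ends the -exec command.
find "$demo" -name log.txt -exec head -n 1 {} \; > "$demo/summary"

matches=`wc -l < "$demo/summary"`
dirs=`find "$demo" -type d -print | wc -l`
echo "Collected" $matches "header lines;" $dirs "directories searched"

rm -r "$demo"
```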
* An advanced set of tools allows you to perform searches on text strings in files and, in some cases, manipulate the strings found. These tools are known as "grep", "sed", and "awk" and are based on the concept of a "regular expression", which is a scheme by which specific text patterns can be specified by a set of special or "magic" characters.
The simplest regular expression is just the string you are searching for. For example: grep Taz *.txt
But using the magic characters provides much more flexibility. For example: grep ^Taz *.txt
grep Taz$ *.txt
Now suppose you want to be able to match both "Taz" and "taz". You can do that with: [Tt]az
group_[abcdef]
group_[a-f]
set[0123456789]
set[0-9]
unit_[^xyz]
Other magic characters provide a wildcard capability. The "." character can substitute for any single character, while the "*" substitutes for zero or more repetitions of the preceding regular expression. For example: _*$
test\.txt
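One way to get a feel for these patterns is to count how many lines of a sample file each one matches. The sketch below relies on two things not covered in this document, the "-c" option of "grep", which prints a count of matching lines, and a "here document" to create the sample file. The file contents are invented.

```shell
demo=${TMPDIR:-/tmp}/regexdemo.$$
mkdir "$demo"
cat > "$demo/toons.txt" <<'EOF'
Taz spins
the taz growls
call Taz
unit_a
unit_x
test.txt
testxtxt
EOF

starts=`grep -c '^Taz' "$demo/toons.txt"`          # "Taz" at start of line
ends=`grep -c 'Taz$' "$demo/toons.txt"`            # "Taz" at end of line
either=`grep -c '[Tt]az' "$demo/toons.txt"`        # "Taz" or "taz"
notxyz=`grep -c 'unit_[^xyz]' "$demo/toons.txt"`   # not unit_x, _y, _z
anychar=`grep -c 'test.txt' "$demo/toons.txt"`     # "." matches anything
realdot=`grep -c 'test\.txt' "$demo/toons.txt"`    # "\." matches a period

echo "$starts $ends $either $notxyz $anychar $realdot"
rm -r "$demo"
```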
* Now that we understand regular expressions, we can consider "grep", "sed", and "awk" in more detail.
The name "grep" comes from the "ed" editor command "g/re/p", for "global regular expression print", and as noted it searches a file for matches to a regular expression like "^Taz" or "_*$". It has a few useful options as well. For example: grep -v <regular_expression> <file_list>

   grep -n    # List line numbers of matches.
   grep -i    # Ignore case.
   grep -l    # Only list file names for a match.
* The name "sed" stands for "stream editor" and it provides, in general, a search-and-replace capability. Its syntax for this task is as follows: sed 's/<regular_expression>/<replacement_string>/[g]' source.txt
For example, to replace the string "flack" with "flak", you would use "sed" as follows: sed 's/flack/flak/g' source.txt > output.txt
sed '/bozo/d'
sed -f sedcmds.txt source.txt > output.txt
sed '/^Target/q' source.txt > output.txt
sed '/^Target/ r newtext.txt' source.txt > output.txt
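The main "sed" operations can be compared on a four-line sample file. This is a minimal sketch; the file contents are invented, and the line counts are there only to make the effect of each command visible.

```shell
demo=${TMPDIR:-/tmp}/seddemo.$$
mkdir "$demo"
printf 'flack jacket\nbozo the clown\nTarget acquired\nflack again\n' \
  > "$demo/source.txt"

# Replace every "flack" with "flak", then count the lines with "flak".
fixed=`sed 's/flack/flak/g' "$demo/source.txt" | grep -c flak`

# Delete lines containing "bozo"; three of the four lines remain.
kept=`sed '/bozo/d' "$demo/source.txt" | wc -l`

# Print up to and including the first line starting with "Target".
upto=`sed '/^Target/q' "$demo/source.txt" | wc -l`

echo "flak lines:" $fixed "kept:" $kept "up to Target:" $upto
rm -r "$demo"
```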
* Finally, "awk" is a full-blown text processing language that looks something like a mad cross between "grep" and "C". In operation, "awk" takes each line of input and performs text processing on it. It recognizes the current line as "$0", with each word in the line recognized as "$1", "$2", "$3", and so on.
This means that: awk '{ print $0,$0 }' source.txt
awk '/Taz/ { taz++ }; END { print taz }' source.txt
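A few more one-line "awk" programs give the flavor. The two-column sample file in this sketch is invented, and the "sum" variable is an arbitrary name:

```shell
demo=${TMPDIR:-/tmp}/awkdemo.$$
mkdir "$demo"
printf 'Taz 10\nBugs 20\nTaz 5\n' > "$demo/source.txt"

# Count the lines that mention "Taz".
count=`awk '/Taz/ { taz++ } END { print taz }' "$demo/source.txt"`

# Add up the second word ($2) of every line.
total=`awk '{ sum += $2 } END { print sum }' "$demo/source.txt"`

# Print the two words of each line in swapped order; keep the first line.
swapped=`awk '{ print $2, $1 }' "$demo/source.txt" | sed 1q`

echo "count=$count total=$total swapped=$swapped"
rm -r "$demo"
```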
You can do very simple or very complicated things with "awk" once you know how it works. Its syntax is much like that of "C", though it is much less finicky to deal with. Details of "awk" are discussed in another Vectorsite document.
* The most elementary use of shell programs is to reduce complicated command strings to simpler commands and to provide handy utilities.
For example, I can never remember the options for compiling an ANSI C program, so I store them in a script program named "compile": cc $1.c -Aa -o $1
date +"date: %A, %d %B %Y %H%M %Z"
date: Friday, 24 November 1995 1340 MST
   for file
   do
     mv $file `echo $file | tr "[A-Z]" "[a-z]"`
   done
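A somewhat more defensive variation on the lowercase-rename loop is sketched below. It is not the program above, just one possible elaboration: it quotes the names, changes only the file-name part of each path, and skips names that are already lowercase. The scratch directory and file names are invented.

```shell
demo=${TMPDIR:-/tmp}/lowerdemo.$$
mkdir "$demo"
touch "$demo/README.TXT" "$demo/Notes.Txt" "$demo/plain.txt"

for file in "$demo"/*
do
  dir=`dirname "$file"`
  old=`basename "$file"`
  new=`echo "$old" | tr '[A-Z]' '[a-z]'`
  # Skip names that are already lowercase, so "mv" is never asked
  # to move a file onto itself.
  if [ "$old" != "$new" ]
  then
    mv "$file" "$dir/$new"
  fi
done

result=`ls "$demo"`
echo $result
rm -r "$demo"
```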
* This final section provides a fast lookup reference for the materials in this document. It is a collection of thumbnail examples and rules that will be cryptic if you haven't read through the text.
* Useful commands:

   cat                   # Lists a file or files sequentially.
   cd                    # Change directories.
   chmod +x              # Set execute permissions.
   chmod 666             # Set universal read-write permissions.
   cp                    # Copy files.
   expr 2 + 2            # Add 2 + 2.
   fgrep                 # Search for string match.
   grep                  # Search for string pattern matches.
   grep -v               # Search for no match.
   grep -n               # List line numbers of matches.
   grep -i               # Ignore case.
   grep -l               # Only list file names for a match.
   head -5 source.txt    # List first 5 lines.
   ll                    # Give a listing of files with file details.
   ls                    # Give a simple listing of files.
   mkdir                 # Make a directory.
   more                  # Displays a file a screenful at a time.
   mv                    # Move or rename files.
   paste f1 f2           # Paste files by columns.
   pg                    # Variant on "more".
   pwd                   # Print working directory.
   rm                    # Remove files.
   rm -r                 # Remove entire directory subtree.
   rmdir                 # Remove a directory.
   sed 's/txt/TXT/g'     # Scan and replace text.
   sed '/txt/d'          # Scan and delete lines.
   sed '/txt/q'          # Scan and then quit.
   sort                  # Sort input.
   sort +1               # Skip first field in sorting.
   sort -n               # Sort numbers.
   sort -r               # Sort in reverse order.
   sort -u               # Eliminate redundant lines in output.
   tail -5 source.txt    # List last 5 lines.
   tail +5 source.txt    # List all lines from line 5 onward.
   tr '[A-Z]' '[a-z]'    # Translate to lowercase.
   tr '[a-z]' '[A-Z]'    # Translate to uppercase.
   tr -d '_'             # Delete underscores.
   uniq                  # Find unique lines.
   wc                    # Word count (characters, words, lines).
   wc -w                 # Word count only.
   wc -l                 # Line count.

   shvar="Test 1"        # Initialize a shell variable.
   echo $shvar           # Display a shell variable.
   export shvar          # Allow subshells to use shell variable.
   mv $f ${f}2           # Append "2" to file name in shell variable.
   $1, $2, $3, ...       # Command-line arguments.
   $0                    # Shell-program name.
   $#                    # Number of arguments.
   $*                    # Complete argument list.
   shift 2               # Shift argument variables by 2.
   read v                # Read input into variable "v".
   . mycmds              # Execute commands in file.

   if [ "$1" = "red" ]
   then
     echo "Illegal code."
     exit
   elif [ "$1" = "blue" ]
   then
     echo "Illegal code."
     exit
   else
     echo "Access granted."
   fi

   [ "$shvar" = "red" ]    String comparison, true if match.
   [ "$shvar" != "red" ]   String comparison, true if no match.
   [ "$shvar" = "" ]       True if null variable.
   [ "$shvar" != "" ]      True if not null variable.

   [ "$nval" -eq 0 ]       Integer test; true if equal to 0.
   [ "$nval" -ge 0 ]       Integer test; true if greater than or equal to 0.
   [ "$nval" -gt 0 ]       Integer test; true if greater than 0.
   [ "$nval" -le 0 ]       Integer test; true if less than or equal to 0.
   [ "$nval" -lt 0 ]       Integer test; true if less than 0.
   [ "$nval" -ne 0 ]       Integer test; true if not equal to 0.

   [ -d tmp ]              True if "tmp" is a directory.
   [ -f tmp ]              True if "tmp" is an ordinary file.
   [ -r tmp ]              True if "tmp" can be read.
   [ -s tmp ]              True if "tmp" is nonzero length.
   [ -w tmp ]              True if "tmp" can be written.
   [ -x tmp ]              True if "tmp" is executable.

   case "$1"
   in
     "red")  echo "Illegal code."
             exit;;
     "blue") echo "Illegal code."
             exit;;
     *)      echo "Access granted.";;
   esac

   for nvar in 1 2 3 4 5
   do
     echo $nvar
   done

   for file              # Cycle through command-line arguments.
   do
     echo $file
   done

   while [ "$n" != "Joe" ]     # Or:  until [ "$n" = "Joe" ]
   do
     echo "What's your name?"
     read n
     echo $n
   done
* This document was originally written during the 1990s, but I yanked it in 2001 as it didn't seem to be attracting much attention. In the spring of 2003 I retrieved it and put it back up as I realized it offered some value and it made no sense just to keep it archived, gathering dust.
Unfortunately, by that time I had lost track of its revision history. For want of anything better to do, I simply gave the resurrected document the initial revcode of "v1.0.0". I believe it is unlikely that any earlier versions of this document are available on the Internet, but since I had switched from a two-digit revcode format ("v1.0") to a three-digit format ("v1.0.0") in the interim, any earlier copies will have a two-digit revcode.
* Revision history: v1.0.0 / gvg / 01 may 03