Introduction to Bash Programming
Take time for this week’s reading; it’s short, important, and useful.
We are now familiar with the shell and a few commands.
In this lecture, we discuss shell programming using bash. The main goal is to write your own scripts. But what are scripts? And what are they useful for?
Goals
We learn the following today:
- Understanding shell script syntax and constructs
- Writing simple interactive scripts
- Writing and executing your first shell script
- Understanding more advanced constructs through examples
Class scripts
I recorded my Terminal windows today on Mac and flume.
Interactive mode and shell scripts
The shell can be used in two different ways:
- interactive mode, in which you type commands at the prompt and the shell executes each one as you enter it. We have been doing this already; and
- shell scripts, in which the shell reads a series of commands (or complex programs) from a text file.
The interactive mode is fine for entering a handful of commands but it becomes cumbersome for the user to keep re-entering these commands interactively. It is better to store the commands in a text file called a shell script, or script for short, and execute the script when needed. In this way, the script is preserved so you and other users can use it again.
In addition to calling Unix commands (e.g., grep, cd, rm), shell scripts can also invoke compiled programs (e.g., C programs) and other shell scripts.
Shell programming also includes control-flow commands to test conditions (if...then) or to do a task repeatedly (for...do).
These control structure commands found in many other languages (such as C, or other scripting languages like python) allow the programmer to quickly write fairly sophisticated shell programs to do a number of different tasks.
Like python, and unlike C or Java, shell scripts are not compiled; rather, they are interpreted and executed by the shell itself.
Shell scripts are used for many reasons - building and configuring systems or environments, prototyping code, or an array of repetitive tasks that programmers do. Shell programming is mainly built on the Unix shell commands and utilities; reuse of these existing programs enables programmers to simply build new programs to tackle fairly complex jobs.
Separating groups of commands using ‘;’
Let’s start to build up our knowledge of how scripts work by first looking at some basic operations of the shell. The Unix shell allows for the unconditional execution of commands and allows for related commands to be kept adjacent as a command sequence using the semicolon character as shown below:
[cs50@flume ~]$ echo Directory listing; date; ls
Directory listing
Fri Apr 1 08:58:08 EDT 2016
Archive/ primes private/ public_html/ resources/ students web@ ziplab1.sh*
[cs50@flume ~]$
Exit status - who cares?
When using the shell interactively it is often clear when we have made a mistake - the shell warns about incorrect syntax, and complains about invalid switches or missing files.
These warnings and complaints can come from the shell’s parser and from the program being run (for example, from ls).
Error messages provide visual clues that something is wrong, allowing us to adjust the command to get it right.
Commands also inform the shell explicitly whether the command has terminated successfully or unsuccessfully due to some error. Commands do this by returning an exit status, which is represented as an integer value made available to the shell and other commands, programs, and scripts.
The shell understands an exit status of 0 to indicate successful execution, and any other value (an integer from 1 to 255) to indicate failure of some sort.
The special shell variable $? holds the exit status of the most recent command; it is updated each time a command exits.
What do we mean by that?
[cs50@flume ~]$ echo April Fool
April Fool
[cs50@flume ~]$ echo $?
0
[cs50@flume ~]$ ls April Fool
ls: cannot access April: No such file or directory
ls: cannot access Fool: No such file or directory
[cs50@flume ~]$ echo $?
2
[cs50@flume ~]$
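Because $? is overwritten by every command, a script that wants to use an exit status later has to save it right away. Here is a minimal sketch of that idea; the variable names ok and failed, and the path /no/such/file (assumed not to exist), are mine and purely illustrative.

```shell
#!/bin/bash
# Save $? immediately: the very next command overwrites it.
true
ok=$?                                      # 'true' always exits with status 0
ls /no/such/file 2>/dev/null || failed=$?  # capture the non-zero status of ls
echo "true exited with $ok; ls exited with a non-zero status"
```

The exact non-zero value reported by ls varies from system to system, which is why the sketch only checks that it is not 0.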
Conditional sequences - basic constructs
Why do we need to use the exit status?
Often we want to execute a command based on the success or failure of an earlier command. For example, we may only wish to remove files if we are in the correct directory, or perhaps we want to be careful to only append info to a file if we know it already exists.
The shell provides both conjunction (and) and disjunction (or) based on previous commands. These are useful constructs for writing decision-making scripts. Take a look at the example below in which we make three directories, then try to remove the first:
[cs50@flume ~]$ mkdir labs && mkdir labs/lab1 labs/labs2
[cs50@flume ~]$ rmdir labs || echo whoops!
rmdir: failed to remove `labs': Directory not empty
whoops!
[cs50@flume ~]$
In the first example, && (written with no space between the two ampersands) specifies that the second command should be executed only if the first command succeeds (with an exit status of 0) - i.e., we only make the subdirectories if we can make the top directory.
In the second example, || (again with no space between the two bars) requests that the second command be executed only if the first command fails (with an exit status other than 0).
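The same two constructs can be tried safely in a throwaway directory. This is only a sketch: the variables scratch, made, and removed are illustrative names of mine, and mktemp -d is simply a convenient way to get an empty scratch directory.

```shell
#!/bin/bash
scratch=$(mktemp -d)      # throwaway directory, so nothing real is touched
made=no
removed=yes
# make the subdirectory only if the top directory could be made
mkdir "$scratch/labs" && mkdir "$scratch/labs/lab1" && made=yes
# rmdir refuses to remove a non-empty directory, so the || branch runs
rmdir "$scratch/labs" 2>/dev/null || removed=no
echo "made=$made removed=$removed"
rm -rf "$scratch"
```

Running it should print made=yes removed=no.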
Conditional execution using if, then, else
There are many situations when we need to execute commands based on the outcome of an earlier command.
if command0; then
command1
command2
fi
Here command1 and command2 (and any other commands that might be input) will be executed if and only if command0 returns a successful or true value (i.e., its exit status is 0).
The fact that 0 means true is confusing for many people! (In many high-level languages - like C - zero means false and non-zero means true; technology isn’t always consistent.) The reason Unix uses 0 for success is that there is only one 0, but there are many non-zero numbers; thus, 0 implies ‘all is well’ whereas non-zero implies ‘something went wrong’, and the specific non-zero value can convey information about what went wrong.
Similarly, we may have commands to execute if the conditional fails.
if command0; then
command1
command2
else
command3
command4
fi
Here command3 and command4 will be executed if and only if command0 fails.
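As a concrete (and hedged) illustration, any command can play the role of command0. In this sketch it is a small pipeline ending in grep -q, which prints nothing and simply exits 0 if the pattern is found; the variable names phrase, first, and second are my own.

```shell
#!/bin/bash
phrase="shell scripts are handy"
# grep -q exits 0 if the pattern is found, non-zero otherwise
if echo "$phrase" | grep -q shell; then
    first=found
else
    first=missing
fi
if echo "$phrase" | grep -q perl; then
    second=found
else
    second=missing
fi
echo "shell: $first, perl: $second"
```

This should print shell: found, perl: missing.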
First Interactive Shell Program
Entering interactive scripts - that is, a tiny sequence of commands, typed at the keyboard in an interactive shell - is an easy way to get the sense of a new scripting language or to try out a set of commands. During an interactive session the shell simply allows you to enter a ‘one-command’ interactive program at the command line and then executes it.
[cs50@flume ~]$ if cp students students.bak
> then
> echo $? copy succeeded!
> else
> echo $? copy failed!
> fi
0 copy succeeded!
[cs50@flume ~]$
The > character is the secondary prompt, issued by the shell to indicate that more input is expected.
The exit status of the cp command is used by the shell to decide whether to execute the then clause or the else clause.
Just for yucks, I had echo show us the exit status $?; the above example confirms that a 0 status means ‘true’ and triggers the then clause.
We can invert the conditional test by preceding it with !, as in many programming languages:
[cs50@flume ~]$ if ! cp students students.bak
> then
> echo copy failed!
> fi
[cs50@flume ~]$
Astute readers might note that I did not quote or escape the ! in the echo commands. I’ve noticed that the ! is not special if it comes last, which is handy for writing interjections!
The command0 can actually be a sequence or pipeline.
The exit status of the last command is used to determine the conditional outcome.
[cs50@flume ~]$ if mkdir backup && cp students backup/students
> then
> echo backup success
> else
> echo backup failed
> fi
backup success
[cs50@flume ~]$
In the above example, then was on the next line instead of at the end of the if line.
That’s a stylistic choice; if you want it on the if line you simply need to put a semicolon (;) after the if condition and before the word then, as seen in the earlier examples.
The test, aka [ ] command
The command0 providing the exit status need not be an external command.
We can test for several conditions using the built-in test or (interchangeably) the [ ] command.
We use both below but we recommend you use the [ ] version of the test command because (a) it is more readable and (b) it’s more commonly used.
Suppose I want to back up students only if it exists; the -f switch tests whether the following filename names an existing ordinary file.
[cs50@flume ~]$ if test -f students
> then
> mkdir backup && cp students backup/students || echo backup failed
> fi
[cs50@flume ~]$
Rewritten with [ ],
[cs50@flume ~]$ if [ -f students ]
> then
> mkdir backup && cp students backup/students || echo backup failed
> fi
[cs50@flume ~]$
More commonly, the if and then are written on the same line, using a semicolon:
[cs50@flume ~]$ if [ -f students ] ; then
> mkdir backup && cp students backup/students || echo backup failed
> fi
[cs50@flume ~]$
Note, it’s important that you leave spaces around the brackets or you will get syntax errors.
There are other options that can be used with the [ ] command.
Option  Meaning
-e      does the file exist?
-d      does the directory exist?
-f      does the file exist and is it an ordinary file (not a directory)?
-r      does the file exist and is it readable?
-s      does the file exist and have a size greater than 0 bytes?
-w      does the file exist and is it writeable?
-x      does the file exist and is it executable?
To learn even more about the test command, run man test.
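Here is a small sketch that exercises three of these options against files it creates itself. Everything named here (scratch, report, kind, and the file names) is made up for the illustration. It also uses elif, a shorthand for ‘else if’ that we have not discussed, and a for loop, which comes up shortly.

```shell
#!/bin/bash
scratch=$(mktemp -d)                   # throwaway directory
touch "$scratch/empty"                 # a zero-length file
echo hello > "$scratch/greeting"       # a non-empty file
report=""
for name in "$scratch" "$scratch/empty" "$scratch/greeting" "$scratch/missing"
do
    # test -d first: a directory also has a size greater than 0
    if [ -d "$name" ]; then
        kind=directory
    elif [ -s "$name" ]; then
        kind=nonempty-file
    elif [ -f "$name" ]; then
        kind=empty-file
    else
        kind=absent
    fi
    report="$report $kind"
done
echo "kinds:$report"
rm -rf "$scratch"
```

It should print kinds: directory empty-file nonempty-file absent.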
Loops for lists
Many commands accept a list of files on the command line and perform actions on each file in turn. However, what if we need to perform a sequence of commands on each file in the list of files? Some commands can only handle one file (or argument) per invocation so we need to invoke the command many times.
The shell supports a simple iteration over lists of values - typically over lists of filenames.
In the following example, we make a backup copy of each of our C files by appending the .bak extension.
(Again, this extension is just a naming convention - Unix doesn’t care, nor does the shell.)
[cs50@flume ~/example]$ ls
hash.c hash.c.date makefile output.data queue.c README sort.c
[cs50@flume ~/example]$ for i in *.c
> do
> echo back up $i
> cp $i $i.bak
> done
back up hash.c
back up queue.c
back up sort.c
[cs50@flume ~/example]$ ls
hash.c hash.c.date output.data queue.c.bak sort.c
hash.c.bak makefile queue.c README sort.c.bak
[cs50@flume ~/example]$
Notice that the variable i is instantiated, one at a time, with the value of each argument in the list provided after in, and that value is substituted wherever $i occurs.
We should be more defensive, though, in case one of the filenames has a space inside it:
[cs50@flume ~/example]$ for i in *.c
> do
> echo back up "$i"
> cp "$i" "$i.bak"
> done
back up hash.c
back up queue.c
back up sort.c
[cs50@flume ~/example]$
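To see why the quotes matter, here is a sketch that creates a file whose name contains a space (my prog.c, a made-up name) in a throwaway directory and backs it up with the quoted form of the loop. Without the quotes, cp would be handed my and prog.c as two separate arguments and would fail.

```shell
#!/bin/bash
scratch=$(mktemp -d)                                          # throwaway directory
echo 'int main(void) { return 0; }' > "$scratch/my prog.c"    # name contains a space
for i in "$scratch"/*.c
do
    cp "$i" "$i.bak"      # quoted: the whole name is passed as one argument
done
ls "$scratch"
```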
As expected we may place as many commands as we want inside the body of a loop. We can use any combination of other if/else tests and nested loops, just like in traditional languages such as C.
We are not limited to using names of files (as generated by filename expansion) in our list:
[cs50@flume ~]$ for house in Allen "East Wheelock" "North Park" School South West LLC
> do
> echo $house is the best house!
> done
Allen is the best house!
East Wheelock is the best house!
North Park is the best house!
School is the best house!
South is the best house!
West is the best house!
LLC is the best house!
[cs50@flume ~]$
We can use the contents of a file to provide the list used by for:
[cs50@flume ~]$ cat LFlist
John.P.Kotz.19@dartmouth.edu
joel.j.katticaran.ug@dartmouth.edu
Kaya.M.Thomas.17@dartmouth.edu
trevor.l.davis.18@dartmouth.edu
Thomas.D.Kim.19@dartmouth.edu
kyle.dotterrer.18@dartmouth.edu
[cs50@flume ~]$ for i in $(<LFlist) ; do echo hello "$i" ; done
hello John.P.Kotz.19@dartmouth.edu
hello joel.j.katticaran.ug@dartmouth.edu
hello Kaya.M.Thomas.17@dartmouth.edu
hello trevor.l.davis.18@dartmouth.edu
hello Thomas.D.Kim.19@dartmouth.edu
hello kyle.dotterrer.18@dartmouth.edu
[cs50@flume ~]$
Notice the special shell syntax $(<filename), which means to substitute the contents of filename.
Any spaces or newlines in the file will cause the shell to delineate words that become arguments to for.
The example also demonstrates how one can use semicolons to write a simple loop all on one line!
In fact, if you type a multi-line if or for statement, then execute it, and later use up-arrow (or ctrl-P) to have the shell retrieve your earlier command, you’ll see that it formats it this way.
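The way $(<filename) splits on spaces as well as newlines can surprise you when a line of the file contains a space. The sketch below, with made-up file contents and variable names, counts the words seen by for and, for contrast, the lines seen by a while read loop (a construct we meet later).

```shell
#!/bin/bash
list=$(mktemp)                              # throwaway file
printf 'East Wheelock\nSouth\n' > "$list"   # two lines, one with a space
words=0
for w in $(<"$list"); do                    # splits on spaces AND newlines
    words=$((words + 1))
done
lines=0
while read -r line; do                      # reads one whole line at a time
    lines=$((lines + 1))
done < "$list"
echo "$words words, $lines lines"
rm -f "$list"
```

It should report 3 words, 2 lines.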
We can even use the output of a command to provide the list used by for:
[cs50@flume ~]$ for i in $(sed 's/\..*/!/' LFlist | sort); do echo hello $i; done
hello John!
hello Kaya!
hello Thomas!
hello joel!
hello kyle!
hello trevor!
[cs50@flume ~]$
Indeed, in this case, we’ve used a pipeline of two commands to produce the list of arguments to for.
You may see old scripts (or old people!) using the old-fashioned syntax in which the command is surrounded by back-quotes, `command`, instead of $(command); the latter is arguably more readable and is easier to nest.
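As a quick sketch of nesting, the inner $(...) below runs first and its output becomes an argument to the outer command. The pathname /usr/local/bin/tool is hypothetical; dirname and basename only manipulate the text, so the file need not exist.

```shell
#!/bin/bash
# inner: dirname strips the last component -> /usr/local/bin
# outer: basename keeps only the last component of that -> bin
parent=$(basename "$(dirname /usr/local/bin/tool)")
echo "$parent"
```

This prints bin.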
First Shell Script
Up until now we have entered scripts interactively into the shell. It is a pain to have to keep re-entering scripts interactively. It is better to store the script commands in a text file and then execute the script when we need it. So how do we do that?
Simple! Write the commands in a file, and ask bash to read commands from the file instead of from the keyboard.
For example, we can put our simple backup script into a file called backup.sh:
[cs50@flume ~/example]$ cat > backup.sh
for i in *.c
do
echo back up "$i"
cp "$i" "$i.bak"
done
[cs50@flume ~/example]$ bash backup.sh
back up hash.c
back up queue.c
back up sort.c
[cs50@flume ~/example]$
Here I’ve typed it at the keyboard, but for more complex scripts, you would of course want to use a text editor.
Indeed, we can go further, and make the file into a command executable at the shell prompt; to do so, you should
- add the special string #!/bin/bash as the first line,
- make it executable (with chmod), and
- either
  - add it to a directory on your PATH, or
  - type its pathname at the command line.
So, for backup.sh, it looks like this:
[cs50@flume ~/example]$ emacs backup.sh
[cs50@flume ~/example]$ cat backup.sh
#!/bin/bash
for i in *.c
do
echo back up "$i"
cp "$i" "$i.bak"
done
[cs50@flume ~/example]$ chmod +x backup.sh
[cs50@flume ~/example]$ ls -l backup.sh
-rwxr-xr-x 1 cs50 cs50 72 Apr 3 15:28 backup.sh*
[cs50@flume ~/example]$ ./backup.sh
back up hash.c
back up queue.c
back up sort.c
[cs50@flume ~/example]$
There are a couple of things to note about this example.
First, there is the #!/bin/bash line.
What does this mean?
Typically, a # in a script denotes the start of a comment that runs until the end of the line.
Indeed, in this case, this line is treated as a comment by bash.
Unix, however, reads that line when you execute the file and uses it to determine which command should be fed this file; thus, in effect, Unix will execute /bin/bash ./backup.sh.
Then bash reads the file and interprets its commands.
The #!/bin/bash line must be the very first line of the file, starting in the first column, with nothing before the #!.
Second, there is chmod +x, which sets the ‘execute’ permission on the file.
(Notice the ‘x’ characters in the file permissions displayed by ls.) Unix will not execute files that do not have ‘execute’ permission, and the shell won’t even try.
Third, we used the pathname ./backup.sh when treating our script as a command, because . is not on our PATH.
If . were on our PATH, we could have typed just backup.sh.
It is very tempting to have . on your PATH, but it is a big security risk. If you cd to a directory with an executable file called, say, ls and you don’t notice, bad things might happen when you type the command ls. If . is on your PATH before /bin you will run the local command ./ls instead of the official /bin/ls … and the local ls may be malicious and do something bad!
Fourth, this script has no comments. We really should improve it; see backup.sh.
#!/bin/bash
#
# backup.sh - make a backup copy of all the .c files in current directory
#
# usage: backup.sh
# (no arguments)
#
# input: none
# output: a line of confirmation for each file backed up
#
# David Kotz, April 2016
for i in *.c
do
echo back up "$i"
cp "$i" "$i.bak"
done
exit 0
It is good practice to identify the program, how its command line should be used, and what it does (if anything) with stdin and stdout, and to list the author name(s) and date.
Notice the script returns the exit status 0, which can be viewed using the echo $? command, as discussed earlier.
The return status is typically not checked when scripts are run from the command line.
However, when a script is called by another script the return status is typically checked - so it is important to return a meaningful exit status.
We could continue to improve this script - for example, to catch errors from cp and do something intelligent - but let’s move on.
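For the curious, here is one possible sketch of such an improvement; it is not the class’s backup.sh. I wrap the loop in a hypothetical function backup_all (functions are covered later) that takes a directory as its argument, reports any cp failure on stderr, and returns a non-zero status if any copy failed. A script built this way would finish with exit $status rather than exit 0.

```shell
#!/bin/bash
# backup_all (hypothetical helper): back up every .c file in directory $1;
# return 0 only if every copy succeeded.
backup_all() {
    status=0
    for i in "$1"/*.c
    do
        if cp "$i" "$i.bak"
        then
            echo back up "$i"
        else
            echo 1>&2 "failed to back up $i"
            status=1
        fi
    done
    return $status
}

# try it in a throwaway directory holding two empty .c files
scratch=$(mktemp -d)
touch "$scratch/hash.c" "$scratch/queue.c"
backup_all "$scratch" && echo all files backed up
```

Note that if the directory holds no .c files at all, the pattern is left unexpanded, cp fails, and the function reports the failure.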
Another shell script
If we go to all that trouble to save a backup copy of our .c files, it might be nice to see, later, what changed since we last made a backup.
Let’s write a little script to compare the current versions with the backup copies.
See backup-diff.sh
#!/bin/bash
#
# backup-diff.sh - compare all the .c files in current directory with backup
#
# usage: backup-diff.sh
# (no arguments)
#
# input: none
# output: a line of information for each file, and diffs where they differ
# exit status: zero.
#
# David Kotz, March 2017
for i in *.c
do
if [ ! -r "$i.bak" ]
then
echo "$i" - no backup
else
if cmp --quiet "$i.bak" "$i"
then
echo "$i" unchanged
else
echo "$i" differences:
diff "$i.bak" "$i"
fi
fi
echo
done
exit 0
Classroom activity: modify this script to exit non-zero when missing backups or differences are found.
Variables and arrays
Variables are typically not declared before they are used in scripts.
[cs50@flume ~]$ a=5
[cs50@flume ~]$ message="good morning"
[cs50@flume ~]$ echo $a
5
[cs50@flume ~]$ echo $message
good morning
[cs50@flume ~]$
[cs50@flume ~]$ echo ${message}
good morning
Above we create two variables (a and message).
The last command shows the ${varname} syntax for variable substitution; this is the general form whereas $varname is a shorthand that works for simple cases; note that ${message} is identical to $message.
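One of the not-so-simple cases arises when the variable is immediately followed by text that could be part of a name. In this sketch (variable names are mine) the shell reads $file_old as a reference to a variable called file_old, which does not exist, so the braces are needed to mark where the name ends.

```shell
#!/bin/bash
file=report
unbraced="$file_old"      # looks up a (nonexistent) variable named file_old
braced="${file}_old"      # the value of file, followed by the text _old
echo "unbraced=[$unbraced] braced=[$braced]"
```

This prints unbraced=[] braced=[report_old].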
Repetition: the while Command
The ‘for-loop’ construct is good for looping through a series of strings but not that useful when you do not know how many times the loop needs to run.
The while...do command is perfect for this.
The contents of guessprime.sh use the ‘while-do’ construct. The script allows the user to guess a prime between 1 and 100.
#!/bin/bash
#
# File: guessprime.sh
#
# Description: The user tries to guess a prime between 1-100
# This is not a good program. There is no check on what the
# user enters; it may not be a prime, or might be outside the range.
# Heck - it might not even be a number and might be empty!
# Some defensive programming would check the input.
#
# Input: The user guesses a prime and enters it
#
# Output: Status on the guess
# Program defines a variable called prime and sets it to a value.
prime=31
echo -n "Enter a prime between 1-100: "
read guess
while [ $guess != $prime ]; do
echo "Wrong! try again"
echo -n "Enter a prime between 1-100: "
read guess
done
exit 0
This script uses user-defined variables prime and guess.
It introduces the read command, which pauses and waits for user input, placing that user input into the named variable.
The -n switch to echo suppresses the newline usually produced by echo.
Finally, note the semicolon after the while condition and before the do keyword.
As with the if command and its then branch, we could have put do on the next line if we prefer that style.
[cs50@flume ~/public_html/examples]$ ./guessprime.sh
Enter a prime between 1-100: 33
Wrong! try again
Enter a prime between 1-100: 2
Wrong! try again
Enter a prime between 1-100: 9
Wrong! try again
Enter a prime between 1-100: 31
[cs50@flume ~/public_html/examples]$
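A while loop can also be driven by read itself, since read exits non-zero when the input runs out. The sketch below is a non-interactive variation on the guessing game, not a replacement for guessprime.sh: the guesses come from a ‘here document’ (the lines between <<EOF and EOF) instead of the keyboard, and tries is a counter of my own invention, incremented with $(( )) arithmetic.

```shell
#!/bin/bash
prime=31
tries=0
# read one guess per line; stop early when the guess matches
while read guess
do
    tries=$((tries + 1))
    if [ "$guess" = "$prime" ]; then
        break
    fi
done <<EOF
33
2
31
EOF
echo "guessed $prime after $tries tries"
```

It should print guessed 31 after 3 tries.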
The shell’s variables
The shell maintains a number of important variables that are useful in writing scripts. We have come across some of them already.
Variable      Description
$USER         username of current user
$HOME         pathname for the home directory of current user
$PATH         a list of directories to search for commands
$#            number of parameters passed to the script
$0            name of the shell script
$1, $2, ...   arguments given to the script
$*            a list of all the parameters in a single variable
$@            a list of all the parameters in a single variable; always delimited
$$            process ID of the shell script when running
The variable $# tells you how many arguments were on the command line; if there were three arguments, for example, they would be available as $1, $2, and $3.
In the command line myscript.sh a b c, then, $#=3, $0=myscript.sh, $1=a, $2=b, and $3=c.
The two variables $* and $@ both provide the list of command-line arguments, but with subtle differences; try the following script, args.sh, to see the difference.
#!/bin/bash
echo $# arguments to $0
# loop through all the arguments, in four different ways
echo 'for arg in $*'
for arg in $*; do echo "$arg"; done
echo
echo 'for arg in "$*"'
for arg in "$*"; do echo "$arg"; done
echo
echo 'for arg in $@'
for arg in $@; do echo "$arg"; done
echo
echo 'for arg in "$@"'
for arg in "$@"; do echo "$arg"; done
exit 0
Let’s try it on a command with four arguments; the fourth argument has an embedded space.
[cs50@flume ~/public_html/examples]$ ./args.sh one two three "and more"
4 arguments to ./args.sh
for arg in $*
one
two
three
and
more
for arg in "$*"
one two three and more
for arg in $@
one
two
three
and
more
for arg in "$@"
one
two
three
and more
[cs50@flume ~/public_html/examples]$
Study the differences among the four cases.
You should use "$@" to process command-line arguments, nearly always, because it retains the structure of those arguments.
As a shorthand, for arg is equivalent to for arg in "$@".
My choice of the variable name arg is immaterial to the shell.
Printing error messages
You might need to inform the user of an error; in this example, the 2nd argument is supposed to be a directory and the script found that it is not:
echo 1>&2 Error: "$2" should be a directory
Here we see how to send the output of echo, which normally goes to stdout (1), to stderr (2) instead, using the confusing but useful redirect 1>&2, which means ‘make the stdout go to the same place as the stderr’.
Checking arguments
When writing scripts it is important to write defensive code that checks whether the input arguments are correct. Below, the program verifies that the command has exactly three arguments, using the ‘not equal to’ operator.
if [ $# -ne 3 ]; then
echo 1>&2 Usage: incorrect argument input
exit 1
fi
Notice also that the script then exits with a non-zero status.
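To experiment with this check without writing a whole script, the same test can be wrapped in a function, because inside a function $# counts the function’s own arguments. This is only a sketch: check_args is a name I made up, and it uses return where a script would use exit.

```shell
#!/bin/bash
# check_args (hypothetical): succeed only when given exactly three arguments
check_args() {
    if [ $# -ne 3 ]; then
        echo 1>&2 "usage: check_args a b c"    # complain on stderr
        return 1
    fi
    return 0
}

check_args one two three && echo "three arguments: ok"
check_args one two 2>/dev/null || echo "two arguments: rejected"
```

Because the usage message goes to stderr, it does not pollute whatever the command writes to stdout.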
Finally
From this week’s reading assignments:
- Comments should clarify the code, not obscure it.
- They should enlighten, not impress.
- If you used a special algorithm or text, mention it and give a reference!
- Don’t just add noise or chitchat.
- Say in comments what the code cannot.
Don’t forget there are some good bash references on the Resources page.
Other stuff
There’s never enough time to show you all the good stuff in class.
Simple debugging tips
When you run a script you can use printf or echo to print debugging information to the screen.
I found it helpful to define a function debugPrint so I can turn on and off all my debug statements in one place:
# print the arguments for debugging; comment-out 'echo' line to turn it off.
function debugPrint() {
# echo "$@"
return
}
...
debugPrint starting to process arguments...
for arg; do
debugPrint processing "$arg"
...
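A variation on the same idea, offered only as a sketch, is to switch the messages on and off with a variable instead of by editing the function. The variable name DEBUG and the ‘debug:’ prefix are my own choices; the messages go to stderr so they do not mix with the script’s real output.

```shell
#!/bin/bash
# print the arguments for debugging, but only when DEBUG is non-empty
debugPrint() {
    if [ -n "$DEBUG" ]; then
        echo 1>&2 "debug: $*"
    fi
    return 0
}

DEBUG=1
debugPrint starting to process arguments    # printed on stderr
DEBUG=
debugPrint this message is suppressed       # prints nothing
```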
If you get a syntax error, for example:
[cs50@flume ~]$ ./ziplab1.sh
making a tarball called cs50-lab1.tgz
./ziplab1.sh: line 18: syntax error near unexpected token `else'
./ziplab1.sh: line 18: `else'
[cs50@flume ~]$
The error is on or around line 18.
In emacs, edit the file ./ziplab1.sh again and then go to line 18 using the sequence of key strokes ESC g – that is, hit the ESC key and hit g.
(If you did not install the customized ~cs50/.emacs file in your own ~/.emacs, you may need to hit g twice.) Then, enter the line number 18 and you will be brought to that line.
Now fix the bug.
(In my particular example, the actual error was on line 13, not 18; on line 13 the if statement began, but I forgot the semicolon before then … the shell finally realized a problem when it reached the else command at line 18.
So you may need to work backwards through the code, looking carefully to find the syntax problem.)
Every time you launch emacs to edit a file, it saves a backup copy of that file.
For example, when you edit foo.sh and save it, emacs saves the pre-editing version in foo.sh~.
If you’re later wondering what changed,
[cs50@flume ~]$ diff foo.sh~ foo.sh
will print the differences between the two files.
Arrays
Like variables, arrays are typically not declared before they are used in scripts.
[cs50@flume ~]$ colors=(red orange yellow green blue indigo violet)
[cs50@flume ~]$ echo $colors
red
[cs50@flume ~]$ echo ${colors[1]}
orange
[cs50@flume ~]$ echo ${colors[6]}
violet
[cs50@flume ~]$ echo ${colors[7]}
[cs50@flume ~]$
Above we create one array (colors).
Notice that $colors implicitly substitutes the first element, with index 0 (computer scientists like counting from zero).
As with ordinary variables, ${varname} is the general form of variable substitution and $varname is a shorthand that works for simple cases; thus $colors is equivalent to ${colors[0]}.
When desiring to subscript an array variable, you must use the full syntax, as in ${colors[1]}.
Finally, note that ${colors[7]} is empty because it was not defined.
Even cooler, the array can be used in combination with file substitution $(<filename) and command substitution $(command):
[cs50@flume ~]$ cat LFlist
John.P.Kotz.19@dartmouth.edu
joel.j.katticaran.ug@dartmouth.edu
Kaya.M.Thomas.17@dartmouth.edu
trevor.l.davis.18@dartmouth.edu
Thomas.D.Kim.19@dartmouth.edu
kyle.dotterrer.18@dartmouth.edu
[cs50@flume ~]$ lfs=($(<LFlist))
[cs50@flume ~]$ echo ${lfs[3]}
trevor.l.davis.18@dartmouth.edu
[cs50@flume ~]$
[cs50@flume ~]$ juniors=($(grep .18. LFlist))
[cs50@flume ~]$ echo ${juniors[1]}
kyle.dotterrer.18@dartmouth.edu
[cs50@flume ~]$ echo ${lfs[*]}
John.P.Kotz.19@dartmouth.edu joel.j.katticaran.ug@dartmouth.edu Kaya.M.Thomas.17@dartmouth.edu trevor.l.davis.18@dartmouth.edu Thomas.D.Kim.19@dartmouth.edu kyle.dotterrer.18@dartmouth.edu
[cs50@flume ~]$
The last command demonstrates how you can substitute all values of the array, with the [*] index.
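Two more array idioms are worth a quick sketch, though we did not cover them in class: ${#colors[@]} gives the number of elements, and "${colors[@]}" (quoted, with @) expands to one word per element, much like "$@" does for command-line arguments. The variable names count, last, and c are mine.

```shell
#!/bin/bash
colors=(red orange yellow green blue indigo violet)
count=${#colors[@]}                # number of elements in the array
last=${colors[count-1]}            # the subscript is evaluated as arithmetic
echo "there are $count colors; the last is $last"
for c in "${colors[@]}"            # one word per element
do
    echo "$c"
done
```

The first line of output should be there are 7 colors; the last is violet.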
let me do arithmetic!
The let command carries out arithmetic operations on variables.
$ let a=1
$ let b=2
$ let c = a + b
-bash: let: =: syntax error: operand expected (error token is "=")
# ... note, the let command is sensitive to spaces.
$ let c=a+b
$ echo $c
3
$ echo "a+b=$c"
a+b=3
$ echo "$a+$b=$c"
1+2=3
$ let a*=10 # equivalent to let a=a*10
$ echo $a
10
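You will also see scripts do arithmetic with the $(( )) syntax, called arithmetic expansion, which is not sensitive to spaces the way let is. A minimal sketch repeating the calculations above:

```shell
#!/bin/bash
a=1
b=2
c=$((a + b))        # spaces are fine inside $(( ))
a=$((a * 10))       # same effect as: let a*=10
echo "c=$c a=$a"
```

This prints c=3 a=10.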
Temporary files
If your script needs to create some temporary files to do its work, it is good practice to create those files in a place other than the current directory, and with a filename that is unlikely to be used by another script - even another concurrently running copy of your script.
The directory /tmp is writable by everyone - so it’s not a great place to put important files - and is the conventional place to put temporary files.
To avoid picking a name that others might pick, scripts include $$, their process identifier, as part of the filename.
For example, a script print might do the following:
#!/bin/bash
# build up an output file, then print it
# name of temporary file includes our process id $$
tmpfile=/tmp/print$$
echo > $tmpfile
for arg
do
# print a nice header then the file
echo "======================" >> $tmpfile
echo "$arg" >> $tmpfile
cat "$arg" >> $tmpfile
echo >> $tmpfile
done
lpr $tmpfile # print the result
rm -f $tmpfile # clean up after ourselves
exit 0
We use a variable tmpfile for clarity and consistency throughout the script.
Catching interrupts, cleaning up
Many scripts create intermediate or temporary files, and might leave a mess if interrupted part-way through their operation.
The trap command can catch such interrupts, such as those caused by the user typing ctrl-C at the keyboard while the script works.
It is good form to catch this interrupt and clean up before exiting.
We would extend the above example as follows:
# name of temporary file includes our process id $$
tmpfile=/tmp/print$$
trap "rm -f $tmpfile" EXIT
This trap command gives the shell a command to run whenever the script exits, for any reason (whether due to an exit command or due to an interrupt that kills the process).
Very handy!
Notice that I define the trap immediately after defining the variable name, so that it will be in effect whenever the temporary file is later created.
The -f flag (‘force’) to rm causes it to override some kinds of errors, notably, to not complain if the $tmpfile does not yet exist.
Sometimes you need a whole directory for your temporary use:
tmpdir=/tmp/print$$
trap "rm -rf $tmpdir" EXIT
mkdir -p $tmpdir
cd $tmpdir
Here I used mkdir -p to make the directory, and rm -rf to recursively remove it.
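You can watch an EXIT trap fire without interrupting anything. In this sketch, the parenthesized commands run in a subshell that stands in for ‘the script’: it sets the trap, creates the temporary file, and exits with a non-zero status. The file name /tmp/trapdemo$$ and the variables status and leftover are made up for the demonstration.

```shell
#!/bin/bash
tmpfile=/tmp/trapdemo$$
status=0
(
    trap "rm -f $tmpfile" EXIT         # runs when this subshell exits
    echo "scratch data" > "$tmpfile"
    exit 3                             # leave early, as if something failed
) || status=$?
# back in the main shell: did the trap clean up?
if [ -e "$tmpfile" ]; then
    leftover=yes
else
    leftover=no
fi
echo "subshell exited with $status; leftover=$leftover"
```

It should print subshell exited with 3; leftover=no.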
Functions
Like most procedural languages, shell scripts have structure and function support. Typically, it is a good idea to use functions to make scripts more readable and structured. In what follows, we simply add a function to guessprime to create guessprimefunction.sh:
#!/bin/bash
#
# File: guessprimefunction.sh (variant of guessprime.sh)
#
# Description: The user tries to guess a prime between 1-100
# This is not a good program. There is no check on what the
# user enters; it may not be a prime, or might be outside the range.
# Heck - it might not even be a number and might be empty!
# Some defensive programming would check the input.
#
# Input: The user guesses a prime and enters it
#
# Output: Status on the guess
# Ask the user to guess, and fill global variable $guess with result.
# usage: askguess low high
# where [low, high] is the range of numbers in which they should guess.
function askguess() {
echo -n "Enter a prime between $1-$2: "
read guess
}
# Program defines a variable called prime and sets it to a value.
prime=31
# ask them once
askguess 1 100
while [ $guess != $prime ]; do
# ask again
askguess 1 100
done
exit 0
Notice that defining a function effectively adds a new command to the shell, in this case, askguess.
And that command can have arguments!
And those arguments are available within the function as if they were command-line arguments $1, $2, and so forth.
All other variables are treated as ‘global’ variables, like guess in this example.
Try this script; it’s very fragile. See what happens when you enter nothing - just hit return at the prompt for a guess. Why does that happen?
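Here is a separate, tiny sketch of those two points; count_args and last_count are hypothetical names. Inside the function, $# and $1 refer to the function’s own arguments, while the assignment to last_count sets a global variable that is still visible after the function returns.

```shell
#!/bin/bash
count_args() {
    echo "$# arguments, first is '$1'"
    last_count=$#        # a global variable, visible after the call
}

count_args alpha "beta gamma"
echo "last_count=$last_count"
```

The output should be 2 arguments, first is 'alpha' followed by last_count=2.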
Another example: submitx
To submit your Lab solutions you use a command ~cs50/labs/submit, which actually just runs a bash script ~cs50/labs/submitx.
You can learn many things from this example; note the frequent checks for possible problems, the carefully quoted variable instantiations, the command chains with && to ensure that a command sequence stops at the first error and only reaches exit 0 if they all succeed without error, and the use of \ to break long lines into readable sequences.
#!/bin/bash
#
# Submit a homework assignment in CS50.
# The assignment must be in ~/cs50/labs/labN, where N is [1..9].
# The entire directory is copied to ~cs50/labs/submissions/labN/username,
# where username is the $USER of the user that runs this script.
#
# usage: submit N [extension]
# where N is [1..9]
# where the optional second word is literally "extension" and is used
# to indicate that the student wants to claim an extension on this assignment.
# (In that case, all previously submitted files are deleted.)
usage="usage: $0 N [extension] -- where N is [1..9]"
# Check arguments
if [[ $# -eq 0 || $# -gt 2 ]]
then
echo "$usage"
exit 1
fi
if [[ $# -eq 2 ]]
then if [[ "$2" == "extension" ]]
then
extension=1
else
echo "$usage"
exit 1
fi
fi
let "N=$1"
if [[ $N -lt 1 || $N -gt 9 ]]
then
echo "$usage"
exit 2
fi
lab=lab$N
# destination of their files
dest=~cs50/labs/submissions/$lab/$USER
if [ $extension ]
then
echo Requesting extension for $lab.
else
echo Submitting $lab.
# check their cs50 directory
if [[ ! -d ~/cs50 ]]
then
echo 'oops! you are missing a ~/cs50 directory.'
exit 3
fi
echo Ensuring that your CS50 directory is not visible by any other user...
if chmod go-rwx ~/cs50
then
echo good.
else
echo 'Failed: unable to set permissions on your ~/cs50 directory.'
echo They are:
ls -ld ~/cs50
exit 3
fi
# Prepare to copy from 'source' to 'dest'
source=~/cs50/labs/$lab
echo Checking source directory "$source"...
if [[ ! -d "$source" ]]
then
echo "$source does not exist or is not a directory;"
echo "did you put your lab in the right place?"
exit 4
fi
if [[ ! -x "$source" || ! -r "$source" ]]
then
echo "$source is not searchable or not readable:"
ls -ld "$source"
exit 5
else
echo good.
fi
fi
echo Checking destination directory "$dest"...
if mkdir -p "$dest" && chmod o-rwx "$dest"
then
echo good.
else
echo cannot make directory "$dest"
exit 6
fi
if [ $extension ]
then
echo Removing previously submitted files, if any...
rm -rf "$dest/"*
echo Marking your submission as an extension...
date > $dest/EXTENSION \
&& chgrp -R cs50 "$dest" \
&& chmod -R g=u "$dest" \
&& chmod g+rwx "$dest" \
&& chmod o-rwx "$dest" \
&& echo success! $lab extension requested. \
&& cat $dest/EXTENSION \
&& exit 0
else
echo Copying new or changed files...
rsync -aHv --delete "$source/" "$dest/" \
&& chgrp -R cs50 "$dest" \
&& chmod -R g=u "$dest" \
&& chmod g+rwx "$dest" \
&& chmod o-rwx "$dest" \
&& echo success! $lab submitted \
&& date \
&& exit 0
fi
echo Failed!
exit 99