Shell scripts are very useful things, whether they’re prepared and saved to a file for regular execution (for example, with a scheduler like cron), or just entered straight into the command line. They can perform tasks in seconds that may take days of repetitive work, like, say, resizing or touching up images, replacing text in a large number of HTML documents, converting items from one format to another, or gathering statistics.
This entry is a brief introduction to scripting using the Bash shell, which I find to be the most intuitive, and probably the most common (that said, most of the information here will apply to other shells). We will explore some of the basic building blocks of scripting, such as while and for loops, if statements, and a few common techniques for accomplishing tasks. Along the way, we’ll also take a look at some of the tools that make shell scripting a bit more useful, such as test, and bc. Finally, we’ll put it together with a couple of examples.
Stay tuned over the next few days for a brief tutorial on some very useful unix tools, like awk, sed, sort and uniq, which may become indispensable to you (they have for me).
But first, let's begin with some basics.
Scripting 101
Bash (and its siblings) is a lot more than just a launcher; it is more like a programming language interface, which allows users to enter very complicated commands to achieve a wide variety of tasks. The language used is much like any other programming language: it contains for and while loops, if statements, functions and variables.
Commands can be either entered straight into the command line, or saved to files for execution.
Script files normally begin with a line that's commonly known as the shebang (or hash-bang, referring to the first two characters):
#!/bin/sh
This is a directive that's read by the operating system when a script file is executed; it says which program should interpret the commands within. In this case, the /bin/sh application will be used. This is the most common shebang; it can be replaced with #!/usr/bin/perl for perl scripts, or #!/usr/bin/php for php scripts, too. On many systems /bin/sh is not Bash itself, so a script that relies on Bash-only features should start with #!/bin/bash instead.
After the shebang comes the script itself: a series of commands, which will be executed by the interpreter. Comments can be added to make the script more readable; these are prefaced by a hash symbol:
# This is a comment
When creating a new script file, I find it easiest to set it as executable, so it can be run by just entering its path (./script.sh, or the bare name if its directory is in your PATH). Otherwise, the script has to be passed as a parameter to the interpreter (such as 'sh script.sh'). This is annoying, so make the file executable with:
chmod +x script.sh
Now the boring stuff's covered, let's move on to the basic code structures!
Holding and manipulating values
Variables are used to hold values for use later, and are accessed by a dollar symbol, followed by the variable name. When defining the values of variables, the dollar sign is not used at all. For example:
count=2;
echo $count;
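Note particularly the absence of spaces around the equals sign in the first line above. This is required: with spaces in (like 'count = 2'), the shell tries to run a command called 'count' with '=' and '2' as its arguments, and you will most likely get a 'command not found' error.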
Numeric variables can have arithmetic operations performed on them using the $((…)) syntax. This allows for simple integer addition, subtraction, division and multiplication. Operations can be combined, and brackets can be used to form complex expressions. For example:
count=$(($count+1));
product=$((count*8));
complex=$(((product+2)*($count-4)));
Note the first line of the previous example – the simple increment. This is quite useful for performing loops with a counter (we’ll have a look at loops soon).
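As a quick sketch of the arithmetic syntax above, here is a snippet that should run in any POSIX shell. The variable names and values are made up purely for illustration.

```shell
#!/bin/sh
# Illustrative values only; none of these names come from a real script.
width=7
height=3

area=$((width * height))      # multiplication
half=$((width / 2))           # integer division drops the fractional part
remainder=$((width % 2))      # modulo gives what the division dropped
width=$((width + 1))          # the simple increment

echo "area=$area half=$half remainder=$remainder width=$width"
```

Note that 7/2 comes out as 3, not 3.5: shell arithmetic works on integers only.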
For performing more complicated arithmetic, the ‘bc‘ tool is quite handy. bc is an arbitrary precision calculator language interpreter, and provides basically any mathematical function that could possibly be required.
To use bc, simply ‘pipe’ commands into it, and grab the result on bc’s stdout:
$ echo '8/3' | bc -l
2.66666666666666666666
$ echo 'a(1)*4' | bc -l
3.14159265358979323844
$ pi=`echo 'a(1)*4' | bc -l`
$ echo $pi
3.14159265358979323844
Note the '-l' parameter to bc: this loads the standard math library, which contains some useful functions (like arctan, or 'a', used above). It also sets bc's scale (the number of digits kept after the decimal point) to 20; without it the scale is 0, so a division like 8/3 is truncated to 2.
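If you only want a couple of decimal places, setting bc's scale variable yourself is an alternative to -l. A small sketch, assuming bc is installed; the variable names are just for illustration:

```shell
#!/bin/sh
# scale sets how many digits bc keeps after the decimal point.
ratio=$(echo 'scale=2; 8/3' | bc)

# With neither -l nor an explicit scale, division is truncated to an integer.
whole=$(echo '8/3' | bc)

echo "ratio=$ratio whole=$whole"
```

Be aware that bc truncates rather than rounds, so the first result is 2.66, not 2.67.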
Command-line parameters
Often you will want your shell scripts to take parameters, which modify the behaviour of the script. They can specify a file on which to operate, or a number of times to iterate over a loop, for example. This essentially just passes in a variable into the script, which can then be used.
Command line arguments appear as numbered variables. $0 denotes the command that was run (your script’s name, typically). After that, the arguments to the command are given, as $1, $2, $3, onwards.
For example, the script:
#!/bin/sh
echo $0 utility.
echo Arguments are:
echo $1 – first argument
echo $2 – second argument
echo $3 – third argument
Can be executed with:
$ ./test.sh a b c
./test.sh utility.
Arguments are:
a – first argument
b – second argument
c – third argument
Arguments can also be referred to en masse with the $* special variable, which returns a string containing all arguments.
See ‘Iterating over command-line arguments’ for notes on how to use this.
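Two more special variables are handy alongside the numbered ones: $# holds the number of arguments. The sketch below uses 'set --' to fake a command line so it can be pasted straight into a shell; in a real script the values would come from whoever ran it.

```shell
#!/bin/sh
# 'set --' replaces the positional parameters, standing in for real
# command-line arguments so the sketch can be run as-is.
set -- alpha "two words" gamma

arg_count=$#      # number of arguments
first=$1
second=$2         # quoting on the command line keeps this as one argument

echo "Got $arg_count arguments; the first is '$first', the second is '$second'."
echo "All of them: $*"
```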
Making decisions
Making decisions in a script is a very useful thing to be able to do – it can allow you to take actions depending on whether a command succeeded or failed, or it can allow you to perform an action only if it’s applicable (like only backing up to an external drive if it’s plugged in!).
If statements are formatted thus:
if test; then
    do_something;
elif test2; then
    do_something_else;
else
    do_something_completely_different;
fi;
The 'elif' and 'else' branches are optional, and can be omitted. The elif branch can also be duplicated: like any if statement in any other language, you can have as many elifs as you like.
Note that this can also go on one line. For example: if test; then do_something; else do_something_else; fi
The statement above performs test; if test succeeded, then do_something will be executed. Otherwise, test2 is performed. If that succeeds, do_something_else is executed. Otherwise, do_something_completely_different is executed.
The test in an if statement is a command that is executed; the value returned from the command is used to make the decision.
All command-line applications return a numerical value (an integer from 0 to 255, known as the exit status), which is usually utilised to indicate the status of the command upon exiting. A value of zero is usually used to indicate success. A non-zero value usually indicates failure.
You can observe the value returned by a command by using the $? variable immediately after the command exits:
$ ping nosuchhost
ping: cannot resolve nosuchhost: Unknown host
$ echo $?
68
$ ping -c 1 google.com
PING google.com (72.14.207.99): 56 data bytes
64 bytes from 72.14.207.99: icmp_seq=0 ttl=233 time=258.205 ms
--- google.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 258.205/258.205/258.205/0.000 ms
$ echo $?
0
The if construct tests whether the returned value is zero or nonzero. If it’s zero, the test passes. So, we could write:
if ping -c 1 google.com; then
    echo 'Google is alive. All is well.';
else
    echo 'Google down! The world is probably about to end.';
fi;
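The same pattern works with any command, not just ping. Here is a sketch that needs no network, using grep -q, which prints nothing and reports only through its exit status (0 for a match, 1 for none). The sample text and variable names are made up.

```shell
#!/bin/sh
# grep -q stays silent; the exit status alone says whether the
# pattern was found.
echo 'alpha beta gamma' | grep -q 'beta'
found_status=$?       # 0: a match was found

echo 'alpha beta gamma' | grep -q 'delta'
missing_status=$?     # 1: no match

# The if statement acts on the exit status directly.
if echo 'alpha beta gamma' | grep -q 'gamma'; then
    verdict='present'
else
    verdict='absent'
fi

echo "found=$found_status missing=$missing_status verdict=$verdict"
```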
Tests can also be performed in-line, allowing commands to be strung together. The && joiner, placed between commands, tells the shell to execute the right-hand command only if the left-hand command succeeds:
ifconfig ppp0 && echo 'PPP Connection Alive.'
The || joiner performs similarly, but will only execute the right-hand command if the left-hand command fails:
ifconfig ppp0 || redial_connection
Commands can be grouped in these structures, and strung together – for example, a series of commands that must be executed in sequence, and only if all preceding commands succeed too. Commands can be grouped in brackets, to form fairly complex statements:
$ true && (echo 1 && echo 2 && (true || echo 3) && echo 4) || echo 5
1
2
4
$ false && (echo 1 && echo 2 && (true || echo 3) && echo 4) || echo 5
5
$ true && (echo 1 && echo 2 && (false || echo 3) && echo 4) || echo 5
1
2
3
4
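One trap worth knowing about: 'a && b || c' looks like an if/else, but it is not quite one. The right-hand command also runs when the middle command fails, as this small sketch shows (true and false are standard commands that simply succeed and fail):

```shell
#!/bin/sh
# The first command succeeds, the second fails, so the fallback after
# '||' runs even though the first command was fine.
result=$(true && false || echo 'fallback ran')
echo "$result"
```

When the middle command might fail and that matters, a full if statement is the safer choice.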
Testing, testing
Now we’ve seen how to act upon the results of a test, it’s a good time to introduce the test utility itself.
test is an application that is used to perform a wide variety of tests on strings, numbers and files. Expressions can be combined to construct fairly complex tests. By way of example, let's look at a few uses of test:
$ test 'Hello' = 'Hello' && echo 'Yes' || echo 'No'
Yes
$ test 'Hello' = 'Goodbye' && echo 'Yes' || echo 'No'
No
$ test 2 -eq 2 && echo 'Yes' || echo 'No'
Yes
$ test 2 -lt 20 && echo 'Yes' || echo 'No'
Yes
$ test 2 -gt 20 && echo 'Yes' || echo 'No'
No
$ test -e /etc/passwd && echo 'Yes' || echo 'No'
Yes
Yes
See the test manual page for more information.
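To give a feel for the file and string tests, here is a sketch that builds a throwaway directory with mktemp and pokes at it. The file names are invented, and mktemp -d is assumed to be available (it is on most Linux and BSD systems).

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
touch "$workdir/empty.txt"
echo 'some text' > "$workdir/notes.txt"

test -d "$workdir" && is_dir=yes || is_dir=no                # -d: is a directory
test -f "$workdir/notes.txt" && is_file=yes || is_file=no    # -f: is a regular file
test -s "$workdir/empty.txt" && has_data=yes || has_data=no  # -s: exists and is not empty
test -z "" && blank=yes || blank=no                          # -z: string has zero length

echo "is_dir=$is_dir is_file=$is_file has_data=$has_data blank=$blank"
rm -r "$workdir"
```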
To perform arithmetic tests on floating point values, the bc tool steps in again (as ‘test’ will only operate on integers):
$ test `echo '3.4 > 3.1' | bc` -eq 1 && echo Yes || echo No
Yes
$ test `echo '3.4 > 3.6' | bc` -eq 1 && echo Yes || echo No
No
Note particularly the single quotes around the ‘>’ expression: without this, the meaning of the expression changes (the value ‘3.4’ will be redirected into the file ‘3.1’ or ‘3.6’).
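If a bc expression evaluates to true, bc prints '1'. Otherwise, bc prints '0'.
For code readability, the test utility is also available under the name '['; when called this way, it expects a closing ']' as its last argument. The spaces around both brackets are required, because '[' is a command name and everything after it is an argument. Thus, test can be used in commands like: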
if [ "$count" -gt 4 ]; then
    take_action;
fi;
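It is worth getting into the habit of putting double quotes around variables inside [ ]. A sketch of why, using a made-up variable that happens to be empty:

```shell
#!/bin/sh
name=''     # imagine this came from user input and turned out empty

# Unquoted, the test would collapse to '[ = admin ]', which is an error.
# Quoted, it is a plain comparison against an empty string.
if [ "$name" = 'admin' ]; then
    role='administrator'
else
    role='ordinary user'
fi

echo "$role"
```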
Gone loopy
Iterating over commands can be great for performing tasks on a large number of items. There are two loop types defined, while and for loops.
While loops
While loops have the following structure:
while test; do
    command_1;
    command_2;
done;
Here, test is the same as that from if statements (see above). Note the placement of semicolons – after the test, and before the do, in particular.
Note that, like all script elements, while loops can be used on one line, for quick entry on the command line: while test; do command_1; command_2; done;
While loops, like their counterparts in other programming languages, will continue executing until test evaluates to false.
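As a concrete sketch, the loop below adds up the numbers 1 to 5 using a counter. The variable names are only for illustration.

```shell
#!/bin/sh
count=1
total=0

# Keep going while count is less than or equal to 5.
while [ "$count" -le 5 ]; do
    total=$((total + count))
    count=$((count + 1))      # without this the test would never fail
done

echo "total=$total"
```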
For loops
For loops are defined thus:
for var in set; do
    command_1;
    command_2;
done;
For loops are used to iterate over a set of values, defined here in set. The variable var is used to iterate over the set: For each iteration, var is set to the next value within set.
Set is a whitespace-delimited string, containing a list of items. For example:
for dir in Documents Pictures Library Music; do
    cp -a $dir /backup;
done;
Or
for image in Images/*.jpg; do
    convert "$image" -scale 640x480 "$image-scaled.jpg";
done;
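If you would like to try a wildcard loop without touching real files, a sketch like the following sets up a scratch directory first. The file names are invented, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# Build a scratch directory containing two .jpg files and one other file.
workdir=$(mktemp -d)
touch "$workdir/a.jpg" "$workdir/b.jpg" "$workdir/notes.txt"

jpg_count=0
for image in "$workdir"/*.jpg; do
    echo "Found $image"
    jpg_count=$((jpg_count + 1))
done

echo "Counted $jpg_count images"
rm -r "$workdir"
```

Only the two .jpg files match the wildcard, so notes.txt is never visited.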
Escaping
To break out of a while or for loop, the ‘break’ command is used. To continue onto the next iteration, thereby skipping the rest of the statements in the loop body, the ‘continue’ command is used. For example:
count=0;
while [ $count -lt 100 ]; do # Iterate 100 times
    count=$((count+1)); # Increment 'count' first, so 'continue' cannot skip it
    if [ $count -eq 2 ]; then # Skip the 2nd iteration
        continue;
    fi;
    do_stuff || break; # Stop iterating if do_stuff fails
done;
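Here is a version you can run directly, with no placeholder commands. It is only a sketch: it skips one value with continue, stops early with break, and records which values were actually processed.

```shell
#!/bin/sh
count=0
processed=''

while [ "$count" -lt 10 ]; do
    count=$((count + 1))      # increment first so 'continue' cannot stall the loop
    if [ "$count" -eq 3 ]; then
        continue              # skip 3 and go straight to the next iteration
    fi
    if [ "$count" -eq 6 ]; then
        break                 # leave the loop entirely
    fi
    processed="$processed $count"
done

echo "processed:$processed"
```

Only 1, 2, 4 and 5 are recorded: 3 is skipped, and the loop ends as soon as the counter reaches 6.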
Iterating over files
Let's direct our attention to that second-last example:
for image in Images/*.jpg; do
    convert "$image" -scale 640x480 "$image-scaled.jpg";
done;
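Note that this copes with spaces in filenames only because $image is wrapped in double quotes: the shell expands the wildcard into one item per file, and the quotes keep each name in one piece when it is passed to convert. Drop the quotes, or build the list from the output of a command like ls, and a file called 'My Trip.jpg' will be split in two.
A for loop over a wildcard also only looks in one directory. For anything more involved (recursing into subdirectories, or selecting by file type, say), I tend to use the 'find' tool with 'xargs' to perform commands.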
The ‘find’ tool will return a list of files that match the provided pattern. The ‘xargs’ utility performs a set of commands on each item it receives as input. We can put the two together with:
find . -maxdepth 1 -type f -print0 | xargs -0 -I{} sh -c 'echo "Working on file $1."' sh {}
This example finds all files (-type f) in the current directory (. with -maxdepth 1), and then xargs prints 'Working on file <filename>.' for each one.
The -print0 argument to find makes it delimit filenames with a null character instead of the default newline. This makes for safer filename handling, since a null can never appear in a path. It has to be paired with the -0 argument to xargs, which tells xargs to expect null-delimited input.
The -I{} parameter tells xargs to replace the '{}' sequence with the filename wherever it appears in the command that follows. Here the command is sh -c with a short script; the trailing 'sh {}' supplies the script's $0 and $1, so the filename reaches echo as "$1" with any spaces intact. (Writing {} directly inside the quoted script is tempting, but the shell would then re-parse the filename as part of the script, which breaks on spaces and quotes.)
Note that the echo command could be used without 'sh', like: xargs -0 -I{} echo Working on file {}.
This is fine if only one command is used. However, if more than one command is to be executed, or more complex commands are to be used, they need to be interpreted with 'sh': xargs just executes programs, and doesn't understand shell syntax.
Thus, complex statements can be put together. For example:
find . -maxdepth 1 -type f -print0 | xargs -0 -I{} sh -c 'echo "Working on file $1."; copy_file_to_server "$1" || echo "Upload of $1 failed."' sh {}
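To see the null-delimited pipeline survive an awkward filename, here is a self-contained sketch. It assumes a find and xargs that support -print0, -0 and -I (the GNU and BSD versions do), and the file names are invented. Each name is handed to sh as $1 rather than pasted into the command text, so the space in 'My Trip.jpg' is harmless.

```shell
#!/bin/sh
# Scratch directory with one awkward filename and one plain one.
workdir=$(mktemp -d)
touch "$workdir/My Trip.jpg" "$workdir/plain.jpg"

# Print each file's base name in square brackets, one per line.
found=$(find "$workdir" -maxdepth 1 -type f -name '*.jpg' -print0 |
    xargs -0 -I{} sh -c 'printf "[%s]\n" "$(basename "$1")"' sh {} | sort)

echo "$found"
rm -r "$workdir"
```

If the name had been split at the space, the brackets would show 'My' and 'Trip.jpg' separately.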
Iterating over command-line arguments
Often, you will want to make shell scripts take a series of arguments that are then iterated over. For example, a script may take a list of images to manipulate, or text files to edit.
Such a utility would be invoked with:
$ my_script.sh *.jpg
If any of the arguments had spaces in them (in this case, for example, a jpg called ‘My Trip.jpg’), this can be a little tricky to handle.
Although the arguments would be passed correctly (that is, one of the arguments would indeed contain the text ‘My Trip.jpg’), it is difficult to iterate over them correctly. If a for loop were to be used:
for img in $*; do
manipulate_image $img;
done;
…Spaces within filenames would cause problems. In our example, instead of 'My Trip.jpg' being passed to manipulate_image, it would be split: first 'My' would be passed to manipulate_image, followed by 'Trip.jpg'! Nasty.
A technique I often use is to make use of the shift command, which discards the first argument, and moves all other arguments down one. This is a more robust technique:
while [ "$1" ]; do
    manipulate_image "$1";
    shift;
done;
This will take the first argument, act upon it, then move the next argument down for the next loop.
The loop will finish when there are no more arguments, and “$1” will return an empty string, which evaluates to ‘false’.
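Another option worth knowing is the "$@" special variable. Inside double quotes it expands to each argument as a separate item, so a for loop can be used after all. The sketch below fakes a command line with 'set --' and counts how many items each form produces; the file names are invented.

```shell
#!/bin/sh
# Stand-in for running: my_script.sh 'My Trip.jpg' beach.jpg
set -- 'My Trip.jpg' beach.jpg

# Quoted "$@": one loop iteration per original argument.
whole=0
for img in "$@"; do
    echo "Would process: $img"
    whole=$((whole + 1))
done

# Unquoted $*: the space inside the first name splits it in two.
split=0
for img in $*; do
    split=$((split + 1))
done

echo "\"\$@\" gave $whole items, \$* gave $split"
```

The quoted form yields two items and the unquoted form three, which is exactly the 'My' / 'Trip.jpg' split described above.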
Final words
That’s about it for this brief tutorial. Hopefully you have enough to start assembling scripts and powerful commands to help you out. There’s a huge amount more to know about shell scripting though – arrays, clever variable manipulation, and plenty more stuff that I’m entirely unaware of, I’m sure. If you want to know more, just do some Googling for shell scripting – there’s an insanely large number of resources out there.
Stay tuned over the next couple of days – I’ll post a brief guide to using some fairly nice tools, like awk, sed, uniq and sort. These little rippers are fantastic for manipulating text and gathering statistics. Trust me, once you know how to use them, you’ll use them all the time (I do!).
For now, I’ll leave you with a final example – this is a small script I wrote the other day to replace the ‘rm’ command, and move all ‘deleted’ items to the trash, instead of just deleting them outright. Here it is:
#!/bin/sh
if [ "$1" = '-rf' -o "$1" = '-r' ]; then
    shift;
    recursive=true;
fi;
while [ "$1" ] ; do
    # If not recursive, skip directories
    if [ -d "$1" -a ! "$recursive" ]; then
        echo $0: $1: is a directory;
        shift;
        continue;
    fi;
    [ ! -d ~/.Trash/"`pwd`" ] && mkdir -p ~/.Trash/"`pwd`";
    mv "$1" ~/.Trash/"`pwd`";
    shift;
done;