
Writing Better Shell Scripts – Part 1

Quick Start

The information presented in this post doesn’t really lend itself to having a “Quick Start” section, but if you’re in a hurry we have a How-To section along with Video and Audio included with this post that may be a good quick reference for you. There are some really great general references in the Resources section that may help you as well.

Video

General Debugging

BASHDB Overview

Audio

Download

Preface

To make things easier on you, all of the black command line and script areas are set up so that you can copy the text from them. This does make using the commands and scripts easier, but if you’re not already familiar with the concepts presented here, typing things yourself and working through why you’re typing them will help you learn more. If you hit problems along the way, take a look at the Troubleshooting section near the end of this post for help.

There are formatting conventions that are used throughout this post that you should be aware of. The following is a list outlining the color and font formats used.

Command Name or Directory Path
Warning or Error
Command Line Snippet With Commands/Options/Arguments
Command Options and Their Arguments Only
Hyperlink

Overview

This post is the first in a series on shell script debugging, error handling, and security. Although I’ll be presenting some methodologies and techniques that apply to all shell languages (and most programming languages), this series will focus very heavily on BASH. Users of other shells like CSH will need to do some homework to see what information transfers and what does not.

One of the difficulties with debugging a shell script is that BASH typically doesn’t give you very much information to go on. You might get error output showing a line number, but that’s just the line where the shell became aware of the error, not necessarily the line where the error actually occurred. Add in a vague error message such as the one in Listing 1, and it gets difficult to tell what’s going on inside your script.

Listing 1

    $ ./buggy_script.sh
    ./buggy_script.sh: line 23: syntax error: unexpected end of file

This post is written with the intent of giving you knowledge that will help when you see an error like the one in Listing 1 while trying to run a script. This type of error is just one of many errors that the shell may give you, and is more easily dealt with when you have a good understanding of scripting syntax and the debugging tools at your disposal.

Along with talking about debugging tools/techniques, I’m going to introduce a handy script debugger called BASHDB. BASHDB allows you to step through a script in much the same way as a program debugger like GNU’s GDB does with C code.

By the end of this post you should be armed with enough knowledge to handle the majority of debugging needs that you have. There’s a lot of information here, but taking the time to learn it will help make you more effective in your work with Linux.

Command Line Script Debugging

BASH has several command line options for debugging your shell scripts, and some of these are shown in Listing 2. These options are applied to your entire script though, so it’s an all-or-nothing trade-off. Later in this post I’ll talk about more selective methods of debugging.

Listing 2

    -n  Checks for syntax errors without executing the script (noexec).
    -u  Causes an error to be thrown whenever you try to access a variable that has not been set (nounset).
    -v  Sends all lines to standard error (stderr) as they are read, even comments.
    -x  Turns on execution tracing (xtrace) which displays each command as it is executed.

All of the options in Listing 2 can be used just like options with other programs (bash -x scriptname), or with the built-in set command as shown later. With the -x option, the number of + characters before each line of output denotes the level of nesting (for example, a command run inside a command substitution gets an extra +). The more + characters there are, the deeper the nesting. If there are no + characters at the start of the line, then the line is the normal output from the execution of the script. You can use the -x and -v options together for verbose execution tracing, but the amount of output can become a little overwhelming. Using the -n and -v options together provides a verbose syntax check without executing the script.
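As a rough sketch of how the -n and -u options differ, the snippet below writes two throwaway scripts to a temporary directory and checks them. The file names (missing_fi.sh, unset_var.sh) and the workdir variable are made up for illustration, and the use of mktemp is an assumption about your system.

```shell
#!/bin/bash
# Hypothetical demo: file names and the temp directory are illustrative only.
workdir=$(mktemp -d)

# A script with an unclosed if statement (a syntax error).
cat > "$workdir/missing_fi.sh" <<'EOF'
if [ -z "$1" ]
then
    echo "TEST"
EOF

# A script that is syntactically valid but reads an unset variable.
cat > "$workdir/unset_var.sh" <<'EOF'
echo "The value is: $VALUE1"
EOF

# -n parses the script without running it; a non-zero status means a syntax error.
syntax_status=0
bash -n "$workdir/missing_fi.sh" 2>/dev/null || syntax_status=$?

# -n cannot catch the unset variable, because nothing is executed.
noexec_status=0
bash -n "$workdir/unset_var.sh" || noexec_status=$?

# -u turns the unset variable into a runtime error.
nounset_status=0
( unset VALUE1; bash -u "$workdir/unset_var.sh" ) 2>/dev/null || nounset_status=$?

echo "missing_fi.sh with -n: exit status $syntax_status"
echo "unset_var.sh with -n:  exit status $noexec_status"
echo "unset_var.sh with -u:  exit status $nounset_status"

rm -r "$workdir"
```

The point of the comparison is that -n only proves the script parses. Problems that depend on run-time state, such as unset variables, still need -u or execution tracing.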

If you decide to use the -x and -v options together, it can be helpful to use redirection in conjunction with a pager like less, or the tee command to help you handle the information. The shell sends debugging output to stderr and the normal output to stdout, so you’ll need to redirect both of them if you want the full picture of what’s going on. To do this and use the less pager to handle the information, you would use a command line like bash -xv scriptname 2>&1 | less . Instead of seeing the debugging output scroll by in the shell, you’ll be placed into the less pager where you’ll have access to functions like scrolling and search. While using the pager in this way, it’s possible that you may get an error like Broken pipe if you exit the pager before the script is done executing. This error has to do with the script trying to write output to something (less) that’s no longer there, and in this case can be ignored.

If you would prefer to redirect the debugging output to a file for later review and/or processing, you can use tee: bash -xv scriptname 2>&1 | tee scriptname.dbg . You will see the debugging output scroll by on the screen, but if you check the current working directory you will also find the scriptname.dbg file which holds the redirected output. This is what the tee command does for you: it sends the output to a file while still displaying it on the screen. If the script will take a while to run you can alter the redirection operator slightly, put the script in the background, and then use tail -f scriptname.dbg to follow the updates to the file. You can see this in action in Listing 3, where I’ve created a script that runs in an infinite loop (the code is incorrect on purpose) generating output every 2 seconds. I start the script in the background, redirecting the output to the infinite_loop.dbg file only (not to the screen too). I then start the tail -f command to follow the file for a few iterations, and then hit Ctrl-C to interrupt the tail command. Once you understand how to redirect the debugging output in this way, it’s fairly easy to figure out how to split the debugging and regular output into separate files.

Listing 3

    $ bash -xv infinite_loop.sh &> infinite_loop.dbg &
    [1] 9777
    $ tail -f infinite_loop.dbg
    num=0
    + num=0
    while [ $num -le 10 ]
    do
        sleep 2
        echo "Testing"
    done
    + '[' 0 -le 10 ']'
    + sleep 2
    + echo Testing
    Testing
    + '[' 0 -le 10 ']'
    + sleep 2
    ^C
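One possible way to split the debugging and regular output into separate files is sketched below. It relies on the fact mentioned earlier that the trace goes to stderr while normal output goes to stdout. The script and file names (traced.sh, traced.out, traced.dbg) are hypothetical.

```shell
#!/bin/bash
# Hypothetical demo: traced.sh, traced.out, and traced.dbg are made-up names.
workdir=$(mktemp -d)

cat > "$workdir/traced.sh" <<'EOF'
echo "Output #1"
echo "Output #2"
EOF

# stdout (normal output) goes to one file, stderr (the -x trace) to another.
bash -x "$workdir/traced.sh" > "$workdir/traced.out" 2> "$workdir/traced.dbg"

regular=$(cat "$workdir/traced.out")
trace=$(cat "$workdir/traced.dbg")

echo "--- regular output ---"
echo "$regular"
echo "--- debugging output ---"
echo "$trace"

rm -r "$workdir"
```

Keeping the two streams apart makes it easier to compare the normal output against a known-good run while still having the trace on hand.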

Internal Script Debugging

This section is called “Internal Script Debugging” because it focuses on changes that you make to the script itself to add debugging functionality. The easiest change to make in order to enable debugging is to change the shebang line of the script (the first line) to include the shell’s normal command line switches. So, instead of a shebang line like #!/bin/bash - you would have #!/bin/bash -xv. There are also both external and built-in commands for the BASH shell that make it easier for you to debug your code, the first of which is set.

The set command allows you to set shell options while your script is running. The options of the most interest for our purposes are the ones from Listing 2. For example, you can enclose sections of your script between the set -x and set +x command lines. By doing this you enable debugging for only the section of code within those lines, giving you control over what specific section of the script is debugged. Listing 4 shows a very simple script using this technique, and Listing 5 shows the script in action.

Listing 4

    #!/bin/bash -
    # File: set_example.sh

    echo "Output #1"
    set -x #Debugging on
    echo "Output #2"
    set +x #Debugging off
    echo "Output #3"

Listing 5

    $ ./set_example.sh
    Output #1
    + echo 'Output #2'
    Output #2
    + set +x
    Output #3

As you can see, the debugging output looks like you started the script with the bash -x command line. The difference is that you get to control what is traced and what is not, instead of having the execution of the whole script traced. Notice that the command to disable execution tracing (set +x) is included in the execution trace. This makes sense because execution tracing is not actually turned off until after the set +x line is done executing.

Output statements (echo and printf in BASH, or print in shells such as ksh) are useful for getting information from your script at specific points. You can use output statements to track the progression of logic throughout your script by doing things like evaluating variable values and shell expansions, and finding infinite loops. Another advantage of using output statements is that you can control the format. When using command line debugging switches you have little or no control over the format, but with echo and printf you have the opportunity to customize the output to display in a way that makes sense to you.
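Here is a minimal sketch of a formatted output statement used for debugging. The helper name debug_msg and its message layout are assumptions made for this example, not an established convention.

```shell
#!/bin/bash
# Hypothetical helper: the name debug_msg and its output format are illustrative.
debug_msg() {
    # $1 = label, $2 = value. BASH_LINENO[0] is the line the helper was called from.
    # The message goes to stderr so it doesn't mix with the script's normal output.
    printf 'DEBUG [line %s] %s="%s"\n' "${BASH_LINENO[0]}" "$1" "$2" >&2
}

num=0
while [ "$num" -le 2 ]
do
    debug_msg "num" "$num"
    num=$((num+1))
done
echo "Loop finished with num=$num"
```

Because the helper writes to stderr, you can discard the debugging messages with 2>/dev/null without touching the script.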

You can utilize a DEBUG function to provide a flexible and clean way to turn debugging output on and off in your script. Listing 6 shows the script in Listing 4 with the addition of the DEBUG function, and Listing 7 shows one way to switch the debugging on and off from the command line using a variable.

Listing 6

    #!/bin/bash -
    # File: func_example.sh

    # This function can be used to selectively enable/disable debugging.
    # Use with the set command to debug sections of the script.
    function DEBUG()
    {
        # Check to see if the enable debugging variable is set
        if [ -n "${DEBUG_ENABLE+x}" ]
        then
            # Run whatever command/option/argument combo that was
            # passed to our DEBUG function. Quoting "$@" keeps each
            # argument intact even if it contains spaces.
            "$@"
        fi
    }

    echo "Output #1"
    DEBUG set -x #Debugging on
    echo "Output #2"
    DEBUG set +x #Debugging off
    echo "Output #3"

Listing 7

    $ ./func_example.sh #Without debugging
    Output #1
    Output #2
    Output #3
    $ DEBUG_ENABLE=true ./func_example.sh #With debugging
    Output #1
    + echo 'Output #2'
    Output #2
    + DEBUG set +x
    + '[' -n x ']'
    + set +x
    Output #3

The DEBUG function treats the rest of the line after it as its arguments. If the DEBUG_ENABLE variable is set, the DEBUG function runs those arguments (the rest of the line) as a command via the $@ special parameter. So, any line that has DEBUG in front of it can be turned on or off by simply setting/unsetting one variable from the command line or inside your script. This method gives you a lot of flexibility in how you set up debugging in your script, and allows you to easily hide that functionality from your end users if needed.

Instead of requiring a user to set an environment variable on the command line to enable debugging, you can add command line options to your script. For instance, you could have the user run your script with a -d option (./scriptname -d) in order to enable debugging. The mechanism that you use could be as simple as having the -d option set the DEBUG_ENABLE variable inside of the script. An example of this, with the addition of multiple debugging levels, can be seen in the Scripting section.
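A minimal sketch of the -d idea using the getopts built-in might look like the following. The function name parse_args is hypothetical, and the DEBUG function is repeated from Listing 6 so that the example is self-contained.

```shell
#!/bin/bash
# Hypothetical sketch: parse_args is a made-up name. DEBUG_ENABLE is the
# variable checked by the DEBUG function from Listing 6.
parse_args() {
    local opt
    OPTIND=1            # reset so the function can be called more than once
    unset DEBUG_ENABLE
    while getopts ":d" opt
    do
        case "$opt" in
            d)  DEBUG_ENABLE=true ;;
            \?) echo "Unknown option: -$OPTARG" >&2; return 1 ;;
        esac
    done
}

DEBUG() {
    # Run the arguments as a command only when debugging is enabled.
    if [ -n "${DEBUG_ENABLE+x}" ]
    then
        "$@"
    fi
}

parse_args "$@"
DEBUG echo "Debugging is enabled"
echo "Normal output"
```

Saved as a script and run with -d, this prints both lines; without -d, only the second line appears.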

Another technique that you can use to track down problems in your script is to write data to temporary files instead of using pipes. Temp files are many times slower than pipes though, so I would use them sparingly and in most cases only for temporary debugging. There is a Linux Journal article by Dave Taylor (April 2010) referenced in the Resources section that talks about using temporary files in the article’s script. In a nutshell, you replace the pipe operator (|) with a redirection to file (> $temp), where $temp is a variable holding the name of your temporary file. You read the temporary file back into the script with another redirection operator (< $temp). This allows you to examine the temporary file for errors in the script’s pipeline. Listing 8 shows a very simplified example of this.

Listing 8

    #!/bin/bash -

    # Set the path and filename for the temp file
    temp="./example.tmp"

    # Dump a list of numbers into the temp file
    printf "1\n2\n3\n4\n5\n" > "$temp"

    # Process the numbers in the temp file via a loop
    while read input_val
    do
        # We won't do any real work, just output the values
        echo "$input_val"
    done < "$temp" # Feeds the temp file into the loop

    # Clean up our temp file
    rm "$temp"
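To make the pipe-to-temp-file swap more concrete, here is a rough sketch that splits a two-stage pipeline so the intermediate data can be inspected. The pipeline itself and the variable names are invented for this example, and it assumes the mktemp command is available on your system.

```shell
#!/bin/bash
# Hypothetical example: the pipeline stages and variable names are illustrative.

# Original pipeline, kept for comparison:
#   printf '3\n1\n2\n' | sort -n | tail -n 1

# mktemp creates a uniquely named temp file, which avoids name clashes.
temp=$(mktemp)

# Stage 1: write the intermediate data to the temp file instead of a pipe.
printf '3\n1\n2\n' | sort -n > "$temp"

# The intermediate result can now be inspected before the next stage runs.
echo "Intermediate data:" >&2
cat "$temp" >&2

# Stage 2: read the temp file back in place of the pipe.
largest=$(tail -n 1 < "$temp")
echo "Largest value: $largest"

rm -f "$temp"
```

Once the bug is found, the two stages can be joined with a pipe again and the temp file dropped.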

The last debugging technique that I'm going to touch on here is writing to the system log. You can use the logger command to write debugging output to the system log, which is /var/log/messages on many distributions (the destination is determined by your syslog configuration). Note that the -f option of logger does not choose the log file; it logs the contents of the file that you specify. I consider this technique to be primarily for production scripts that have already been released to your users, and you don't want to abuse this mechanism. Flooding your system log with script debugging messages would be counterproductive for you and/or your system administrator. It's best to only log mission-critical messages like warnings or errors in this way.

To use the logger command to help track script debugging information, you would just add a line like logger "${BASH_SOURCE[0]} - My script failed somewhere before line $LINENO." to your script. The line that this adds in the system log looks like the output line in Listing 9. There are a couple of variables that I've thrown in here to make my entry in the system log more descriptive. One is BASH_SOURCE, which is an array that in this case holds the name and path of the script that logged the message. The other is LINENO, which holds the current line number that you are on in your script. There are several other useful shell variables built into the newer versions of BASH (>= 3.0). These include the arrays BASH_LINENO, BASH_ARGC, and BASH_ARGV, as well as the scalar variables BASH_COMMAND, BASH_EXECUTION_STRING, and BASH_SUBSHELL. See the BASH man page for details.

Listing 9

    $ tail -1 /var/log/messages
    May 28 14:35:35 testhost jwright: ./logger_test.sh - My script failed somewhere before line 11.
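The logger line can also be wrapped in a small helper, as in the sketch below. The function names (build_log_message, log_failure) and the "myscript" tag are hypothetical. The -t (tag) and -p (priority) options are standard logger options, but where the entry ends up depends on your syslog configuration.

```shell
#!/bin/bash
# Hypothetical sketch: build_log_message, log_failure, and the "myscript" tag
# are made-up names for this example.
build_log_message() {
    # $1 = line number reported by the caller
    printf '%s - My script failed somewhere before line %s.' "${BASH_SOURCE[0]:-$0}" "$1"
}

log_failure() {
    local message
    message=$(build_log_message "$1")
    if command -v logger >/dev/null 2>&1
    then
        # -t sets a tag on the entry, -p sets the facility.level (priority).
        logger -t "myscript" -p user.warning "$message"
    else
        echo "$message" >&2    # fall back to stderr if logger is missing
    fi
}

log_failure "$LINENO" || echo "logger could not reach the system log" >&2
```

Tagging the entries makes them easy to pick out later with a simple grep for the tag.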

Introducing BASHDB

As I mentioned before, BASHDB is a debugger that does for BASH scripts what GNU's GDB does for C/C++ programs. BASHDB can do a lot, and it has four main features to help you eliminate errors from your scripts. First, it can start a script with options, arguments, and anything else that might affect its operation. Second, it allows you to set conditions on which a script will stop. Third, it gives you the ability to examine what's going on at the point in a script where it's stopped. Fourth, BASHDB allows you to manipulate things like variable values before telling the script to move on.

You can type bashdb scriptname to start BASHDB and set it to debug the script scriptname. Listing 10 shows a couple of useful options for the bashdb program.

Listing 10

    -X  Traces the entire script from beginning to end without putting bashdb in interactive mode. Notice that it's capital X, not lowercase.
    -c  Tests/traces a single string command. For example, bashdb -c "ls *" will allow you to step through the command string "ls *" inside the debugger.

In order to show where you're at, BASHDB displays the full path and current line number of the running script above the prompt. In interactive mode, the prompt BASHDB gives you looks something like bashdb<(1)> where 1 is the number of commands that have been executed. The parentheses around the command number denote the number of subshells you are nested within. The more parentheses there are, the deeper into subshells you are nested. Listing 11 gives a decent command reference that you can use when debugging scripts at the BASHDB interactive mode prompt.

Listing 11

    -
        Lists the current line and up to 10 lines that came before it.
    backtrace
        Abbreviated "T". Shows the trace of calls including things like functions and sourced files that have brought the script to where it is now. You can follow "backtrace" with a number, and only that number of calls will be shown.
    break
        Abbreviated "b". Sets a persistent breakpoint at the current line unless followed by a number, in which case a breakpoint is set at the line specified by the number. See the "continue" command for a shortcut to specifying the line number.
    continue
        Abbreviated "c". Resumes execution of the script and moves to the next stopping point or breakpoint. If followed by a number, "continue" works in a similar way as issuing the "break" command followed by the number and then the "continue" command. The difference is that "continue" sets a one-time breakpoint whereas "break" sets a persistent one.
    edit
        Opens the text editor specified by the EDITOR environment variable to allow you to make and save changes to the current script. Typing "edit" by itself will start editing on the current line. If "edit" is followed by a number, editing will start on the line specified by that number. Once you're done editing you have to type "restart" or "R" to reload and restart the script with your changes.
    help
        Abbreviated "h". Lists all of the commands that are available when running in interactive mode. When you follow "help" or "h" with a command name, you are shown information on that command.
    list
        Abbreviated "l". Lists the current line and up to 10 lines that come after it. If followed by a number, "list" will start at the specified line and print the next 10 lines. If followed by a function name, "list" starts at the beginning of the function and prints up to 10 lines.
    next
        Abbreviated "n". Moves execution of the script to the next instruction, skipping over functions and sourced files. If followed by a number, "next" will move that number of instructions before stopping.
    print
        Abbreviated "p". When followed by a variable name, prints the value of the specified variable. Example: print $VARIABLE
    quit
        Exits from BASHDB.
    set
        Allows you to change the way BASH interacts with you while running BASHDB. You can follow "set" with an argument and then the words "on" or "off" to enable/disable a feature. Example: "set linetrace on".
    step
        Abbreviated "s". Moves execution of the script to the next instruction. "step" will move down into functions and sourced files. See the "next" command if you need behavior that skips these. If followed by a number, "step" will move that number of instructions before stopping.
    x
        Similar to the "print" command, but more powerful. Can print variable and function definitions, and can be used to explore the effects of a change to the current value of a variable. Example: "x n-1" subtracts 1 from the variable "n" and displays the result.

Normally when you hit the Enter/Return key without entering a command, BASHDB executes the next command. This behavior is overridden though when you have just run the step command. Once you've run step, pressing the Enter/Return key will re-execute step. The rest of the operation of BASHDB is fairly straightforward, and I'll run through an example session in the How-To section.

If you're a person who prefers to use a graphical interface, have a look at GNU DDD. DDD is a graphical front end for several debuggers including BASHDB, and includes some interesting features like the ability to display data structures as graphs.

How-To

If you've been reading this post straight through, you can see that there are a lot of script debugging tools at your disposal. In this section, I'm going to go through a simple example using a few of the different methods so that you can see some practical applications. Listing 12 shows a script that has several bugs intentionally added so that we can use it as our example.

Listing 12

    #!/bin/bash -
    # buggy_script.sh is designed to help us learn about
    # shell script debugging
    #

    if [-z $1 ] # Space left out after first test bracket
    then
        echo "TEST"
    #fi #The closing fi is left out

    # Use of uninitialized variable
    echo "The value is: $VALUE1"

    # Infinite loop caused by not incrementing num
    num=0
    while [ $num -le 10 ]
    do
        sleep 2
        echo "Testing"
    done

When I try to run the script for the first time I get the same error that we got in Listing 1. The first thing that I'm going to do is use the -x and -u options of BASH to run the script with extra debugging output (bash -xu ./buggy_script.sh). When I rerun the script this way, I see that I don't really gain anything because BASH detects the unexpected end of file bug before it even tries to execute the script. The line number isn't any help either since it just points me to the very last line of the script, and that's not very likely to be where the error occurred. I'll run into the same problems if I try to run the script with BASHDB as well.

I remember that the rule of thumb with unexpected end of file errors is that they usually mean that I've forgotten to close something out. It could be an if statement without a fi at the end, a case statement that's missing an esac or ;;, or any number of other constructs that require closure. When I start looking through the script I notice that my if statement is missing a fi, so I add (uncomment) that. This particular bug teaches us an important lesson - that there will always be some errors that will require us to do some digging on our own. We may be able to use our debugging techniques to get us close to the error, but in the end we have to know the language well enough to be able to spot syntax errors. Once I add the fi statement, I'm ready to rerun the script. The second time the script runs, I get an unbound variable error.

Listing 13

    $ bash -xu ./buggy_script.sh
    ./buggy_script.sh: line 6: $1: unbound variable

You can see in the error that a command line argument ($1) is unbound. This tells me that I forgot to add an argument after ./buggy_script.sh . I end up with the command line bash -xu ./buggy_script.sh testarg1 which gives me the next two errors shown in Listing 14.

Listing 14

    $ bash -xu ./buggy_script.sh testarg1
    + '[-z' testarg1 ']'
    ./buggy_script.sh: line 6: [-z: command not found
    ./buggy_script.sh: line 12: VALUE1: unbound variable

Execution tracing shows me that the last command executed is '[-z' testarg1 ']' . The first error tells me that for some reason the start of the test statement ([-z) is being treated as a command. I think about it for a second and remember that there has to be a space between test brackets and what they enclose. The statement [-z $1 ] should read [ -z $1 ] . Since I try to focus on one error at a time, I fix the test statement and rerun the script. The first error from Listing 14 goes away, but the second error remains. You can see that it's another unbound variable error, but this time it's referencing a variable that I created and not a command line argument. The problem is that I use the variable VALUE1 in an echo statement before I've even set a value for it. Without the -u option that would just leave a blank at the end of the echo statement, but in some cases it can cause more serious problems. This is what using the -u option of BASH does for you: it raises an error when you try to use a variable that has not been set. To correct this error, I add a statement right above the echo line that sets a value for the variable (VALUE1="1").
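If a script has to tolerate a missing argument while running under -u, one common approach is to give the parameter an empty default inside the test, as sketched below. The function name check_arg is made up for the example, and this is an alternative to the fix used in this walkthrough (passing an argument), not a replacement for it.

```shell
#!/bin/bash
# Hypothetical sketch: check_arg is a made-up function name.
set -u

check_arg() {
    # ${1:-} expands to an empty string when no argument was passed, so the
    # test works under -u. The quotes keep the value as a single word.
    if [ -z "${1:-}" ]
    then
        echo "no argument"
    else
        echo "argument: $1"
    fi
}

check_arg
check_arg "testarg1"
check_arg "two words"
```

Note the spaces just inside the test brackets, which are the same spaces that were missing in the buggy script.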

After fixing the above errors and rerunning the script, everything seems to work fine. The only problem is that even though I set the while loop up to quit after the variable num gets to 10, the loop doesn't exit. It seems that I have an infinite loop problem. This loop is simple enough that you can probably just glance at it and see the problem, but for the sake of the example we're going to take the long way around. I add an echo statement (echo "num Value: $num") to show me the value of the num variable right above the sleep 2 line. When I run the script again without the BASH -x option (to cut out some clutter), I get the output shown in Listing 15.

Listing 15

    $ bash -u ./buggy_script.sh testarg1
    The value is: 1
    num Value: 0
    Testing
    num Value: 0
    Testing
    num Value: 0

You can see that the output from the echo statement I added is always the same (num Value: 0). This tells me that the value of num is never incremented and so it will never reach the limit of 10 that I set for the while loop. The fix is to use arithmetic expansion to increment the num variable by 1 each time around the while loop: num=$((num+1)) . When I run the script now, num increments like it should and the script exits when it's supposed to. With this bug fixed, it looks like we've eliminated all of the errors from our script. The finalized script with the num evaluation echo statement removed can be seen in Listing 16.

Listing 16

    #!/bin/bash -
    # buggy_script.sh is designed to help us learn about
    # shell script debugging.

    if [ -z $1 ] # Space added after first test bracket
    then
        echo "TEST"
    fi #The closing fi was added

    # Set a value for our variable
    VALUE1="1"

    # Use of initialized variable
    echo "The value is: $VALUE1"

    # Finite loop caused by incrementing num
    num=0
    while [ $num -le 10 ]
    do
        sleep 2
        echo "Testing"
        num=$((num+1))
    done

Now I'll walk you through correcting the same buggy script using BASHDB. As I said above, the unexpected end of file error is best solved by applying your understanding of shell scripting syntax. Because of this, I'm going to start debugging the script right after we notice and fix the unclosed if statement. To start the debugging process, I use the line bashdb ./buggy_script.sh to launch BASHDB and have it start to step through the script. If you compiled BASHDB from source and haven't installed it, you'll need to adjust the paths in the command line accordingly.

BASHDB starts the script and then stops at line 7, the if statement. I then use the step command to move to the next instruction and get the output in Listing 17.

Listing 17

    $ bashdb ./buggy_script.sh
    bash Shell Debugger, release 4.0-0.4
    Copyright 2002, 2003, 2004, 2006, 2007, 2008, 2009 Rocky Bernstein
    This is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    (/home/jwright/Documents/Scripts/Learning/buggy_script.sh:7):
    7: if [-z $1 ] # Space left out after first test bracket
    bashdb<0> step
    ./buggy_script.sh: line 7: [-z: command not found
    (/home/jwright/Documents/Scripts/Learning/buggy_script.sh:13):
    13: echo "The value is: $VALUE1"

Notice that until I run the step command, BASHDB doesn't give me an error for line 7. That's because it has stopped on the line 7 instruction, but hasn't executed it yet. When I step through that instruction and on to the next one, I get the same error as the BASH shell gives us ([-z: command not found). As before, we realize that we've left a space out between the test bracket and the statement. To fix this, I type the edit command to open the script in the text editor specified by the EDITOR environment variable. In my case this is vim. I have to type visual to go to normal mode, and then I'm able to edit and save my changes to the script like I would in any vi/vim session. With the space added, I save the file and exit vim which puts me back at the BASHDB prompt. I type the R character and hit the Enter/Return key to restart the script, which also loads my changes. I end up right back at line 7 again.

This time when I use the step command, BASHDB moves past the if statement and stops right before executing line 13 (the next instruction). Everything looks good, so I use the step command again by simply hitting the Enter/Return key. The output in Listing 18 is what I see.

Listing 18

    bashdb<1> edit
    bashdb<2> R
    Restarting with: /usr/local/bin/bashdb ./buggy_script.sh
    bash Shell Debugger, release 4.0-0.4
    Copyright 2002, 2003, 2004, 2006, 2007, 2008, 2009 Rocky Bernstein
    This is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    (/home/jwright/Documents/Scripts/Learning/buggy_script.sh:7):
    7: if [ -z $1 ] # Space left out after first test bracket
    bashdb<0> step
    (/home/jwright/Documents/Scripts/Learning/buggy_script.sh:13):
    13: echo "The value is: $VALUE1"
    bashdb<1>
    The value is:
    (/home/jwright/Documents/Scripts/Learning/buggy_script.sh:16):
    16: num=0

We see that the echo statement ends up not having any text after the colon, which is not what we want. What I'll do is issue an R (restart) command and then step back to line 13 so that I can check the value of the variable. Once I'm back at the echo statement on line 13, I use the command print $VALUE1 to inspect the value of that variable. A snippet of the output from the print command is in Listing 19.

Listing 19

    7: if [ -z $1 ] # Space left out after first test bracket
    bashdb<0> step
    (/home/jwright/Documents/Scripts/Learning/buggy_script.sh:13):
    13: echo "The value is: $VALUE1"
    bashdb<1> print $VALUE1

    bashdb<2>

There's a blank line between the bashdb<1> print $VALUE1 and bashdb<2> lines. This tells me that there is definitely not a value (or there's a blank string) set for the VALUE1 variable. To correct this I go back into edit mode, and add the variable declaration VALUE1="1" just above our echo statement. I follow the same edit, save, exit, restart (with the R character) routine as before, and then step down through the echo statement again.

This time the output from the echo statement is The value is: 1 which is what we would expect. With that error fixed, we continue to step down through the script until we realize that we're stuck in our infinite while loop. We can use the print statement here as well, and with the line print $num we see that the num variable is not being incremented. Once again, we enter edit mode to fix the problem. We add the statement num=$((num+1)) at the bottom of our while loop, save, exit, and restart. We now see that the num variable is incrementing properly and that the loop will exit. We can type the continue command to let the loop finish without any more intervention.

After the script has run successfully, you'll see the message Debugged program terminated normally. Use q to quit or R to restart. If you haven't been adding comments as you go, it would be a good idea at this point to re-enter edit mode and add those comments to any changes that you made. Make sure to run your script through one more time though to make sure that you didn't break anything during the process of commenting.

That's a pretty simple BASHDB session, but my hope is that it will give you a good start. BASHDB is a great tool to add to your shell script development toolbox.

Tips and Tricks

  • If you're like many of us, you may have trouble with quoting in your scripts from time to time. If you need a hint on how quoted sections are being interpreted by the shell, you can replace the command that's acting on the quoted section with the echo command. This will give you output showing how your quotes are being interpreted. This can also be a handy trick to use when you need insight into other issues like shell expansion too.
  • If you don't indent temporary (debugging) code, it will be easier to find in order to remove it before releasing your script to users. If you don't already make a habit of indenting your scripts in the first place, I would recommend that you start. It greatly increases the readability, and thus maintainability, of your scripts.
  • You can set the PS4 environment variable to include more information with the shell's debugging output. You can add things like line numbers, filenames, and more. For example, you would use the line export PS4='$LINENO ' to add line numbers to your script's debugging output. The creator of the bashdb script debugger sets the PS4 variable to (${BASH_SOURCE}:${LINENO}): ${FUNCNAME[0]} - [${SHLVL},${BASH_SUBSHELL}, $?] which gives you very detailed information about where you're at in your script. You can make this change to the variable permanent by adding an export declaration to one of your bash configuration files.
  • Make sure to use unique names for your shell scripts. You can run into problems if you name your shell script the same as a system or built-in command (e.g., test). I like to make my shell script names distinctive, and for added protection I almost always add a .sh extension onto the end of the filename.
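As a small sketch of the PS4 tip above, the snippet below writes a throwaway script that sets PS4 and turns on tracing, then captures only the trace. The file name ps4_demo.sh and the exact prompt text are made up for illustration.

```shell
#!/bin/bash
# Hypothetical demo: ps4_demo.sh and the prompt format are illustrative.
workdir=$(mktemp -d)

# The quoted 'EOF' keeps $LINENO from being expanded while the file is written.
# Inside the demo script, the single quotes around the PS4 value delay the
# expansion again, so the shell evaluates $LINENO for every trace line.
cat > "$workdir/ps4_demo.sh" <<'EOF'
PS4='+ line $LINENO: '
set -x
echo "Output #1"
echo "Output #2"
EOF

# Keep only stderr (the trace) and throw away the normal output.
trace=$(bash "$workdir/ps4_demo.sh" 2>&1 >/dev/null)
echo "$trace"

rm -r "$workdir"
```

Each trace line now starts with the line number of the command that produced it, which helps when the same command appears in several places in a script.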

Scripting

These scripts are somewhat simplified and in most cases could be done other ways too, but they will work to illustrate the concepts. If you use these scripts, make sure you adapt them to your situation. Never run a script or command without understanding what it will do to your system.

Our first script example has two separate parts. The first is a script in which we've enclosed our debugging functionality from above. This is a case where it's helpful to create modular code so that other scripts can add debugging functionality simply by sourcing one file. That way you're not needlessly duplicating code for commonly used functionality. The second script sources the debugging script and uses a command line option (-d) to enable debugging. It also uses multiple debugging levels so that the user can control how verbose the output is by passing an argument to the -d option.

Listing 20

    #!/bin/bash -
    # File:debug_module.sh
    # Holds common script debugging functionality

    # Set the PS4 variable to add line #s to our debug output
    PS4='Line $LINENO : '

    # The function that enables the enabling/disabling of
    # debugging in the script, and also takes the user
    # specified debug level into account.
    # 0 = No debugging
    # 1 = Debug executed statements only
    # 2 = Debug all lines and executed statements

    function DEBUG()
    {
        # We need to see what level (0-2) of debugging is set
        if [ "$1" = "0" ] #User disabled debugging
        then
            echo "Debugging Off"
            set +xv
            # Set the variable that tracks the debugging state
            _DEBUG=0
        elif [ "$1" = "1" ] #User wants minimal debugging
        then
            echo "Minimal Debugging"
            set -x
            # Set the variable that tracks the debugging state
            _DEBUG=0
        elif [ "$1" = "2" ] #User wants maximum debugging
        then
            echo "Maximum Debugging"
            set -xv
            # Set the variable that tracks the debugging state
            _DEBUG=0
        else #Run/suppress a command line depending on debug level
            # If debugging is turned on, output the line
            # that this function was passed as a parameter
            if [ $_DEBUG -gt 0 ]
            then
                $@
            fi
        fi
    }

This script has two main purposes. One is to set the PS4 variable so that line numbers are added to the debugging output to make it easier to trace errors. The other is to provide a function that takes an argument of either a number (0-2), or a command line and then decides what to do with it. If the argument is a number from 0 to 2, the function sets a debugging level accordingly. Level 0 turns off all debugging (set +xv), level 1 turns on execution tracing only (set -x), and level 2 turns on execution tracing and line echoing (set -xv). Anything else that is passed to the function is treated as a command line that is either run or suppressed depending on what the debugging level is.

As always, there are many ways to improve this script. One would be to add more debugging levels to it. I created three (0-2), which accommodated only the -x and -v options. You could add another level for the -u option, or create your own custom levels. Listing 21 shows an implementation of our simple modular debugging script.

Listing 21

    #!/bin/bash -
    # File: debug_module_test.sh
    # Used as a test of the debug_module.sh script

    # Source the debug_module.sh script so that its
    # function(s) will be used as this script's own
    . ./debug_module.sh

    # Parse the command line options and set this script
    # up for use
    while getopts "d:h" opt
    do
        case $opt in
            d) _DEBUG=$OPTARG # Enable debugging
               DEBUG $_DEBUG
               ;;
            h) echo "Usage: $0 [-dh]" #Give the user usage info
               echo " -d Enables debugging mode"
               echo " -h Displays this help message"
               exit 0
               ;;
            '?') echo "$0: Invalid Option - $OPTARG"
                 echo "Usage: $0 [-dh]"
                 exit 1
                 ;;
        esac
    done

    # Begin our test statements
    DEBUG echo "Debugging 1"

    DEBUG echo "Debugging 2"

    echo "Regular Output Line"

    # Turn debugging off
    DEBUG 0

    # Test to make sure debugging is off
    DEBUG echo "Debugging 3"

    # You can also create your own custom debugging output sections
    _DEBUG=2 #Manually set debugging back to max for last section
    [ $_DEBUG -gt 0 ] && echo "First debugging level"
    [ $_DEBUG -gt 1 ] && echo "Second debugging level"

The first statement that you see in the Listing 21 script is a source statement that reads in the modular debugging script (debug_module.sh). This treats the debugging script as if it were part of the script we're currently running. The next major section is the while loop that parses the command line options and arguments. The main option to be concerned with is "d", since it's the one that enables or disables debugging output. The colon after the "d" in the getopts "d:h" statement tells getopts that the -d option requires an argument on the command line. The user passes a 0, 1, or 2 to the option, and that in turn sets the debugging level via the _DEBUG variable and the DEBUG function. The DEBUG function is called four more times throughout the rest of the script. Three of those times it is used as a switch to run or suppress a line of the script, and once it is used to reset the debugging level to 0 (debugging off).
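To make the role of the colon in "d:h" more concrete, here is a minimal sketch of the same kind of parsing loop wrapped in a function that also rejects levels outside of 0-2. The function name parse_debug_level and the range check are assumptions added for illustration; neither is part of Listing 21.

```shell
#!/bin/bash
# parse_debug_level is a hypothetical helper, not part of Listing 21.
# It prints the debug level chosen with -d (the default is 0).
parse_debug_level() {
    OPTIND=1    # Reset getopts so the function can be called repeatedly
    level=0
    while getopts "d:h" opt
    do
        case $opt in
            d)  # The colon after "d" is what makes OPTARG available here
                case $OPTARG in
                    0|1|2) level=$OPTARG ;;
                    *)     echo "Invalid debug level: $OPTARG" >&2
                           return 1 ;;
                esac
                ;;
            h)  echo "Usage: parse_debug_level [-d 0|1|2] [-h]"
                return 0 ;;
            *)  return 1 ;;    # getopts has already printed an error
        esac
    done
    echo "$level"
}

parse_debug_level -d 2                      # prints 2
parse_debug_level -d 7 || echo "rejected"   # 7 is outside of 0-2
```

Validating the argument at parse time means a typo such as -d 7 is reported immediately rather than being silently treated as a command line by the DEBUG function.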

The last three lines of the script are a little different. I put them in there to show how you could implement your own custom debugging functionality. In the first of those lines, the _DEBUG variable is set to 2 (maximum debugging output). The next two lines are used to select how much debugging output you see. When you set _DEBUG to 1, the line "First debugging level" is output. If you set _DEBUG to 2 as in the script, the conditions for both the "First debugging level" (> 0) and the "Second debugging level" (> 1) statements are met, so both lines are output. Listing 22 shows the output that you get from running this script, and if you look at the bottom you'll see that the lines "First debugging level" and "Second debugging level" are output.
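If you find yourself repeating those threshold tests, one option is to wrap them in a small helper. The sketch below assumes the same 0-2 convention for _DEBUG; the function name debug_msg and the choice to write to stderr are illustrative assumptions, not something the listings above do.

```shell
#!/bin/bash
# debug_msg is a hypothetical helper. It prints its message only when
# the current debug level is at least the level given as its first argument.
_DEBUG=2

debug_msg() {
    min_level=$1
    shift
    if [ "$_DEBUG" -ge "$min_level" ]
    then
        echo "$*" >&2    # Keep debugging output separate from normal output
    fi
}

debug_msg 1 "First debugging level"
debug_msg 2 "Second debugging level"
debug_msg 3 "Not shown while _DEBUG is 2"
```

With _DEBUG set to 2, the first two messages are printed and the third is suppressed, which mirrors the behavior of the last two lines of Listing 21.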

Listing 22

    $ ./debug_module_test.sh -d 1
    Minimal Debugging
    Line 29 : _DEBUG=0
    Line 11 : getopts d:h opt
    Line 30 : DEBUG echo 'Debugging 1'
    Line 18 : '[' echo = 0 ']'
    Line 24 : '[' echo = 1 ']'
    Line 30 : '[' echo = 2 ']'
    Line 39 : '[' 0 -gt 0 ']'
    Line 32 : DEBUG echo 'Debugging 2'
    Line 18 : '[' echo = 0 ']'
    Line 24 : '[' echo = 1 ']'
    Line 30 : '[' echo = 2 ']'
    Line 39 : '[' 0 -gt 0 ']'
    Line 34 : echo 'Regular Output Line'
    Regular Output Line
    Line 37 : DEBUG 0
    Line 18 : '[' 0 = 0 ']'
    Line 20 : echo 'Debugging Off'
    Debugging Off
    Line 21 : set +xv
    First debugging level
    Second debugging level
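One detail in the Listing 22 trace is worth a second look. The line "Line 29 : _DEBUG=0" shows that the level 1 branch of the DEBUG function stores 0 rather than 1, so the later test '[' 0 -gt 0 ']' fails and neither "Debugging 1" nor "Debugging 2" is ever printed. If you want DEBUG-wrapped commands to run while debugging is on, one possible variation is to record the requested level in _DEBUG. This is a sketch rather than the original module; the case statement and the quoted "$@" are substitutions made for this example.

```shell
#!/bin/bash
# Variation on the DEBUG function that remembers the requested level.
# This is an illustrative sketch, not the original debug_module.sh.
PS4='Line $LINENO : '
_DEBUG=0

DEBUG() {
    case "$1" in
        0)  echo "Debugging Off"
            set +xv
            _DEBUG=0 ;;
        1)  echo "Minimal Debugging"
            set -x
            _DEBUG=1 ;;
        2)  echo "Maximum Debugging"
            set -xv
            _DEBUG=2 ;;
        *)  # Anything else is a command line: run it only when debugging
            if [ "$_DEBUG" -gt 0 ]
            then
                "$@"
            fi ;;
    esac
}

DEBUG 1
DEBUG echo "Debugging 1"    # Runs, because _DEBUG is now 1
DEBUG 0
DEBUG echo "Debugging 3"    # Suppressed, because _DEBUG is back to 0
```

Quoting "$@" keeps arguments that contain spaces intact when the wrapped command is run.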

This next script is somewhat like an automated unit test. It's a wrapper script that automatically runs another script with varying combinations of options and arguments so that you can easily look for errors. It takes some time up front to create this script, but it allows you to quickly test whether any changes you make to the script under test might cause problems for the end user. It could take a lot of time to step through and test all of the option/argument combinations manually on a complex script, and with that extra work (if we're honest) this test might get left out altogether. That's where the automation of the script in Listing 23 comes in.

Listing 23

    #!/bin/bash -
    # File unit_test.sh
    # A wrapper script that automatically runs another script with
    # a varying combination of predefined options and arguments,
    # to help find any errors.

    # Variables to make the script a little more readable.
    _TESTSCRIPT=$1 #The script that the user wants to test
    _OPTSFILE=$2 #The file holding the predefined options
    _ARGSFILE=$3 #The file holding the predefined arguments

    # Read the options and arguments from their files into arrays.
    _OPTSARRAY=($(cat $_OPTSFILE))
    _ARGSARRAY=($(cat $_ARGSFILE))

    # The string that holds the option/argument combos to try.
    _TRIALSTRING=""

    # Step through all of the arguments one at a time.
    for _ARG in ${_ARGSARRAY[*]}
    do
        # The string of multiple command line options that we'll
        # build as we step through the available options.
        _OPTSTRING=""

        # Step through all of the options one at a time.
        for _OPT in ${_OPTSARRAY[*]}
        do
            # Append the new option onto the multi-option string.
            _OPTSTRING="${_OPTSTRING}$_OPT "

            # Accumulate the command lines that will be tacked onto
            # the command as we're testing it.
            _TRIALSTRING="${_TRIALSTRING}${_OPT} $_ARG\n" #Single option
            _TRIALSTRING="${_TRIALSTRING}${_OPTSTRING}$_ARG\n" #Multi-option
        done
    done

    # Change the Internal Field Separator to avoid newline/space troubles
    # with the command list array assignment.
    IFS=":"

    # Sort the lines and make sure we only have unique entries. This could
    # be taken care of by more clever coding above, but I'm going to let
    # the shell do some extra work for me instead. An array is used to hold
    # the command lines.
    _CLIST=($(echo -e $_TRIALSTRING | sort | uniq | sed '/^$/d' | tr "\n" ":"))

    # Step through each of the command lines that were built.
    for _CMD in ${_CLIST[*]}
    do
        # We can pipe the full concatenated command string into bash to run it.
        echo $_TESTSCRIPT $_CMD | bash
    done

There are two files that I created to go along with this test script. The first is sample_opts, which holds a single line of possible options separated by spaces (-d -v -q). These options stand for debugging mode, verbose mode, and quiet mode respectively. The second file that I created is sample_args, which contains two possible arguments separated by a space (/etc/passwd /etc/shadow). I'll run our unit_test.sh script by passing it the name of the script to test, the sample_opts file, and the sample_args file. For this example, it really doesn't matter what the test script (./test_script.sh) is designed to do. We just provide the options and arguments that we want to test, and that's all the unit_test.sh script needs to know. Listing 24 shows what happens when I run the test.
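If you'd like to follow along, the two input files can be created with something like the sketch below. The file contents come straight from the description above; writing them with printf into the current directory and counting the fields with set are just one way of doing it.

```shell
#!/bin/bash
# Create the two input files described above in the current directory.
printf '%s\n' "-d -v -q" > sample_opts
printf '%s\n' "/etc/passwd /etc/shadow" > sample_args

# Word splitting on the file contents yields one item per option or
# argument, which is how unit_test.sh fills its arrays.
set -- $(cat sample_opts)
echo "Options found: $#"      # prints Options found: 3

set -- $(cat sample_args)
echo "Arguments found: $#"    # prints Arguments found: 2
```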

Listing 24

    $ ./unit_test.sh ./test_script.sh sample_opts sample_args
    Debug mode
    Debug mode
    Debug mode
    Verbose mode
    Debug mode
    Verbose mode
    Debug mode
    Verbose mode
    Quiet mode
    The -v and -q options are conflicting.
    Debug mode
    Verbose mode
    Quiet mode
    The -v and -q options are conflicting.
    Quiet mode
    Quiet mode
    Verbose mode
    Verbose mode

Notice that the output from the unit test script shows that the -v and -q options cause a conflict. I have hard coded that error in the test script for clarity, but in everyday use you would have to look for things like real errors or output that doesn't match what is expected. The error about the -v and -q options makes sense in this case because you wouldn't want to run verbose (chatty) mode and quiet (non-chatty) mode at the same time. They are mutually exclusive options that should not be used together. This unit test script not only finds errors that you may miss with manual inspection, it also lets you easily recheck your script whenever you make a change, and it ensures that your script is checked the same way every time.
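The test script itself isn't shown in this post, so the following is only a guess at what a script producing output like Listing 24 might look like. It's written as a function (test_script) so that it can be tried in a single file. The option handling, the variable names, and the exit statuses are all assumptions; only the messages are taken from Listing 24.

```shell
#!/bin/bash
# Hypothetical stand-in for test_script.sh. The real script is not shown
# in this post; only the messages below come from Listing 24.
test_script() {
    OPTIND=1
    verbose=0
    quiet=0
    while getopts "dvq" opt
    do
        case $opt in
            d) echo "Debug mode" ;;
            v) verbose=1; echo "Verbose mode" ;;
            q) quiet=1;   echo "Quiet mode" ;;
            *) return 2 ;;
        esac
    done
    shift $((OPTIND - 1))    # What remains is the file argument

    # The hard coded error: -v and -q are mutually exclusive
    if [ "$verbose" -eq 1 ] && [ "$quiet" -eq 1 ]
    then
        echo "The -v and -q options are conflicting."
        return 1
    fi
}

test_script -d -v -q /etc/passwd || echo "Exit status: $?"
```

Returning a non-zero status for the conflict gives a wrapper script something to check in addition to the message text.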

There are a lot of improvements that can be made to this unit test script. For starters, the script doesn't check every possible combination of options. It's limited by the order that the options are in the sample_opts file. The script never reorders those options. Another improvement would be to have the script automatically check for common errors like illegal option, file not found, etc. As it stands now though, you can pipe the output of the script to grep in order to look for a specific error yourself.
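As a starting point for the first improvement, here is a minimal sketch that prints every non-empty subset of a list of options instead of only the running prefixes that Listing 23 builds. The function name all_combos is hypothetical, and the sketch still does not reorder options within a subset. Each line it prints could be paired with each argument in the same way that unit_test.sh does.

```shell
#!/bin/bash
# all_combos is a hypothetical helper. It prints every non-empty subset
# of its arguments, one subset per line, by counting through bitmasks.
all_combos() {
    total=$(( (1 << $#) - 1 ))    # Number of non-empty subsets
    mask=1
    while [ "$mask" -le "$total" ]
    do
        combo=""
        bit=0
        for opt in "$@"
        do
            # Include this option if its bit is set in the current mask
            if [ $(( (mask >> bit) & 1 )) -eq 1 ]
            then
                combo="$combo$opt "
            fi
            bit=$((bit + 1))
        done
        printf '%s\n' "${combo% }"    # Trim the trailing space
        mask=$((mask + 1))
    done
}

all_combos -d -v -q
```

For the three sample options this produces seven lines, including pairs such as "-d -q" that the prefix approach never tries. printf is used instead of echo so that an option like -n or -e is printed rather than interpreted.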

Troubleshooting

The version of BASHDB that came with my chosen Linux distribution had a bug causing an error when a BASHDB function tried to return the value of -1. The problem went away though once I downloaded and compiled the latest version straight from the BASHDB website.

If a script you're debugging causes BASHDB to hang, you can try the CTRL+C key combination. This should exit from the script you're debugging and return you to the BASHDB prompt.

Conclusion

There are quite a few tools and methods at your disposal when debugging scripts. From BASH command line options, to a full debugger like BASHDB, to your own custom debugging and test scripts, there's a lot of room for creativity in reducing the number of errors in your scripts. Better and more thorough debugging of your scripts from the outset will help lessen problems down the line, reducing downtime and user frustration. In the future, I'll talk about handling runtime errors and security as the next steps in ensuring the quality and reliability of your shell scripts. Look for another post in this series soon.

Resources

  1. Expert Shell Scripting (Expert's Voice in Open Source) Book
  2. Learning the bash Shell: Unix Shell Programming (In a Nutshell (O'Reilly))
  3. BigAdmin Community Debugging Tip
  4. Shell Script Debugging Gotchas
  5. NixCraft Debugging Article
  6. Linux Journal, April 2010, Work The Shell, By Dave Taylor, "Our Twitter Autoresponder Goes Live!", pp 24-26
  7. The Linux Documentation Project Debugging Article
  8. BASHDB Homepage
  9. BASHDB Documentation
  10. Line Number Output In set -x Debugging Output
  11. 6 Cool Bash Tricks Article
  12. Using VIM as a BASH IDE
  13. General BASH Debugging Info
  14. Good Debugging Reference With Sample Error-Filled Scripts
  15. Good Debugging Tips Page By Bash-Hackers
  16. Modularizing The Debug Function To A Separate Script

Comments (15)

  3. Eric

    2010/06/16 at 12:57 PM

    Hi,

    Great Tutorial! I just installed bashdb on a Debian server to try some stuff out but when I tried the edit command it says Undefined command: “edit”. Try “help”.
    What am I missing?

    Kind regards,

    Eric

  4. Jeremy Mack Wright

    2010/06/16 at 1:43 PM

    Hi Eric,

    According to what I’m seeing in the bashdb 4.0-0.1 ChangeLog, the “edit” command wasn’t added until the 4.x series. The place that I would start is making sure that you don’t have an earlier version than 4.0-0.1. Also, downloading and compiling the latest 4.x version solved a problem that I was having while writing the blog post (see the Troubleshooting section). You probably don’t want to compile bashdb on your server (security concerns), but of course there are other ways to get the latest version.

    If you check and find that you’ve already got a 4.x version, we can dig a little deeper.

    Thanks for the question.

    Jeremy

  5. lail3344

    2010/06/18 at 12:05 AM

    Great work! I will be using this sample in my bash program.

  9. rocky

    2010/07/09 at 5:37 AM

    Great tutorial and video.

    Some smallish comments.

    1. Display command

    In the video if you want to track the value of num without having to edit the program, you can do this using the command:

    display echo $num

    display is also a gdb command, but it is a little different there. In bashdb you need to give a statement like “echo $num” rather than an expression. In the next bashdb which works only in bash 4.1, I have corrected the help text for that to make it clear this is a statement and not an expression.

    2. Source text

    Strictly speaking bashdb just shows the source text for the line number you are stopped at and generally each source line is roughly a statement. When a source line has several statements in it and you are not stopped at the first one, then that specific part is shown in addition to the other information generally shown. For example

    (/tmp/b.sh:1):
    1: x=1; y=2; z=3
    bashdb step
    (/tmp/b.sh:1):
    1: x=1; y=2; z=3
    y=2
    bashdb

    In the next bashdb release, “info program” will show you what bash
    reports as the next part to be executed:

    bashdb info program
    Program stopped.
    It stopped after being stepped.
    Next statement to be run is:
    y=2

    3. EDITOR environment variable and vim

    I’m not a vim user so I don’t know much about that. However when I set my EDITOR environment variable to “vim”, I didn’t need to “set visual”. One can set not just the editor but options to pass to the editor in that EDITOR environment variable. On computers which are remote so connection speed is slow, I often set my editor this way:

    export EDITOR='emacs -nw'

    This forces emacs to run in a curses mode (i.e. not under a window manager). So again, if there is a way to tell vim to start with visual or to run a profile which sets visual, starting in visual mode might be done by adjusting the EDITOR environment variable.

    But again, good work!

  10. Jeremy Mack Wright

    2010/07/09 at 9:42 AM

    Thanks rocky. The information you provided is really helpful.

    1. Display command: I’m glad you mentioned the “display” command. That’s much more efficient than editing the script whenever you want to follow something like a variable value.

    2. Source text: The difference between source text and statements is a subtle but important distinction. It’s easy to get a little careless with the terminology, but your example does a good job of showing why they’re different. I like the next statement addition to “info program” too.

    3. EDITOR: It looks like the fact that vim entered ex mode automatically is my fault. I thought that I had set the EDITOR variable to “vim”, but had missed it. On my laptop’s Ubuntu 9.10 install running bashdb 4.0-0.4 an unset EDITOR variable causes the equivalent of the “vim -e” or “ex” commands to be called, which puts vim directly into ex mode. Properly setting the EDITOR variable as you did will correct the issue.

    For everyone reading these comments, make sure to have a look at the bashdb website, which is item #8 in the Resources section above. There you’ll find information on new releases and full documentation. As I said in the blog post, I think bashdb is a great addition to anybody’s shell scripting toolbox. Try it out and see if you agree.

    Thanks again rocky.
