18.2. Debugging

Since Python's development cycle is so fast, the most effective way to debug is often to edit your code so that it outputs relevant information at key points. Python has many ways to let your code explore its own state in order to extract information that may be relevant for debugging. The inspect and traceback modules specifically support such exploration, which is also known as reflection or introspection.

Once you have obtained debugging-relevant information, the print statement is often the simplest way to display it. You can also log debugging information to files. Logging is particularly useful for programs that run unattended for a long time, such as server programs. Displaying debugging information is just like displaying other kinds of information, as covered in Chapters 10 and 17. Logging such information is mostly like writing to files (as covered in Chapter 10) or otherwise persisting information, as covered in Chapter 11; however, to help with the specific task of logging, Python's standard library also supplies a logging module, covered in "The logging module" on page 136. As covered in excepthook on page 168, rebinding attribute excepthook of module sys lets your program log detailed error information just before your program is terminated by a propagating exception.
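As a rough illustration of the excepthook technique just mentioned, the following sketch rebinds sys.excepthook so that an uncaught exception is logged with its full traceback before Python's default handling runs. It assumes a current Python 3 interpreter rather than the Python 2 syntax used elsewhere in this chapter; the logger name "myapp", the in-memory log_stream, and the function name log_uncaught are hypothetical choices for the example (a long-running server would more likely log to a file).

```python
import io
import logging
import sys

# Hypothetical logger that writes to an in-memory stream for the demo.
log_stream = io.StringIO()
logger = logging.getLogger("myapp")
logger.setLevel(logging.ERROR)
logger.addHandler(logging.StreamHandler(log_stream))
logger.propagate = False


def log_uncaught(exc_type, exc_value, exc_tb):
    """Log the full traceback, then defer to Python's default hook."""
    logger.error("uncaught exception", exc_info=(exc_type, exc_value, exc_tb))
    sys.__excepthook__(exc_type, exc_value, exc_tb)


# From now on, any exception that propagates out of the program is logged.
sys.excepthook = log_uncaught
```

The hook still calls sys.__excepthook__, so the usual traceback also appears on stderr; drop that call if the log is meant to be the only record.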

Python also offers hooks that enable interactive debugging. Module pdb supplies a simple text-mode interactive debugger. Other interactive debuggers for Python are part of integrated development environments (IDEs), such as IDLE and various commercial offerings. However, I do not cover IDEs in this book.

18.2.1. Before You Debug

Before you embark on possibly lengthy debugging explorations, make sure you have thoroughly checked your Python sources with the tools mentioned in Chapter 3. Such tools can catch only a subset of the bugs in your code, but they're much faster than interactive debugging, and so their use amply repays itself.

Moreover, again before starting a debugging session, make sure that all the code involved is well covered by unit tests, as seen at "Unit Testing and System Testing" on page 452. Once you have found a bug, before you fix it, add to your suite of unit tests (or, if needed, to the suite of system tests) a test or two that would have found the bug if they had been present from the start, and run the tests again to confirm that they now do reveal and isolate the bug; only once that is done should you proceed to fix the bug. By regularly following this procedure, you will soon have a much better suite of tests, learn to write better tests, and gain much sounder assurance about the overall correctness of your code.

Remember, even with all the facilities offered by Python, its standard library, and whatever IDEs you fancy, debugging is still hard. Take this fact into account even before you start designing and coding: write and run plenty of unit tests and keep your design and code simple, so as to reduce to the absolute minimum the amount of debugging you will need! The classic advice in this regard was phrased by Brian Kernighan as follows: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."

18.2.2. The inspect Module

The inspect module supplies functions to get information from all kinds of objects, including the Python call stack (which records all function calls currently executing) and source files. The most frequently used functions of module inspect are as follows.

getargspec, formatargspec
getargspec(f)

f is a function object. getargspec returns a tuple with four items: (arg_names, extra_args, extra_kwds, arg_defaults). arg_names is the sequence of names of f's parameters. extra_args is the name of the special parameter of the form *args, or None if f has no such parameter. extra_kwds is the name of the special parameter of the form **kwds, or None if f has no such parameter. arg_defaults is the tuple of default values for f's arguments. You can deduce other details of f's signature from getargspec's results: f has len(arg_names)-len(arg_defaults) mandatory parameters, and the names of f's optional parameters are the strings that are the items of the list slice arg_names[-len(arg_defaults):].

formatargspec accepts one to four arguments that are the same as the items of the tuple that getargspec returns, and returns a string with this information. Thus, formatargspec(*getargspec(f)) returns a string with f's parameters (i.e., f's signature) in parentheses, as used in the def statement that created f. For example:

import inspect
def f(a,b=23,**c): pass
print inspect.formatargspec(*inspect.getargspec(f))
# emits: (a, b=23, **c)
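Recent Python 3 releases no longer provide getargspec and formatargspec (both were removed in Python 3.11). The sketch below shows one way to obtain the same information there with inspect.signature. It assumes Python 3; the variable names are illustrative only.

```python
import inspect


def f(a, b=23, **c):
    pass


sig = inspect.signature(f)
signature_text = str(sig)  # the parenthesized parameter list

# Recover the details that getargspec reported as separate tuple items.
Parameter = inspect.Parameter
params = list(sig.parameters.values())
mandatory = [p.name for p in params
             if p.kind is Parameter.POSITIONAL_OR_KEYWORD
             and p.default is Parameter.empty]
optional = {p.name: p.default for p in params
            if p.kind is Parameter.POSITIONAL_OR_KEYWORD
            and p.default is not Parameter.empty}
extra_kwds = [p.name for p in params if p.kind is Parameter.VAR_KEYWORD]

print(signature_text)  # (a, b=23, **c)
print(mandatory, optional, extra_kwds)
```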

getargvalues, formatargvalues
getargvalues(f)

f is a frame object, for example the result of a call to the function _getframe in module sys (covered in "_getframe") or to function currentframe in module inspect. getargvalues returns a tuple with four items: (arg_names, extra_args, extra_kwds, locals). arg_names is the sequence of names of f's function's parameters. extra_args is the name of the special parameter of form *args, or None if f's function has no such parameter. extra_kwds is the name of the special parameter of form **kwds, or None if f's function has no such parameter. locals is the dictionary of local variables for f. Since arguments, in particular, are local variables, the value of each argument can be obtained from locals by indexing the locals dictionary with the argument's corresponding parameter name.

formatargvalues accepts one to four arguments that are the same as the items of the tuple that getargvalues returns, and returns a string with this information. formatargvalues(*getargvalues(f)) returns a string with f's arguments in parentheses, in named form, as used in the call statement that created f. For example:

def f(x=23): return inspect.currentframe( )
print inspect.formatargvalues(*inspect.getargvalues(f( )))
# emits: (x=23)

currentframe
currentframe( )

Returns the frame object for the current function (caller of currentframe). formatargvalues(*getargvalues(currentframe( ))), for example, returns a string with the arguments of the calling function.
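A small helper built on currentframe and getargvalues can report the arguments of whichever function calls it. The sketch below assumes Python 3; the helper name describe_call and the sample function g are hypothetical. It reaches the caller through the frame's f_back attribute, because the helper itself is the "current function" from currentframe's point of view.

```python
import inspect


def describe_call():
    """Return a string describing the arguments of the calling function."""
    caller = inspect.currentframe().f_back
    try:
        info = inspect.getargvalues(caller)
        # Arguments are local variables, so look each one up in locals.
        parts = ["%s=%r" % (name, info.locals[name]) for name in info.args]
        if info.varargs:
            parts.append("*%s=%r" % (info.varargs, info.locals[info.varargs]))
        if info.keywords:
            parts.append("**%s=%r" % (info.keywords, info.locals[info.keywords]))
        return "(%s)" % ", ".join(parts)
    finally:
        del caller  # drop the frame reference promptly


def g(x, y=5, *rest, **options):
    return describe_call()


print(g(1, 2, 3, k="v"))
```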

getdoc
getdoc(obj)

Returns the docstring for obj, with tabs expanded to spaces and redundant whitespace stripped from each line.
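The following sketch, assuming Python 3, shows the kind of cleanup getdoc performs on an indented docstring. The function f and its docstring are made up for the example.

```python
import inspect


def f():
    """Summary line.

        First detail line.
            Nested detail line.
    """


cleaned = inspect.getdoc(f)
# The common leading indentation is removed, relative indentation is kept,
# and the trailing blank line is dropped.
print(cleaned)
```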

getfile, getsourcefile
getfile(obj)

Returns the name of the file that defined obj and raises TypeError when unable to determine the file. For example, getfile raises TypeError if obj is built-in. getfile returns the name of a binary or source file. getsourcefile returns the name of a source file and raises TypeError when all it can find is a binary file, not the corresponding source file.

getmembers
getmembers(obj, filter=None)

Returns all attributes (members) of obj, both data and methods (including special methods), as a list of (name,value) pairs sorted by name. When filter is not None, returns only attributes for which callable filter returns a true value when called on the attribute's value, like:

sorted((n, v) for n, v in getmembers(obj) if filter(v))
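As a concrete illustration, the sketch below (assuming Python 3, with a hypothetical class Point) calls getmembers once with a filter and once without, selecting the data attributes by hand in the second case.

```python
import inspect


class Point:
    dimensions = 2

    def __init__(self, x, y):
        self.x, self.y = x, y

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


p = Point(3, 4)

# With a filter: only attributes whose value is a bound Python method.
method_names = [name for name, value in inspect.getmembers(p, inspect.ismethod)]

# Without a filter: every attribute, from which we keep public data only.
public_data = [(name, value) for name, value in inspect.getmembers(p)
               if not name.startswith("_") and not callable(value)]

print(method_names)
print(public_data)
```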

getmodule
getmodule(obj)

Returns the module object that defined obj, or None if it is unable to determine it.

getmro
getmro(c)

Returns a tuple of bases and ancestors of class c in method resolution order. c is the first item in the tuple. Each class appears only once in the tuple. For example:

class oldA: pass
class oldB(oldA): pass
class oldC(oldA): pass
class oldD(oldB,oldC): pass
for c in inspect.getmro(oldD): print c.__name__,
# emits: oldD oldB oldA oldC
class newA(object): pass
class newB(newA): pass
class newC(newA): pass
class newD(newB,newC): pass
for c in inspect.getmro(newD): print c.__name__,
# emits: newD newB newC newA object

getsource, getsourcelines
getsource(obj)

Returns one multiline string that is the source code for obj, and raises IOError if it is unable to determine or fetch it. getsourcelines returns a pair: the first item is the source code for obj (a list of lines), and the second item is the line number of the first of those lines within the source file.
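Both functions need the source file to be available on disk. The sketch below, assuming Python 3, writes a tiny throwaway module to a temporary directory, imports it, and then asks inspect for the source of one of its functions. The module name demo_mod and the function double are hypothetical.

```python
import importlib.util
import inspect
import os
import tempfile

SOURCE = "# demo module\n\ndef double(n):\n    return 2 * n\n"

with tempfile.TemporaryDirectory() as folder:
    # Create and import a throwaway module so that real source exists.
    path = os.path.join(folder, "demo_mod.py")
    with open(path, "w") as stream:
        stream.write(SOURCE)
    spec = importlib.util.spec_from_file_location("demo_mod", path)
    demo_mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(demo_mod)

    # One multiline string...
    text = inspect.getsource(demo_mod.double)
    # ...or a list of lines plus the number of the first line in the file.
    lines, first_line = inspect.getsourcelines(demo_mod.double)

print(text)
print(first_line)
```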

isbuiltin, isclass, iscode, isframe, isfunction, ismethod, ismodule, isroutine
isbuiltin(obj)

Each of these functions accepts a single argument obj and returns True if obj belongs to the type indicated in the function name. Accepted objects are, respectively: built-in (C-coded) functions, class objects, code objects, frame objects, Python-coded functions (including lambda expressions), methods, modules, and, for isroutine, all methods or functions, either C-coded or Python-coded. These functions are often used as the filter argument to getmembers.
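The distinctions among these predicates are easiest to see side by side. The following sketch assumes Python 3; py_func and Widget are hypothetical names.

```python
import inspect


def py_func():
    pass


class Widget:
    def render(self):
        pass


checks = {
    "isbuiltin(len)": inspect.isbuiltin(len),            # C-coded function
    "isfunction(len)": inspect.isfunction(len),          # not Python-coded
    "isfunction(py_func)": inspect.isfunction(py_func),
    "isfunction(lambda)": inspect.isfunction(lambda: None),
    "isclass(Widget)": inspect.isclass(Widget),
    "ismethod(bound)": inspect.ismethod(Widget().render),
    "ismodule(inspect)": inspect.ismodule(inspect),
    "isroutine(len)": inspect.isroutine(len),            # either kind
    "isroutine(py_func)": inspect.isroutine(py_func),
}

for label, answer in checks.items():
    print(label, answer)
```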

stack
stack(context=1)

Returns a list of six-item tuples. The first tuple is about stack's caller, the second tuple is about the caller's caller, and so on. Each tuple's items, in order, are: frame object, filename, line number, function name, list of context source code lines around the current line, and index of current line within the list.
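One common diagnostic use of stack is to find out which chain of calls led to the current point. The sketch below assumes Python 3, where each entry can still be indexed like the six-item tuple described above; the function names are hypothetical. Passing context=0 skips reading source lines, which are not needed here.

```python
import inspect


def call_chain():
    """Return function names from this function outward, innermost first."""
    # Item 3 of each entry is the function name.
    return [record[3] for record in inspect.stack(context=0)]


def inner():
    return call_chain()


def outer():
    return inner()


names = outer()
print(names[:3])  # ['call_chain', 'inner', 'outer']
```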

18.2.2.1. An example of using inspect

Suppose that somewhere in your program you execute a statement such as:

x.f( )

and unexpectedly receive an AttributeError informing you that object x has no attribute named f. This means that object x is not as you expected, so you want to determine more about x as a preliminary to ascertaining why x is that way and what you should do about it. Change the statement to:

try: x.f( )
except AttributeError:
    import sys, inspect
    print>>sys.stderr, 'x is type %s, (%r)' % (type(x), x)
    print>>sys.stderr, "x's methods are:",
    for n, v in inspect.getmembers(x, callable):
        print>>sys.stderr, n,
    print>>sys.stderr
    raise

This example uses sys.stderr (covered in stdin, stdout, stderr on page 171), since it displays information related to an error, not program results. Function getmembers of module inspect obtains the name of all the methods available on x in order to display them. If you need this kind of diagnostic functionality often, package it up into a separate function, such as:

import sys, inspect
def show_obj_methods(obj, name, show=sys.stderr.write):
    show('%s is type %s(%r)\n'%(name,type(obj),obj))
    show("%s's methods are: "%name)
    for n, v in inspect.getmembers(obj, callable):
        show('%s '%n)
    show('\n')

And then the example becomes just:

try: x.f( )
except AttributeError:
    show_obj_methods(x, 'x')
    raise

Good program structure and organization are just as necessary in code intended for diagnostic and debugging purposes as they are in code that implements your program's functionality. See also "The __debug__ built-in variable" on page 138 for a good technique to use when defining diagnostic and debugging functions.

18.2.3. The traceback Module

The traceback module lets you extract, format, and output information about tracebacks as normally produced by uncaught exceptions. By default, module traceback reproduces the formatting Python uses for tracebacks. However, module traceback also lets you exert fine-grained control. The module supplies many functions, but in typical use you need only one of them.

print_exc
print_exc(limit=None, file=sys.stderr)

Call print_exc from an exception handler or a function directly or indirectly called by an exception handler. print_exc outputs to file-like object file the traceback information that Python outputs to stderr for uncaught exceptions. When limit is not None, print_exc outputs only limit traceback nesting levels. For example, when, in an exception handler, you want to output a diagnostic message just as if the exception had propagated, but actually stop the exception from propagating any further (so that your program keeps running and no further handlers are involved), call traceback.print_exc( ).
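The sketch below, assuming Python 3, shows this pattern: the handler records the traceback and the loop carries on. The function risky and the in-memory report stream are hypothetical; omit the file argument to write to stderr as Python itself would.

```python
import io
import traceback


def risky(n):
    return 10 / n


report = io.StringIO()
results = []
for n in (5, 0, 2):
    try:
        results.append(risky(n))
    except ZeroDivisionError:
        # Record the traceback Python would have shown, then keep going.
        traceback.print_exc(file=report)

print(results)            # the two successful results
print(report.getvalue())  # one traceback, for n == 0
```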

18.2.4. The pdb Module

The pdb module exploits the Python interpreter's debugging and tracing hooks to implement a simple command-line-oriented interactive debugger. pdb lets you set breakpoints, single-step on sources, examine stack frames, and so on.

To run some code under pdb's control, import pdb and then call pdb.run, passing as the single argument a string of code to execute. To use pdb for post-mortem debugging (meaning debugging of code that just terminated by propagating an exception at an interactive prompt), call pdb.pm( ) without arguments. When pdb starts, it first reads text files named .pdbrc in your home directory and in the current directory. Such files can contain any pdb commands, but most often they use the alias command in order to define useful synonyms and abbreviations for other commands.
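pdb is normally driven from the keyboard, but to show a complete session in runnable form, the sketch below feeds the debugger its commands from an in-memory stream instead. This is an illustrative setup and not the usual way to work: it assumes Python 3.6 or later, uses the pdb.Pdb class directly with its stdin, stdout, and readrc parameters, and debugs a made-up one-line code string. readrc=False keeps any .pdbrc files from affecting the demonstration.

```python
import io
import pdb

# Commands a user would otherwise type at the (Pdb) prompt.
commands = io.StringIO("p 6 * 7\ncontinue\n")
transcript = io.StringIO()

debugger = pdb.Pdb(stdin=commands, stdout=transcript, readrc=False)

# Like pdb.run, this takes a string of code to execute under the debugger;
# the explicit namespace keeps the debugged code's variables in one place.
namespace = {}
debugger.run("total = sum(range(4))", namespace)

print(transcript.getvalue())  # prompts, the stop location, and 42
print(namespace["total"])     # the debugged code ran to completion
```

In interactive use you would simply call pdb.run with the code string and type the same commands at the prompt.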

When pdb is in control, it prompts you with the string '(Pdb) ', and you can enter pdb commands. Command help (which you can also enter in the abbreviated form h) lists all available commands. Call help with an argument (separated by a space) to get help about any specific command. You can abbreviate most commands to the first one or two letters, but you must always enter commands in lowercase: pdb, like Python itself, is case-sensitive. Entering an empty line repeats the previous command. The most frequently used pdb commands are the following.

!
! statement

Executes Python statement statement in the currently debugged context.

alias, unalias
alias [ name [ command ] ]

alias without arguments lists currently defined aliases. alias name outputs the current definition of the alias name. In the full form, command is any pdb command, with arguments, and may contain %1, %2, and so on to refer to specific arguments passed to the new alias name being defined, or %* to refer to all such arguments together. Command unalias name removes an alias.

args, a
args

Lists all actual arguments passed to the function you are currently debugging.

break, b
break [ location [ ,condition ] ]

break without arguments lists currently defined breakpoints and the number of times each breakpoint has triggered. With an argument, break sets a breakpoint at the given location. location can be a line number or a function name, optionally preceded by filename: to set a breakpoint in a file that is not the current one or at the start of a function whose name is ambiguous (i.e., a function that exists in more than one file). When condition is present, it is an expression to evaluate (in the debugged context) each time the given line or function is about to execute; execution breaks only when the expression returns a true value. When setting a new breakpoint, break returns a breakpoint number, which you can then use to refer to the new breakpoint in any other breakpoint-related pdb command.

clear, cl
clear [ breakpoint-numbers ]

Clears (removes) one or more breakpoints. clear without arguments removes all breakpoints after asking for confirmation. To deactivate a breakpoint without removing it, see disable on page 468.

condition
condition breakpoint-number [ expression ]

condition n expression sets or changes the condition on breakpoint n. condition n, without expression, makes breakpoint n unconditional.

continue, c, cont
continue

Continues execution of the code being debugged, up to a breakpoint, if any.

disable
disable [ breakpoint-numbers ]

Disables one or more breakpoints. disable without arguments disables all breakpoints (after asking for confirmation). This differs from clear in that the debugger remembers the breakpoint, and you can reactivate it via enable.

down, d
down

Moves down one frame in the stack (i.e., toward the most recent function call). Normally, the current position in the stack is at the bottom (i.e., at the function that was called most recently and is now being debugged). Therefore, command down can't go further down. However, command down is useful if you have previously executed command up, which moves the current position upward.

enable
enable [ breakpoint-numbers ]

Enables one or more breakpoints. enable without arguments enables all breakpoints after asking for confirmation.

ignore
ignore breakpoint-number [ count ]

Sets the breakpoint's ignore count (to 0 if count is omitted). Triggering a breakpoint whose ignore count is greater than 0 just decrements the count. Execution stops, presenting you with an interactive pdb prompt, when you trigger a breakpoint whose ignore count is 0. For example, say that module fob.py contains the following code:

def f( ):
    for i in range(1000):
        g(i)
def g(i):
    pass

Now consider the following interactive pdb session (in Python 2.4; minor details may change depending on the Python version you're running):

>>> import pdb
>>> import fob
>>> pdb.run('fob.f( )')
> <string>(1)?( )
(Pdb) break fob.g
Breakpoint 1 at C:\mydir\fob.py:5
(Pdb) ignore 1 500
Will ignore next 500 crossings of breakpoint 1.
(Pdb) continue
> C:\mydir\fob.py(5)g( )
-> pass
(Pdb) print i
500

The ignore command, as pdb says, asks pdb to ignore the next 500 hits on breakpoint 1, which we set at fob.g in the previous break command. Therefore, when execution finally stops, function g has already been called 500 times, as we show by printing its argument i, which indeed is now 500. The ignore count of breakpoint 1 is now 0; if we give another continue and print i, i will show as 501. In other words, once the ignore count decrements to 0, execution stops every time the breakpoint is hit. If we want to skip some more hits, we must give pdb another ignore command, setting the ignore count of breakpoint 1 at some value greater than 0 yet again.

list, l
list [ first [ , last ] ]

list without arguments lists 11 lines centered on the current one, or the next 11 lines if the previous command was also a list. Arguments to the list command can optionally specify the first and last lines to list within the current file. The list command lists physical lines, including comments and empty lines, not logical lines.

next, n
next

Executes the current line, without stepping into any function called from the current line. However, hitting breakpoints in functions called directly or indirectly from the current line does stop execution.

print, p
p expression

Evaluates expression in the current context and displays the result.

quit, q
quit

Immediately terminates both pdb and the program being debugged.

return, r
return

Executes the rest of the current function, stopping only at breakpoints if any.

step, s
step

Executes the current line, stepping into any function called from the current line.

tbreak
tbreak [ location [ ,condition ] ]

Like break, but the breakpoint is temporary (i.e., pdb automatically removes the breakpoint as soon as the breakpoint is triggered).

up, u
up

Moves up one frame in the stack (i.e., away from the most recent function call and toward the calling function).

where, w
where

Shows the stack of frames and indicates the current one (i.e., in which frame's context command ! executes statements, command args shows arguments, command print evaluates expressions, etc.).

18.2.5. Debugging in IDLE

IDLE, the Interactive DeveLopment Environment that comes with Python, offers debugging functionality similar to that of pdb, although not quite as powerful. Thanks to IDLE's GUI, the functionality is easier to access. For example, instead of having to ask for source lists and stack lists explicitly with such pdb commands as list and where, you just activate one or more of four checkboxes in the Debug Control window to see source, stack, locals, and globals always displayed in the same window at each step.

To start IDLE's interactive debugger, use Debug → Debugger in IDLE's *Python Shell* window. IDLE opens the Debug Control window, outputs [DEBUG ON] in the shell window, and gives you another >>> prompt in the shell window. Keep using the shell window as you normally would; any command you give at the shell window's prompt now runs under the debugger. To deactivate the debugger, use Debug → Debugger again; IDLE then toggles the debug state, closes the Debug Control window, and outputs [DEBUG OFF] in the shell window. To control the debugger when the debugger is active, use the GUI controls in the Debug Control window. You can toggle the debugger away only when it is not busy actively tracking code; otherwise, IDLE disables the Quit button in the Debug Control window.