21.1. CGI in Python

The CGI standard lets you use any language to code CGI scripts. Python is a very high-level, high-productivity language, and thus quite suitable for CGI coding. The Python standard library supplies modules to handle typical CGI-related tasks.

21.1.1. Form Submission Methods

CGI scripts often handle submitted HTML forms. In this case, the action attribute of the form tag specifies the URL for a CGI script to handle the form, and the method attribute is GET or POST, indicating how the form data is sent to the script. According to the CGI standard, the GET method should be used only for forms without side effects, such as asking the server to query a database and display results, while the POST method is meant for forms with side effects, such as asking the server to update a database. In practice, however, GET is also often used to create side effects.

The distinction between GET and POST in practical use is that GET encodes the form's contents as a query string joined to the action URL to form a longer URL, while POST transmits the form's contents as an encoded stream of data, which a CGI script sees as standard input.

GET is slightly faster. You can use a fixed GET-form URL wherever you can use a hyperlink. However, GET cannot send large amounts of data to the server, since many clients and servers limit URL lengths (you're safe up to about 200 bytes). The POST method has no size limits. You must use POST when the form contains input tags with type=file; the form tag must then have enctype=multipart/form-data.

The CGI standard does not specify whether a single script can access both the query string (used for GET) and the script's standard input (used for POST). Many clients and servers let you get away with it, but relying on this nonstandard practice may negate the portability advantages that you would otherwise get from the fact that CGI is a standard.
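
To make the practical difference between GET and POST concrete, here is a minimal sketch of how the same form contents are encoded in each case. It assumes Python 3's urllib.parse (the other examples in this section use Python 2 print statements), and the action URL, field names, and helper functions are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical form: the action URL and field names are illustrative only.
ACTION_URL = "http://www.example.com/cgi-bin/search.py"
FIELDS = [("q", "python cgi"), ("page", "2")]

def encode_get(action_url, fields):
    """Return the longer URL a client requests when submitting a GET form:
    the action URL joined to the form's contents as a query string."""
    return action_url + "?" + urlencode(fields)

def encode_post(fields):
    """Return the encoded stream a client sends for a POST form that uses
    the default URL-encoded enctype (not multipart/form-data)."""
    return urlencode(fields).encode("ascii")

get_url = encode_get(ACTION_URL, FIELDS)
post_body = encode_post(FIELDS)
print(get_url)
print(post_body)
```

The GET result is an ordinary URL, so it can be used wherever a hyperlink can; the POST result is the data a CGI script would read from its standard input.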

Python's standard module cgi (covered in the next section) recovers form data only from the query string when a query string is present; otherwise, cgi recovers form data from standard input.

21.1.2. The cgi Module

The cgi module supplies one function and one class that your CGI scripts use often.
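
The precedence described above (query string when present, standard input otherwise) can be sketched as follows. This is an illustration of the rule as this section states it, not the cgi module's actual code: read_form_data is a hypothetical helper, it is written for Python 3, and it handles only URL-encoded data. QUERY_STRING and CONTENT_LENGTH are the standard CGI environment variable names.

```python
import io
from urllib.parse import parse_qs

def read_form_data(environ, stdin):
    """Return form data as a dict mapping field names to lists of values.

    Uses the query string when one is present; otherwise reads
    CONTENT_LENGTH characters of URL-encoded data from stdin.
    """
    query = environ.get("QUERY_STRING", "")
    if query:
        return parse_qs(query)
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return parse_qs(stdin.read(length))

# GET-style request: the data arrives in the QUERY_STRING variable.
from_get = read_form_data({"QUERY_STRING": "name=Ada&lang=python"},
                          io.StringIO(""))

# POST-style request: no query string, so the data comes from standard input.
body = "name=Ada&lang=python"
from_post = read_form_data({"CONTENT_LENGTH": str(len(body))},
                           io.StringIO(body))
print(from_get)
print(from_post)
```

In a real script, the arguments would be os.environ and sys.stdin; passing them in explicitly keeps the sketch easy to exercise without a web server.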

21.1.3. CGI Output and Errors

When the server runs a CGI script to meet a request, the response to the request is the standard output of the script. The script must output HTTP headers, then an empty line, then the response's body. In particular, the script must always output a Content-Type header. Most often, the script outputs the Content-Type header as:

    Content-Type: text/html

In this case, the response body must be HTML. However, the script may also choose to output a content type of text/plain (i.e., the response body must be plain text), or any other MIME type followed by a response body that conforms to that MIME type. The MIME type must be compatible with the Accept header that the client sent, if any.

Here is the simplest possible Python CGI script in the tradition of "Hello World," ignoring its input and outputting just one line of plain text output:

    print "Content-Type: text/plain"
    print
    print "Hello, CGI World!"

Most often, you want to output HTML, and this is similarly easy:

    print "Content-Type: text/html"
    print
    print "<html><head><title>Hello, HTML</title></head>"
    print "<body><p>Hello, CGI and HTML together!</p></body></html>"

Browsers are quite forgiving in parsing HTML: you could get by without the HTML structure tags that this code outputs. However, being fully correct costs little. For some other ways to generate HTML output, see "Generating HTML" on page 586.

The web server collects all output from a CGI script, then sends it to the client browser in one gulp. Therefore, you cannot send progress information to the client, just final results.

If you need to output binary data (on a platform where binary and text files differ, i.e., Windows), you must ensure python is called with the -u switch, covered in "Command-Line Syntax and Options" on page 23.

A more robust approach is to text-encode your output, using the encoding modules covered in "Encoding Binary Data as Text" on page 561 (typically with Base-64 encoding) and a suitable Content-Transfer-Encoding header. A standard-compliant browser then decodes your output according to the Content-Transfer-Encoding header and recovers the binary data you encoded. Encoding enlarges output by about one third, which sometimes gives performance problems. In such cases, it's better to ensure that your script's standard output stream is a binary file. To ensure binary output on Windows, here is an alternative to the -u switch:

    try:
        import msvcrt, os
    except ImportError:
        # msvcrt exists only on Windows; elsewhere nothing needs doing
        pass
    else:
        # file descriptor 1 is standard output
        msvcrt.setmode(1, os.O_BINARY)

21.1.3.1. Error messages

If exceptions propagate from your script, Python outputs traceback diagnostics to standard error. With most web servers, error information ends up in error logs. The client browser receives a concise generic error message. This may be okay, if you can access the server's error logs. Seeing detailed error information in the client browser, however, makes your life easier when you debug a CGI script. When you think that a script may have bugs, and you need an error trace for debugging, you can use a content type of text/plain and redirect standard error to standard output, as shown here:

    print "Content-Type: text/plain"
    print
    import sys
    sys.stderr = sys.stdout
    def witherror():
        return 1/0
    print "Hello, CGI with an error!"
    print "Trying to divide by 0 produces:", witherror()
    print "The script does not reach this part..."

If your script fails only occasionally and you want to see HTML-formatted output up to the point of failure, you could also use a more sophisticated approach based on the traceback module covered in "The traceback Module" on page 466, as shown here:

    import sys
    sys.stderr = sys.stdout
    import traceback
    print "Content-Type: text/html"
    print
    try:
        def witherror():
            return 1/0
        print "<html><head><title>Hello, traceback</title></head><body>"
        print "<p>Hello, CGI with an error traceback!"
        print "<p>Trying to divide by 0 produces:", witherror()
        print "<p>The script does not reach this part..."
    except ZeroDivisionError:
        print "<br><strong>ERROR detected:</strong><br><pre>"
        # first to the browser (standard error is redirected to standard output)
        traceback.print_exc()
        # then restore the real standard error and log the traceback again
        sys.stderr = sys.__stderr__
        traceback.print_exc()

After imports, redirection, and content-type output, this example runs the script's substantial part in the try clause of a try/except statement. In the except clause, the script outputs a <br> tag, terminating any current line, and then a <pre> tag to ensure that further line breaks are honored. Function print_exc of module traceback outputs all error information. Lastly, the script restores standard error and outputs error information again. Thus, the information is also in the error logs for later study, not just transiently displayed in the client browser. This is not very useful in this specific example, since the error is repeatable, but it is necessary to track down real-life errors.

21.1.3.2. The cgitb module

The simplest way to provide good error reporting in CGI scripts, although not quite as flexible as the approach outlined in the previous section, is to use module cgitb. Module cgitb supplies two functions.

21.1.4. Installing Python CGI Scripts

Installation of CGI scripts depends on the web server and host platform. A script coded in Python is no different in this respect from scripts coded in other languages. Of course, you must ensure that the Python interpreter and standard library are installed and accessible. On Unix-like platforms, you must set the x permission bits for the script and use a so-called shebang line as the script's first line, for example:

    #!/usr/local/bin/python

depending on the details of your platform and Python installation. If you copy or share files between Unix and Windows platforms, make sure the shebang line does not end with a carriage return (\r), which might confuse the shell or web server that parses the shebang line to find out which interpreter to use for your script.

21.1.4.1. Python CGI scripts on Microsoft web servers

If your web server is Microsoft IIS or Microsoft PWS (Personal Web Server), assign file extensions to CGI scripts via entries in registry path HKLM\System\CurrentControlSet\Services\W3Svc\Parameters\Script_Map. Each value in this path is named by a file extension, such as .pyg (value names start with a period). The value is the interpreter command (e.g., C:\Python24\Python.Exe -u %s %s). You may use file extensions such as .cgi or .py for this purpose, but I recommend a unique one such as .pyg instead. Assigning Python as the interpreter for all scripts named .cgi might interfere with your ability to use other interpreters for CGI purposes. Having all modules with a .py extension interpreted as CGI scripts is more accident-prone than dedicating a unique extension such as .pyg to this purpose, and may interfere with your ability to have your Python-coded CGI scripts import modules from the same directories.

With IIS 5 and later, you can use the Administrative Tools > Computer Management applet to associate a file extension with an interpreter command line.
This is performed via Services and Applications > Internet Information Services. Right-click either on [IISAdmin], for all sites, or on a specific web site, and choose Properties > Configuration > Add Mappings > Add. Enter the extension, such as .pyg, in the Extension field, and the interpreter command line, such as C:\Python22\Python.Exe -u %s %s, in the Executable field.

21.1.4.2. Python CGI scripts on Apache

The popular free web server Apache is configured via directives in a text file (by default, httpd.conf). When the configuration has ScriptAlias entries, such as:

    ScriptAlias /cgi-bin/ /usr/local/apache/cgi-bin/

any executable script in the aliased directory can run as a CGI script. You may enable CGI execution in a specific directory by using, in the Apache <Directory> section for that directory:

    Options +ExecCGI

In this case, to let scripts with a certain extension run as CGI scripts, you may also add a global AddHandler directive, such as:

    AddHandler cgi-script pyg

to enable scripts with extension .pyg to run as CGI scripts. Apache determines the interpreter to use for a script by the shebang line at the script's start. Another way to enable CGI scripts in a directory (if global directive AllowOverride Options is set) is to use Options +ExecCGI in a file named .htaccess in that directory.

21.1.4.3. Python CGI scripts on Xitami

The free, lightweight, simple web server Xitami (http://www.xitami.org) makes it easy to install CGI scripts. When any component of a URL is named cgi-bin, Xitami takes the URL as a request for CGI execution. Xitami determines the interpreter to use for a script by the shebang line at the script's start, even on Windows platforms.
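
Since both Apache and Xitami rely on the shebang line, it can help to check a script for the pitfalls noted under "Installing Python CGI Scripts" before deploying it. The following is a minimal sketch for Unix-like platforms, written for Python 3; check_cgi_script is a hypothetical helper, and the temporary file name is illustrative only.

```python
import os
import stat
import tempfile

def check_cgi_script(path):
    """Return a list of problems that could keep a server from running
    the script at path via its shebang line (hypothetical helper)."""
    problems = []
    with open(path, "rb") as f:
        first_line = f.readline()
    if not first_line.startswith(b"#!"):
        problems.append("first line is not a shebang line")
    if first_line.rstrip(b"\n").endswith(b"\r"):
        problems.append("shebang line ends with a carriage return")
    if not os.stat(path).st_mode & stat.S_IXUSR:
        problems.append("x permission bit is not set for the owner")
    return problems

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "hello.pyg")

    # A script copied from Windows: CRLF line ends, no x permission bit.
    with open(script, "wb") as f:
        f.write(b"#!/usr/local/bin/python\r\nprint('hi')\r\n")
    os.chmod(script, 0o644)
    bad = check_cgi_script(script)

    # The same script with Unix line ends and the x bits set.
    with open(script, "wb") as f:
        f.write(b"#!/usr/local/bin/python\nprint('hi')\n")
    os.chmod(script, 0o755)
    good = check_cgi_script(script)

print(bad)
print(good)
```

The check reads the file in binary mode so that a trailing carriage return is not silently translated away before it can be detected.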