Using the Servlet API

The most common use of servlets is for handling HTTP requests in Web and application servers. Now we'll look at using the functions we discussed in the last section. We'll also build on the first servlet we created in the chapter.

HTTP Servlet Skeleton

Table 14.1 lists all the methods that are available in the HttpServlet class for handling different types of requests. The most used are the doPost() and doGet() methods, to which the service() method dispatches incoming requests. Listing 14.4 shows a skeleton of a simple HTTP servlet that can handle both GET and POST requests. In the next section of the chapter, we develop a simple servlet based on this skeleton to convert a given string to its uppercase equivalent.

Listing 14.4 Simple HTTP Servlet Skeleton

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
public class SimpleServlet extends HttpServlet {

       public void init(ServletConfig config) throws ServletException
       {
           // let the base class store the config first
           super.init(config);
           // read init parameters from the config
           // initialize common resources like database
       }

       //Request handling methods
       public void doGet(HttpServletRequest request,
                         HttpServletResponse response)
                        throws ServletException, IOException
       {
            // Request Handling Logic
            // Format the output
            PrintWriter out = response.getWriter();
            // Use "out" to send output to web browser
       }

       public void doPost(HttpServletRequest request,
                          HttpServletResponse response)
                          throws ServletException, IOException
       {
            // Request Handling Logic
            // Call the doGet() method to process the POST request
             doGet(request,response);
       }

       public void destroy()
       {
           // clean up operations
       }

}

The preceding servlet template introduces some key concepts that are common to all HttpServlets. The service processing methods doGet() and doPost() both take the HttpServletRequest and HttpServletResponse objects as arguments. The HttpServletRequest object is for reading HTTP headers and user input. It provides a host of methods as defined in Table 14.5. The following listing gives a sample output for some of the methods of the HttpServletRequest object with respect to the SimpleServlet:

Request method: GET
Request URI: /wlsUnleashed/SimpleServlet2
Request protocol: HTTP/1.1
Servlet path: /SimpleServlet2
Path info: <none>
Path translated: <none>
Server name: localhost
Server port: 7001
Remote address: 127.0.0.1
Remote host: 127.0.0.1
Scheme: http
Request scheme: http
Requested URL: http://localhost:7001/wlsUnleashed/SimpleServlet2

The HttpServletResponse object is used for sending the response header, which includes the HTTP status for the request. A typical WebLogic Server HTTP response header looks like this:

HTTP/1.1 200 OK
Date: Tue, 06 Apr 2003 21:59:48 GMT
Server: WebLogic WebLogic Server 8.1 Thu Mar 20 23:06:05 PST 2003 246620
Content-Length: 98
Content-Type: text/plain
Connection: Keep-Alive
X-WebLogic-Cluster-List: 1732628108!-1408104675!7001!7002|195540992!-1408104674!
7001!7002|533730093!-1408104657!7001!7002|658650126!-1408104656!7001!7002
X-WebLogic-Cluster-Hash: vnE9rA7Fu4xeOQAoQUfIHPDAXro
Set-Cookie:
JSESSIONID=9QGUOYCxNEOAVJNQPmS0MQs22yW7TslJfiBZps0ZwpqP3EXqQloD!1732628108!-1408104675!7001!7002;
 path=/

The header contains information such as the status of the request, the WebLogic Server version, the time of the request, and the content type and length. The status appears on the first line (200 indicates success) along with the protocol and version used. Some of the valid statuses were listed earlier in the chapter in Table 14.6. As for error handling, both request-handling methods, doGet() and doPost(), can throw ServletException and IOException, which were explained earlier in the chapter.

Using doGet() and doPost()

It isn't good practice to code the service() method as given in the SimpleServlet for handling all types of requests; that is, both GET requests and POST requests. The temptation for such a practice is usually driven by one of the following:

Simplicity of the function that's implemented, like the SimpleServlet
All the request types— POST, GET, and so on—have to be handled in the same fashion

But this practice should be avoided for better code organization and for extensibility. Even though their function might be identical, if the POST and GET requests are coded separately, it keeps open the option of overriding all other functions defined in Table 14.2.

The code in Listing 14.1 can be converted to use the doPost() and the doGet() methods. Listing 14.5 is the full listing of the modified simple servlet, which contains the init() method and the doGet() and the doPost() methods.

Listing 14.5 Modified Simple Servlet

package wlsunleashed.servlets;

import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;

public class SimpleServlet extends HttpServlet
{
     private String defaultString="HELLO WORLD";

     public void init(ServletConfig config) throws ServletException
     {
        super.init(config);

        // Read the DEFAULT_STRING initialization parameter;
        // fall back to "HELLO WORLD" if it is not configured.

        if ((defaultString = getInitParameter("DEFAULT_STRING")) == null)
            defaultString = "HELLO WORLD";

    }

     // Handles the GET request
     public void doGet(HttpServletRequest req, HttpServletResponse res)
        throws IOException
    {

         String convertedString,inputString;

        // Read the request parameter INPUT_STRING.
        // If it is present, convert it to uppercase;
        // otherwise, fall back to the default string.

        if ((inputString = req.getParameter("INPUT_STRING"))
            != null) {
            convertedString = inputString.toUpperCase();
        }
        else {
            convertedString =defaultString ;
        }

        // Set the content type first
        res.setContentType("text/html");

        // get the PrintWriter
        PrintWriter out = res.getWriter();

        out.println("<html><head><title>SimpleServlet</title></head>");
        out.println("<body>");
        out.println("<h1>");

        out.println("The input string upper case(d):" + convertedString );
        out.println("</h1></body></html>");

    }

     // Handles the POST request
     // Because the processing is identical to that of a GET request,
     // doPost() simply delegates to doGet()
     public void doPost(HttpServletRequest req, HttpServletResponse res)
        throws IOException
    {
        doGet(req,res);
    }
}

Because the functionality is simple, the logic can be implemented in either doGet() or doPost(), with the other method delegating to it, as shown in Listing 14.5. Because this servlet can handle both HTTP GET and POST requests, the question arises as to which request type is better. The two differ in the way they transfer parameters to the server: GET requests send parameters in the URL string, whereas POST requests send them in the request body. POST requests are typically used for forms because URL strings limit the amount of data that can be passed to the server. As far as parameter access is concerned, separate doGet() and doPost() methods are only a convenience, because most Web servers parse the request and group the parameters into collections for the servlet by the time doPost() or doGet() is invoked. Their most obvious advantage is the capability to separate the logic that handles different kinds of calls to the servlet.
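To make the difference between the two transfer styles concrete, the following is a minimal sketch that uses only the JDK. The class and method names are hypothetical and are not part of the Servlet API; the URL is the one from the sample request output shown earlier. The same URL-encoded text is either appended to the URL (GET) or sent as the request body (POST).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the Servlet API.
class ParameterEncodingSketch {

    // Encodes parameters as application/x-www-form-urlencoded text.
    // The same text serves as a GET query string or as a POST body.
    static String encode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // Every Java platform is required to support UTF-8
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    // GET: the encoded parameters travel in the URL after '?'
    static String getUrl(String baseUrl, Map<String, String> params) {
        return baseUrl + "?" + encode(params);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("INPUT_STRING", "hello world");
        String base = "http://localhost:7001/wlsUnleashed/SimpleServlet2";
        System.out.println("GET URL:   " + getUrl(base, params));
        // POST: the URL stays bare and the same text is sent as the body
        System.out.println("POST body: " + encode(params));
    }
}
```

In both cases the servlet reads the value the same way, with getParameter("INPUT_STRING"), because the container has already parsed the parameters.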

TIP

When you have the same processing logic for both the doGet() and doPost() methods, put the logic in one and call it from the other. Implement the logic in the method that will be used most.

In this section, we covered a basic servlet implementation. Advanced servlet features such as session tracking, dispatching requests to other servlets, JSPs, and cookies, are discussed in detail later. Servlets also provide an elegant mechanism for accessing other J2EE services, including EJBs and XML Web Services, which are explained in detail as well.

Handling Request and Response

Two important things we haven't covered so far are the way in which inputs are extracted from the request and the different ways in which responses are constructed. The SimpleServlet discussed earlier showed one of the most common mechanisms for handling user input and servlet output.

Extracting User Inputs

We briefly looked at the functions used to retrieve the input from the user during the discussions of the Servlet API and in the SimpleServlet example. An HTTP request from the user, usually from the browser, contains information such as the query parameters and session-related parameters such as cookies, encoded URL, and more. The user request also identifies the HTTP request type, which the service() method can use to dispatch the request to the corresponding doXXX() method. GET and POST are the most common HTTP request types used. The GET method is used when the parameters are embedded directly in a URL, and data is POSTed to the server when it is embedded in the body of the request.

Request as Parameters

User parameters are embedded in the HttpServletRequest object in the form of key-value pairs. The key-value pairs can be in any order and can be extracted in a variety of ways.

Some of these functions were listed briefly in the discussion of the Servlet API. We'll look at them with respect to the servlet example we saw earlier. The key of the input is "INPUT_STRING" and the value is the string to be converted. The following code snippet reads the value for that key:

if ((inputString = req.getParameter("INPUT_STRING")) != null) {
            convertedString = inputString.toUpperCase();
        }

If we extend the example to read multiple strings and convert them to uppercase letters, all input strings can be sent with the same key, repeating once for each string sent to the servlet. The getParameterValues() function takes a string that represents the key and returns an array of strings that maps to the values for the given key. Similarly, an Enumeration of all keys can be retrieved using getParameterNames().
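The multiple-value case might look like the following sketch. To keep it runnable without a servlet container, the String array stands in for the result of req.getParameterValues("INPUT_STRING"), and the map stands in for the request's parameters; the class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the Servlet API.
class MultiValueSketch {

    // 'values' stands in for req.getParameterValues("INPUT_STRING"),
    // which returns null when the key is not present in the request.
    static String[] toUpperCaseAll(String[] values) {
        if (values == null) {
            return new String[0];
        }
        String[] converted = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            converted[i] = values[i].toUpperCase();
        }
        return converted;
    }

    public static void main(String[] args) {
        // Stand-in for the request parameters: one key, repeated twice
        Map<String, String[]> params = new LinkedHashMap<>();
        params.put("INPUT_STRING", new String[] {"hello", "world"});

        // Stand-in for req.getParameterNames(), which returns an Enumeration
        Enumeration<String> names = Collections.enumeration(params.keySet());
        while (names.hasMoreElements()) {
            String key = names.nextElement();
            for (String converted : toUpperCaseAll(params.get(key))) {
                System.out.println(key + " -> " + converted);
            }
        }
    }
}
```

The null check matters because a request that omits the key entirely yields null rather than an empty array.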

Request in the Body

When the request data is embedded in the body of the request, either the parameter functions described earlier can be used or the data can be read in raw form. HttpServletRequest provides two methods for reading the raw data sent by the client:

Using the getInputStream() method. If we consider our earlier example—SimpleServlet—the code could be modified to read the input as raw data as follows:

// Get the input stream and read the data...
int length = request.getContentLength();
ServletInputStream in = request.getInputStream();
if (length != -1) {
    // read in a fixed number of bytes
} else {
    // read until end of file
}

Using the getReader() method. This method, which was introduced in Servlet 2.0, provides an alternative to the getInputStream() method.

When using getInputStream(), take care to read no more data than the content length reported by getContentLength(); otherwise, the servlet's behavior is undefined. With getReader(), the returned BufferedReader takes the content length into account and reads the data accordingly.
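The bounded read described above can be sketched with plain JDK streams. The helper below is hypothetical: in a servlet, the stream would come from getInputStream() and the length from getContentLength(), which returns -1 when the length is not known.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the Servlet API.
class BodyReaderSketch {

    // Reads at most contentLength bytes, or reads to end of stream
    // when contentLength is negative (length unknown).
    static byte[] readBody(InputStream in, int contentLength) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        if (contentLength >= 0) {
            int remaining = contentLength;
            while (remaining > 0) {
                // Never ask for more than the declared length
                int n = in.read(chunk, 0, Math.min(chunk.length, remaining));
                if (n == -1) {
                    break; // the client sent fewer bytes than it declared
                }
                body.write(chunk, 0, n);
                remaining -= n;
            }
        } else {
            int n;
            while ((n = in.read(chunk)) != -1) {
                body.write(chunk, 0, n);
            }
        }
        return body.toByteArray();
    }
}
```

A servlet would call this as readBody(request.getInputStream(), request.getContentLength()). Collecting the whole body in memory is a simplification that suits small requests only.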

Creating a Response

Let's now look at different ways to create a response to send back to the client. The HttpServletResponse object that's passed as an argument to the service() and doXXX() methods provides the methods and the stream to create the output and send it to the client. The response object can be used to set cookies and/or encode URLs for session management along with the client output, which might be HTML content or even simple text. The response created can be grouped into a properties section and a generated user response section. The properties section includes setting predefined headers such as content type, length, encoding, pragma, and so forth.

Specifying Content Type and Length

The content type is a special header that must be set by the servlet, and the HttpServletResponse interface provides the setContentType() method for doing so. setContentType() must be called before the servlet obtains the output stream or the print writer to which the rest of the response is written. For SimpleServlet, where the response is an HTML message, the content type is set to text/html as shown in the following code snippet:

// Set the content type first
res.setContentType("text/html");

Content types are required by the client to determine what kind of data is coming back from the server and to launch the appropriate application to view the content (if needed). For a browser client, the default is usually text/html. Some of the most common MIME types are

application/pdf— Acrobat files
application/msword— Word documents
application/zip— Zip file
text/css— Cascading Style Sheet
text/plain— Simple text
image/gif— GIF image
image/jpeg— JPEG image

For a complete official list of registered MIME types, refer to the following site: http://www.isi.edu/in-notes/iana/assignments/media-types/media-types.

Content Headers

During the discussion of the HttpServletResponse interface in the API section, we briefly covered the functions that are available for setting the headers in the response. The functions setHeader(), setIntHeader(), and setDateHeader() add new header values to the response. If the header already exists, these functions overwrite the existing value. Because HTTP allows for multiple occurrences of the same header parameter, the HttpServletResponse object adds the functions addHeader(), addIntHeader(), and addDateHeader(), which correspond to the previously mentioned set functions. Table 14.7 lists some of the most common header types. No header is mandatory, no header is prohibited, and each header can be interpreted by the browser in any number of ways, but the HTTP specification prescribes a few standard headers with intended browser behavior.
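The difference between the set and add families can be illustrated with a toy model of a header table. This is only a sketch of the semantics described above, not how any container stores headers; the class is hypothetical, and for simplicity it treats header names as case-sensitive, which real HTTP headers are not.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of a response's header table; not a Servlet API class.
class HeaderTableSketch {

    private final Map<String, List<String>> headers = new LinkedHashMap<>();

    // Models setHeader(): replaces any existing values for the name
    void setHeader(String name, String value) {
        List<String> values = new ArrayList<>();
        values.add(value);
        headers.put(name, values);
    }

    // Models addHeader(): keeps existing values and appends another
    void addHeader(String name, String value) {
        List<String> values = headers.get(name);
        if (values == null) {
            values = new ArrayList<>();
            headers.put(name, values);
        }
        values.add(value);
    }

    // Returns all values recorded for the name (empty if none)
    List<String> valuesOf(String name) {
        List<String> values = headers.get(name);
        return values == null ? new ArrayList<String>() : values;
    }

    public static void main(String[] args) {
        HeaderTableSketch table = new HeaderTableSketch();
        table.addHeader("Cache-Control", "no-cache");
        table.addHeader("Cache-Control", "no-store");
        System.out.println(table.valuesOf("Cache-Control")); // two values
        table.setHeader("Cache-Control", "private");
        System.out.println(table.valuesOf("Cache-Control")); // one value
    }
}
```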

Table 14.7. Header Types
Type
Description
Cache-Control
Instructs the client as to whether the document received can be cached. Some valid values are

public: Cacheable for all users
private: For a single user
no-cache: Should not be cached at all
max-age=n: Document becomes invalid n seconds after the document is received by the browser

Other valid values are no-store (document is neither cached nor stored), must-revalidate, and proxy-revalidate (validate the document on the server or the proxy).
Expires
This header is similar to the max-age directive of the Cache-Control header, except that it defines a date and time after which the document should be invalidated. In other words, Expires sets an absolute time, whereas max-age sets a relative time. The setDateHeader() function can be used to set the expiry time. If the max-age value is set, it takes precedence over Expires.
Connection
This header is used for managing HTTP connections between the server and client, which may be persistent or non-persistent in nature depending on the value. When the value is keep-alive, the client keeps the connection alive with the server. If close is the value, a new connection is opened for every request. By default, all HTTP 1.1–compliant clients use persistent connections. When using persistent connections, content length should be set using the setContentLength() to indicate the length of the response sent.
Refresh
This parameter tells the browser to automatically refresh the document in n seconds. To set this parameter, we can either use the setHeader or setIntHeader function. For example, setIntHeader("Refresh", 20). Additionally, a new URL can also be specified when the refresh is performed. For example, setHeader("Refresh","30;URL=http://www.sams.com").

The refresh tag, an extension supported by both IE and Netscape, is usually set in the HEAD section of the HTML page instead of using this header.
If-Modified-Since
This request-header field is used in a method to conditionally send the output/entity. In other words, the response to a client request will not contain the complete output if the requested entity has not been modified since the time specified in this field.
Last-Modified
This date header indicates the last time the document was modified.
Date
This header carries the date and time at which the message was generated, expressed in GMT. The setDateHeader() function can be used for setting it, but WebLogic Server sets this date automatically.
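The relationship between the relative max-age value and the absolute Expires time in Table 14.7 can be sketched as follows. The class and method names are hypothetical; the httpDate() method is included only to show what the date looks like on the wire, because setDateHeader() takes a millisecond timestamp and the container does the formatting.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, not part of the Servlet API.
class ExpirySketch {

    // HTTP dates are always expressed in GMT with a fixed layout
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                             .withZone(ZoneOffset.UTC);

    // Relative expiry: the value for the Cache-Control header
    static String maxAge(int seconds) {
        return "max-age=" + seconds;
    }

    // Absolute expiry: the timestamp to pass to setDateHeader("Expires", ...)
    static long expiresAt(long nowMillis, int seconds) {
        return nowMillis + seconds * 1000L;
    }

    // Formats a millisecond timestamp the way it appears in a header
    static String httpDate(long millis) {
        return HTTP_DATE.format(Instant.ofEpochMilli(millis));
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println("Cache-Control: " + maxAge(60));
        System.out.println("Expires: " + httpDate(expiresAt(now, 60)));
    }
}
```

In a servlet, the two values would be applied with res.setHeader("Cache-Control", ExpirySketch.maxAge(60)) and res.setDateHeader("Expires", ExpirySketch.expiresAt(System.currentTimeMillis(), 60)).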

NOTE

The If-Modified-Since header makes document refreshing efficient: the document is downloaded again only when its Last-Modified time is later than the time given in If-Modified-Since.
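The comparison behind this rule can be sketched as a small decision function. The class and method names are hypothetical. In a servlet, the second argument would come from req.getDateHeader("If-Modified-Since"), which returns -1 when the header is absent, and the first from whatever the servlet knows about its document's modification time.

```java
// Hypothetical helper, not part of the Servlet API.
class ConditionalGetSketch {

    static final int SC_OK = 200;            // send the full document
    static final int SC_NOT_MODIFIED = 304;  // client copy is still current

    // lastModified: when the document last changed, in milliseconds
    //               (negative if unknown)
    // ifModifiedSince: the client's If-Modified-Since value in
    //               milliseconds (negative if the header was not sent)
    static int statusFor(long lastModified, long ifModifiedSince) {
        if (lastModified < 0 || ifModifiedSince < 0) {
            return SC_OK; // nothing to compare, so send the document
        }
        // HTTP dates have one-second resolution, so compare whole seconds
        long modifiedSeconds = lastModified / 1000;
        long sinceSeconds = ifModifiedSince / 1000;
        return modifiedSeconds > sinceSeconds ? SC_OK : SC_NOT_MODIFIED;
    }
}
```

HttpServlet offers a built-in hook for this: a servlet that overrides getLastModified() lets the default service() method perform a similar comparison before doGet() is called.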

The Cache-Control header is one of the new headers defined in HTTP 1.1. For a detailed description of how caching is specified by the HTTP protocol, refer to the following links: http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html and http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html. They give an insight into how caching is implemented by browsers and Web servers. For clients that support only HTTP 1.0, use the Pragma header for enabling or disabling caching. Servlets should typically include the following code snippet for backward compatibility:

// Cache Control
res.setHeader("Cache-Control","no-cache");
res.setHeader("Pragma","no-cache");

The Set-Cookie header is a special header for setting a cookie associated with the generated response. It occurs once for each cookie sent. Alternatively, HttpServletResponse provides a special method, addCookie(), for adding cookies to the response. We'll look at cookies in more detail when we handle session management.

Output Streams and Writer

The servlet sends its output using a PrintWriter or a ServletOutputStream, depending on the type of data sent back. Both of these objects are obtained from the HttpServletResponse object passed as an argument to the service() and doXXX() methods, using one of the following calls:

PrintWriter out = res.getWriter();

ServletOutputStream out = res.getOutputStream();

The PrintWriter is used when the servlet sends String data (such as a plain HTML page) back to the client, and the ServletOutputStream is used when sending back byte or ASCII data or a multipart data that includes both forms of data.

TIP

Streams are faster for transferring ASCII back to the client. For example, writing the content of an ASCII file to a PrintWriter is slow because it transforms ASCII to String to ASCII again. Instead, get a stream class to read the file and send it to the ServletOutputStream.
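A byte-for-byte copy of the kind the tip recommends might look like the following sketch. The helper is hypothetical and uses only JDK stream classes; in a servlet, the destination would be the ServletOutputStream returned by res.getOutputStream(), and the source could be a FileInputStream opened on the file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the Servlet API.
class StreamCopySketch {

    // Copies bytes from in to out without converting them to
    // characters, and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }
}
```

Because the bytes are never decoded into a String, the file's contents reach the client unchanged. The returned count is also the value a servlet could pass to setContentLength() if it knows the size before writing.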

The Servlet specification allows only one of the two to be used for a given response: calling getOutputStream() after getWriter(), or the reverse, throws an IllegalStateException, just as mixing getReader() and getInputStream() does on the request. A servlet should therefore choose one per response, even if a particular container is more lenient. The contents are written to the client using the print methods, as demonstrated in the SimpleServlet:

out.println("<html><head><title>SimpleServlet</title></head>");
out.println("<body>");
out.println("<h1>");
out.println("The input string upper case(d):" + convertedString );
out.println("</h1></body></html>");

After the response is sent, the output stream can be closed if the application wants to close the connection with the client. However, WebLogic Server can optimize the request-response exchange between a client and the server by using persistent HTTP connections. To leverage this feature, the servlet should not flush the output or close the stream, because WebLogic Server must know the length of the response sent to the user. If the servlet flushes the output in the buffer, WebLogic Server cannot determine the length of the response and cannot reuse the connection. Otherwise, the server determines the content length automatically and adds it to the header. If all these criteria are met, the connection between the client and the server automatically becomes durable and can be reused over multiple conversations between the client and WebLogic Server. Making a connection from the client to the server carries significant overhead, so a durable connection significantly improves performance.

The connection does not live forever if there are no requests flowing through it, and the life of the connection can be controlled by the Duration property in the WebLogic configuration file. The default value is set to 30 seconds and the maximum is 60 seconds. Figure 14.5 displays the parameters that can be changed related to the durable connection behavior of WebLogic Server.

Figure 14.5. HTTP connection configuration.

