Transactions Overview
This section lays the foundation for understanding transactions in a distributed environment. Knowledge of transactional properties and how transactions are managed is essential to understanding the JTA and creating your own transaction-aware applications.
Before we get into the definition of what makes up a transaction, let's first define the concept of application state. An application's state is made up of all of the data the application “knows.” Application state may be stored in memory, in files (such as a text file or an XML file), or in a database. In the event of a system failure, such as a power outage, hardware failure, or system crash, we want to ensure that all of this data can be restored exactly as it was before the failure, thereby restoring the application's state.
Now that we have a definition of state, we can define a transaction as a related set of operations on an application's state. Transactions help bring order to the enterprise by combining multiple operations into a single, atomic action. Either all the individual operations complete or none of them complete. Transactions also lay out some ground rules, or protocols, that all concurrent transactions follow when communicating with the data stores, so that no transaction steps on another's toes.
ACID Transactions
There are four basic properties that govern transactions, commonly referred to by the acronym ACID:
Atomicity—
A transaction is atomic. All the individual actions that make up a transaction are encapsulated into a single unit of work. Either all the actions occur or none of them occur. The granularity of these operations is up to the developer.
Consistency—
Any alteration that a transaction makes to the data always abides by and fulfills the rules of the data store. When the transaction ends, it leaves the data in a consistent state. If the transaction fails, the data is returned to the state it was in when the transaction began.
Isolation—
The transaction operates independently of, and isolated from, any other concurrent transactions on the data store. Only after the transaction has completed can its changes be viewed by other applications and transactions.
Durability—
The transaction's alterations persist, or are durable. After the transaction has completed, the updated (or rolled back) data is written securely to the data store. In the event of any future system failure, the data can be restored.
Consider the example of an ATM user who wants to transfer $500 from her checking account into her savings account. Atomicity assures us that either all the actions needed to complete her transfer commit successfully or none of them do. Many individual actions make up the transfer transaction. You can identify the following three high-level actions:
1. Her checking account is verified to hold at least $500.
2. The $500 is subtracted from her checking account.
3. The $500 is added to her savings account.
Should any one of these individual actions fail (for example, because of insufficient funds, a power failure, or an invalid savings account), all of them are rolled back, and both accounts return to their original pre-transaction states.
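The all-or-nothing behavior of the transfer can be sketched in a few lines of Java. This is only an in-memory illustration, not how a bank or a database implements atomicity: the AtmTransfer class, its method names, and the account names are all hypothetical, and the "rollback" is simulated by restoring a snapshot of the balances.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for the bank's data store; every name here is hypothetical. */
class AtmTransfer {
    private final Map<String, Long> balances = new HashMap<>();

    void open(String account, long dollars) {
        balances.put(account, dollars);
    }

    long balance(String account) {
        return balances.get(account);
    }

    /**
     * Moves money between two accounts as one unit of work.
     * Returns true if every step completed, false if the transfer was rolled back.
     */
    boolean transfer(String from, String to, long amount) {
        // Snapshot the pre-transaction state so it can be restored on failure.
        Map<String, Long> snapshot = new HashMap<>(balances);
        try {
            // Step 1: verify that the source account holds enough money.
            Long available = balances.get(from);
            if (available == null || available < amount) {
                throw new IllegalStateException("insufficient funds or unknown account: " + from);
            }
            // Step 2: subtract the amount from the source account.
            balances.put(from, available - amount);
            // Step 3: add the amount to the target account.
            Long target = balances.get(to);
            if (target == null) {
                throw new IllegalStateException("unknown account: " + to);
            }
            balances.put(to, target + amount);
            return true; // "commit": all three steps succeeded
        } catch (IllegalStateException e) {
            // "rollback": undo any partial change, such as the debit made in step 2.
            balances.clear();
            balances.putAll(snapshot);
            return false;
        }
    }

    public static void main(String[] args) {
        AtmTransfer bank = new AtmTransfer();
        bank.open("checking", 800);
        bank.open("savings", 100);
        System.out.println(bank.transfer("checking", "savings", 500)); // true
        System.out.println(bank.transfer("checking", "missing", 200)); // false
        System.out.println(bank.balance("checking"));                  // 300
    }
}
```

In the second call the debit has already happened when the missing target account is discovered, so the snapshot is what puts the money back into the checking account.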
Consistency ensures that all the data is updated appropriately and resources are released back to the system. Our ATM user can be sure that each time she transfers funds from one account to the other, the numerical amounts are updated in the correct format required by the database. Let's assume that the transaction fails after step 2 and before step 3 can complete. In this case, the $500 must be put back into the user's checking account.
Isolation ensures that while the transaction is in process, no unexpected changes can occur in the data store. Suppose that the user's spouse attempts to make a withdrawal from the same checking account at the same time, but at a different ATM location. Isolation ensures that the first transaction (the transfer) is insulated from the second transaction (the withdrawal). The withdrawal transaction cannot access the checking account data until the transfer transaction completes. As we'll see, there are a few varying degrees of isolation that may be configured programmatically.
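For resources reached through JDBC, the degree of isolation is requested on the java.sql.Connection object. The sketch below is a minimal illustration under that assumption; the IsolationConfig class and its method names are hypothetical, and whether a requested level is actually honored depends on the driver and the database.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper showing the isolation levels defined by java.sql.Connection. */
class IsolationConfig {

    /** Maps a JDBC isolation-level constant to a readable name. */
    static String nameOf(int level) {
        switch (level) {
            case Connection.TRANSACTION_NONE:             return "NONE";
            case Connection.TRANSACTION_READ_UNCOMMITTED: return "READ_UNCOMMITTED";
            case Connection.TRANSACTION_READ_COMMITTED:   return "READ_COMMITTED";
            case Connection.TRANSACTION_REPEATABLE_READ:  return "REPEATABLE_READ";
            case Connection.TRANSACTION_SERIALIZABLE:     return "SERIALIZABLE";
            default:
                throw new IllegalArgumentException("unknown isolation level: " + level);
        }
    }

    /**
     * Requests the strictest level, so that concurrent transactions behave
     * as if they had run one after another.
     */
    static void requireSerializable(Connection conn) throws SQLException {
        conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
    }
}
```

Stricter levels give stronger guarantees at the cost of concurrency, which is why the level is left as a choice for the application rather than fixed.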
Durability ensures that if the user has transferred $500, she'll see that change the next time she checks her accounts, regardless of any power outages or database failures that might occur before that time.
Local Versus Global Transactions
A transaction is often classified as being either local or global, depending on its management and the number of resources it may alter.
Each participant in a transaction is called a resource. A resource can be a database system, persistent messaging store, or any other transaction-enabled entity. Resource managers are in charge of managing these resources. For a resource to support transactions, it must have a transaction-aware resource manager associated with it. A resource manager can be a driver, such as a transaction-aware JDBC driver, or it can be integrated into the resource itself, as is the case with WebLogic JMS resources.
Local Transactions
A local transaction involves a single resource and is restricted to a single process (all updates to the resource are committed at the end of that process). In Java, local transactions are normally managed by the objects used to access a resource. For example, a JDBC database's transactions are handled by objects that implement the java.sql.Connection interface. Likewise, a JMS server's transactions are handled by objects that implement the javax.jms.Session interface.
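A local JDBC transaction might look like the following sketch, which reuses the ATM transfer as its unit of work. The LocalTransferDao class, the accounts table, and its id and balance columns are assumptions made for illustration; only the java.sql calls are part of the standard API. The sketch also assumes the connection was in auto-commit mode when it was handed in.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical data-access class; the table and column names are illustrative only. */
class LocalTransferDao {
    private static final String DEBIT =
        "UPDATE accounts SET balance = balance - ? WHERE id = ? AND balance >= ?";
    private static final String CREDIT =
        "UPDATE accounts SET balance = balance + ? WHERE id = ?";

    static void transfer(Connection conn, long fromId, long toId, BigDecimal amount)
            throws SQLException {
        conn.setAutoCommit(false); // the Connection now delimits the local transaction
        try (PreparedStatement debit = conn.prepareStatement(DEBIT);
             PreparedStatement credit = conn.prepareStatement(CREDIT)) {
            // The balance check and the subtraction happen in a single UPDATE.
            debit.setBigDecimal(1, amount);
            debit.setLong(2, fromId);
            debit.setBigDecimal(3, amount);
            if (debit.executeUpdate() != 1) {
                throw new SQLException("insufficient funds or unknown account " + fromId);
            }
            credit.setBigDecimal(1, amount);
            credit.setLong(2, toId);
            if (credit.executeUpdate() != 1) {
                throw new SQLException("unknown account " + toId);
            }
            conn.commit(); // both updates become permanent together
        } catch (SQLException | RuntimeException e) {
            conn.rollback(); // neither update survives
            throw e;
        } finally {
            conn.setAutoCommit(true); // restore the assumed original mode
        }
    }
}
```

Because a single Connection owns the transaction, no transaction manager is involved; that is what keeps this transaction local.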
Global Transactions
Sometimes referred to as a distributed transaction, a global transaction differs from a local transaction in that it may update data across two or more resource managers and JVMs. It also employs a transaction manager to manage those changes. Application code usually demarcates a global transaction through the JTA javax.transaction.UserTransaction interface, or leaves demarcation to the container's transaction manager.
WebLogic's global transactional support is based on The Open Group's X/Open Distributed Transaction Processing (DTP) model. DTP is the most widely adopted model for building global transactional applications. This is the same concept as XA-distributed transactions. It defines the transactional participants and the interfaces used to communicate between them. Almost all vendors developing transaction-related products, such as RDBMSs, message queuing, and component containers, support the interfaces defined in the DTP model.
Global Transaction Participants
Global transactions may involve the following participants:
Application server—
The application server hosts the application by providing the infrastructure required to support it (for example, BEA WebLogic Server).
Application program—
The application program is a component-based transactional application that has its transaction boundaries controlled through either the JTA interface or the transaction manager (examples include an EJB component, a JMS client, and a standalone Java client).
Transaction manager—
Sometimes referred to as a transaction processing monitor (TPM), the transaction manager manages transactions for the application program. It communicates with all resource managers participating in a transaction and acts as a liaison between them and the application program. Based on those communications, the transaction manager can then tell the resource managers whether to commit or roll back. It then communicates the outcome back to the application program (examples are BEA WebLogic Server, BEA Tuxedo, and CICS).
The Two-Phase Commit Protocol
A transaction often manipulates data in multiple data stores. In a distributed environment, many transactions may run simultaneously, all trying to access and alter the same data on those data stores. How can these resources be coordinated to ensure data integrity? The answer comes in the form of the two-phase commit protocol. Under this protocol, the transaction manager coordinates all the resource managers participating in a single transaction so that they reach the same outcome: either every one of them commits or every one of them rolls back. Each resource manager keeps the affected data locked until that outcome is known. There are two distinct phases in the global transaction process: a prepare phase and a commit phase.
Prepare phase—
In the first phase, each resource manager attempts to record the original and the updated information (usually to a transaction log file). If successful, the resource manager indicates that it is ready to make the changes.
This indication is a pledge that the operation will happen if the transaction manager asks for it. Each resource manager then sends its vote to the transaction manager, saying whether to commit or roll back the transaction.
Commit phase—
In the second phase, based on the votes sent in from all the participating resource managers, the transaction manager decides whether to commit the transaction. If all resources have voted to commit, all the resources are updated. If one or more of the resources votes to abort, all the resources are returned to their previous state.
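The voting logic of the two phases can be sketched as a toy coordinator. This is a simplified illustration, not WebLogic's transaction manager: the Participant interface, DemoResource, and TwoPhaseCoordinator are hypothetical names, and the sketch leaves out logging, timeouts, and recovery, which a real implementation needs.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical participant contract; a real XA resource manager exposes a richer interface. */
interface Participant {
    boolean prepare();  // phase 1 vote: true = ready to commit, false = abort
    void commit();      // phase 2, sent when every participant voted to commit
    void rollback();    // phase 2, sent when at least one participant voted to abort
}

/** Toy resource manager whose vote is fixed up front, for demonstration only. */
class DemoResource implements Participant {
    final String name;
    private final boolean vote;
    String outcome = "active";

    DemoResource(String name, boolean vote) {
        this.name = name;
        this.vote = vote;
    }

    @Override public boolean prepare() {
        outcome = vote ? "prepared" : "refused";
        return vote;
    }

    @Override public void commit()   { outcome = "committed"; }

    @Override public void rollback() { outcome = "rolled back"; }
}

class TwoPhaseCoordinator {
    /** Runs both phases and returns true if the global transaction committed. */
    static boolean complete(List<? extends Participant> participants) {
        // Phase 1 (prepare): collect votes, stopping at the first vote to abort.
        boolean allVotedCommit = true;
        for (Participant p : participants) {
            if (!p.prepare()) {
                allVotedCommit = false;
                break;
            }
        }
        // Phase 2 (commit): send the same decision to every participant.
        for (Participant p : participants) {
            if (allVotedCommit) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allVotedCommit;
    }

    public static void main(String[] args) {
        DemoResource orders = new DemoResource("orders-db", true);
        DemoResource billing = new DemoResource("billing-db", false);
        boolean committed = complete(Arrays.asList(orders, billing));
        // prints "false rolled back / rolled back"
        System.out.println(committed + " " + orders.outcome + " / " + billing.outcome);
    }
}
```

A single vote to abort is enough to roll back every participant, including those that had already pledged to commit.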
How All the Participants Work Together to Manage a Transaction
Assume that we have a transactional application hosted on the application server and application data stored in two relational databases. All the resource managers declare themselves by contacting the local transaction manager (known as enlisting), and then wait for an execution request from the application program. A typical request might be to insert, update, and delete records in the database.
The application program sends a commit request for a transaction that updates records in two databases. The transaction manager initiates the two-phase commit protocol to communicate with each resource manager. First, the transaction manager queries each resource manager and asks whether it's prepared to commit the transaction. Based on the votes of all the participating resource managers, the transaction manager makes the decision whether to commit or roll back the transaction. It is the job of each resource manager to retain its data (both the original and the changed) pending the outcome of the global transaction. While the resource manager is holding this information, it is in the prepared state. While prepared, the resource manager locks the data modified by the transaction to isolate these changes from any other transactions. It remains in this state (and thus keeps that data locked) until it receives a commit (or roll back) message from the transaction manager.
After receiving all the resource managers' votes, the transaction manager decides whether to commit the transaction. If all the resource managers vote to commit, the transaction goes to the commit phase, and each resource manager receives the message to update its resource. If there is at least one vote to abort, the transaction manager sends the message to abort all operations.
Finally, the application program is notified via the transaction manager whether the commit was successful.
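Seen from a single resource manager, the same flow amounts to holding both the original and the changed data until the outcome arrives. The sketch below illustrates that idea for one account balance. AccountResourceManager and its methods are hypothetical, the rule that a negative balance causes a vote to abort is an assumption, and a real resource manager would persist both values to a transaction log rather than keep them in fields.

```java
/** Hypothetical resource manager guarding a single account balance. */
class AccountResourceManager {
    enum State { IDLE, ACTIVE, PREPARED }

    private long committedBalance; // the original value, visible to other transactions
    private long pendingBalance;   // the changed value, held until the outcome is known
    private State state = State.IDLE;

    AccountResourceManager(long openingBalance) {
        this.committedBalance = openingBalance;
    }

    /** Work requested by the application program inside the global transaction. */
    void update(long delta) {
        if (state == State.PREPARED) {
            throw new IllegalStateException("data is locked pending the transaction outcome");
        }
        if (state == State.IDLE) {
            pendingBalance = committedBalance;
            state = State.ACTIVE;
        }
        pendingBalance += delta;
    }

    /** Phase 1: vote. A yes vote keeps the data locked until commit or rollback. */
    boolean prepare() {
        if (state != State.ACTIVE) {
            throw new IllegalStateException("nothing to prepare");
        }
        if (pendingBalance < 0) { // assumed business rule: no overdrafts
            rollback();
            return false;
        }
        state = State.PREPARED;
        return true;
    }

    /** Phase 2, commit message: the changed value replaces the original. */
    void commit() {
        if (state != State.PREPARED) {
            throw new IllegalStateException("commit without a successful prepare");
        }
        committedBalance = pendingBalance;
        state = State.IDLE;
    }

    /** Phase 2, rollback message: the changed value is discarded. */
    void rollback() {
        pendingBalance = committedBalance;
        state = State.IDLE;
    }

    long visibleBalance() { return committedBalance; }

    State state() { return state; }

    public static void main(String[] args) {
        AccountResourceManager checking = new AccountResourceManager(800);
        AccountResourceManager savings = new AccountResourceManager(100);
        checking.update(-500);
        savings.update(500);
        // Playing the transaction manager: collect both votes, then send one outcome.
        boolean checkingVote = checking.prepare();
        boolean savingsVote = savings.prepare();
        System.out.println(checking.visibleBalance()); // 800: nothing is visible yet
        if (checkingVote && savingsVote) {
            checking.commit();
            savings.commit();
        } else {
            checking.rollback();
            savings.rollback();
        }
        System.out.println(checking.visibleBalance() + " " + savings.visibleBalance()); // 300 600
    }
}
```

Between prepare and commit, other readers still see the original balance and further updates are refused, which mirrors the locking described above.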
The Java Transaction Service
When reading about transactions you are bound to come across the term Java Transaction Service (or JTS). What exactly is JTS?
Simply stated, JTS is an implementation of the transaction processing monitor paradigm. Shortly after The Open Group released its DTP model, the Object Management Group (OMG) released its own Object Transaction Service (OTS). Although based on the X/Open model, OTS replaces the DTP interfaces with CORBA IDL-aware interfaces. In this model, objects can communicate via CORBA method calls over IIOP. It was at about this time that specifications were being drawn up for the J2EE implementation of transactional services. Not surprisingly, the design of J2EE's transaction support is heavily influenced by OTS. In fact, JTS implements OTS and acts as an interface between JTA and OTS.
Recall that we defined a transaction manager as a kind of transaction processing monitor (or TPM) that coordinates the execution of transactions on behalf of the application program. Transaction processing monitors first appeared en masse with IBM's CICS in the late 1960s. TPM-managed transactions were written to perform as procedural events on the database. The 1990s brought distributed object protocols such as CORBA and RMI to the scene. A new component-oriented implementation of the TPM was needed to manage those objects' transactions. The Java Transaction Service is that component-based implementation of the TPM: a component transaction-processing monitor (or CTM). It is this component-oriented JTS model that WebLogic implements for its transaction management.