Previous Section  < Day Day Up >  Next Section

5.7 Concatenating SELECT Results with UNION

The UNION keyword enables you to concatenate the results from two or more SELECT statements. The syntax for using it is as follows:






SELECT ... UNION SELECT ... UNION SELECT ...


The result of such a statement consists of the rows retrieved by the first SELECT, followed by the rows retrieved by the second SELECT, and so on. Each SELECT must produce the same number of columns.

By default, UNION eliminates duplicate rows from the result set. If you want to retain all rows, replace the first instance of UNION with UNION ALL. (UNION ALL also is more efficient for the server to process because it need not perform duplicate removal.)

UNION is useful under the following circumstances:

  • You have similar information in multiple tables and you want to retrieve rows from all of them at once.

  • You want to select several sets of rows from the same table, but the conditions that characterize each set aren't easy to write as a single WHERE clause. UNION allows retrieval of each set with a simpler WHERE clause in its own SELECT statement; the rows retrieved by each are combined and produced as the final query result.

Suppose that you run three mailing lists, each of which is managed using a different MySQL-based software package. Each package uses its own table to store names and email addresses, but they have slightly different conventions about how the tables are set up. The tables used by the list manager packages look like this:






CREATE TABLE list1

(

    subscriber  CHAR(60),

    email       CHAR(60)

);



CREATE TABLE list2

(

    name        CHAR(255),

    address     CHAR(255)

);



CREATE TABLE list3

(

    email       CHAR(50),

    real_name   CHAR(30)

);


Note that each table contains similar types of information (names and email addresses), but they don't use the same column names or types, and they don't store the columns in the same order. How do you write a query that produces the combined subscriber list? Use UNION. It doesn't matter that the tables don't have exactly the same structure. To select their combined contents, name the columns from each table in the order you want to see them. A query to retrieve names and addresses from the tables looks like this:






SELECT subscriber, email FROM list1

UNION SELECT name, address FROM list2

UNION SELECT real_name, email FROM list3;


The first column of the result contains names and the second column contains email addresses. The names of the columns resulting from a UNION are taken from the names of the columns in the first SELECT statement. This means the result set column names are subscriber and email. If you provide aliases for columns in the first SELECT, the aliases are used as the output column names.

The types of the output columns also are taken from the values retrieved by the first SELECT. For the query shown, the types will be CHAR(60). (Note that because the list2 table contains CHAR(255) columns, it's possible for values retrieved from that table to be truncated in the UNION result. This limitation on UNION is removed in MySQL 4.1; column types are determined based on all retrieved values. In MySQL 4.0, you can avoid the problem by selecting from the table with the widest columns first.)

ORDER BY and LIMIT clauses can be placed at the end of a UNION to sort or limit the result set as a whole. Columns named in such an ORDER BY should refer to columns in the first SELECT of the statement. (This is a consequence of the fact that the first SELECT determines the result set column names.) The following statement sorts the result of the UNION by email address and returns the first 10 rows of the combined result:






SELECT subscriber, email FROM list1

UNION SELECT name, address FROM list2

UNION SELECT real_name, email FROM list3

ORDER BY email LIMIT 10;


ORDER BY and LIMIT clauses also can be applied to individual SELECT statements within a UNION: Add the clauses to the appropriate SELECT and surround the resulting SELECT with parentheses. In this case, an ORDER BY should refer to columns of the particular SELECT with which it's associated. The following query sorts the result of each SELECT by email address and returns the first five rows from each one:






(SELECT subscriber, email FROM list1 ORDER BY email LIMIT 5)

UNION (SELECT name, address FROM list2 ORDER BY address LIMIT 5)

UNION (SELECT real_name, email FROM list3 ORDER BY email LIMIT 5);


    Previous Section  < Day Day Up >  Next Section