Previous Page
Next Page

Using Aggregate Functions

It is often necessary to summarize data without actually retrieving it all, and MySQL provides special functions for this purpose. Using these functions, MySQL queries are often used to retrieve data for analysis and reporting purposes. Examples of this type of retrieval are

  • Determining the number of rows in a table (or the number of rows that meet some condition or contain a specific value)

  • Obtaining the sum of a group of rows in a table

  • Finding the highest, lowest, and average values in a table column (either for all rows or for specific rows)

In each of these examples, you want a summary of the data in a table, not the actual data itself. Therefore, returning the actual table data would be a waste of time and processing resources (not to mention bandwidth). To repeat, all you really want is the summary information.

To facilitate this type of retrieval, MySQL features a set of aggregate functions, some of which are listed in Table 12.1. These functions enable you to perform all the types of retrieval just enumerated.

Table 12.1. SQL Aggregate Functions

Function

Description

AVG()

Returns a column's average value

COUNT()

Returns the number of rows in a column

MAX()

Returns a column's highest value

MIN()

Returns a column's lowest value

SUM()

Returns the sum of a column's values


New Term

Aggregate Functions Functions that operate on a set of rows to calculate and return a single value.


The use of each of these functions is explained in the following sections.

Note

Standard Deviation A series of standard deviation aggregate functions are also supported by MySQL, but they are not covered in this book.


The AVG() Function

AVG() is used to return the average value of a specific column by counting both the number of rows in the table and the sum of their values. AVG() can be used to return the average value of all columns or of specific columns or rows.

This first example uses AVG() to return the average price of all the products in the products table:

Input

SELECT AVG(prod_price) AS avg_price
FROM products;

Output

+-----------+
| avg_price |
+-----------+
| 16.133571 |
+-----------+

Analysis

The previous SELECT statement returns a single value, avg_price, that contains the average price of all products in the products table. avg_price is an alias as explained in Chapter 10, "Creating Calculated Fields."

AVG() can also be used to determine the average value of specific columns or rows. The following example returns the average price of products offered by a specific vendor:

Input

SELECT AVG(prod_price) AS avg_price
FROM products
WHERE vend_id = 1003;

Output

+-----------+
| avg_price |
+-----------+
| 13.212857 |
+-----------+

Analysis

This SELECT statement differs from the previous one only in that this one contains a WHERE clause. The WHERE clause filters only products with a vend_id of 1003, and, therefore, the value returned in avg_price is the average of just that vendor's products.

Caution

Individual Columns Only AVG() may only be used to determine the average of a specific numeric column, and that column name must be specified as the function parameter. To obtain the average value of multiple columns, multiple AVG() functions must be used.


Note

NULL Values Column rows containing NULL values are ignored by the AVG() function.


The COUNT() Function

COUNT() does just that: It counts. Using COUNT(), you can determine the number of rows in a table or the number of rows that match a specific criterion.

COUNT() can be used two ways:

  • Use COUNT(*) to count the number of rows in a table, whether columns contain values or NULL values.

  • Use COUNT(column) to count the number of rows that have values in a specific column, ignoring NULL values.

This first example returns the total number of customers in the customers table:

Input

SELECT COUNT(*) AS num_cust
FROM customers;

Output

+----------+
| num_cust |
+----------+
|        5 |
+----------+

Analysis

In this example, COUNT(*) is used to count all rows, regardless of values. The count is returned in num_cust.

The following example counts just the customers with an email address:

Input

SELECT COUNT(cust_email) AS num_cust
FROM customers;

Output

+----------+
| num_cust |
+----------+
|        3 |
+----------+

Analysis

This SELECT statement uses COUNT(cust_email) to count only rows with a value in the cust_email column. In this example, cust_email is 3 (meaning that only three of the five customers have email addresses).

Note

NULL Values Column rows with NULL values in them are ignored by the COUNT() function if a column name is specified, but not if the asterisk (*) is used.


The MAX() Function

MAX() returns the highest value in a specified column. MAX() requires that the column name be specified, as seen here:

Input

SELECT MAX(prod_price) AS max_price
FROM products;

Output

+-----------+
| max_price |
+-----------+
|     55.00 |
+-----------+

Analysis

Here MAX() returns the price of the most expensive item in products table.

Tip

Using MAX() with Non-Numeric Data Although MAX() is usually used to find the highest numeric or date values, MySQL allows it to be used to return the highest value in any column including textual columns. When used with textual data, MAX() returns the row that would be the last if the data were sorted by that column.


Note

NULL Values Column rows with NULL values in them are ignored by the MAX() function.


The MIN() Function

MIN() does the exact opposite of MAX(); it returns the lowest value in a specified column. Like MAX(), MIN() requires that the column name be specified, as seen here:

Input

SELECT MIN(prod_price) AS min_price
FROM products;

Output

+-----------+
| min_price |
+-----------+
| 2.50      |
+-----------+

Analysis

Here MIN() returns the price of the least expensive item in products table.

Tip

Using MIN() with Non-Numeric Data As with the MAX() function, MySQL allows MIN() to be used to return the lowest value in any columns including textual columns. When used with textual data, MIN() returns the row that would be first if the data were sorted by that column.


Note

NULL Values Column rows with NULL values in them are ignored by the MIN() function.


The SUM() Function

SUM() is used to return the sum (total) of the values in a specific column.

Here is an example to demonstrate this. The orderitems table contains the actual items in an order, and each item has an associated quantity. The total number of items ordered (the sum of all the quantity values) can be retrieved as follows:

Input

SELECT SUM(quantity) AS items_ordered
FROM orderitems
WHERE order_num = 20005;

Output

+---------------+
| items_ordered |
+---------------+
| 19            |
+---------------+

Analysis

The function SUM(quantity) returns the sum of all the item quantities in an order, and the WHERE clause ensures that just the right order items are included.

SUM() can also be used to total calculated values. In this next example the total order amount is retrieved by totaling item_price*quantity for each item:

Input

SELECT SUM(item_price*quantity) AS total_price
FROM orderitems
WHERE order_num = 20005;

Output

+-------------+
| total_price |
+-------------+
|      149.87 |
+-------------+

Analysis

The function SUM(item_price*quantity) returns the sum of all the expanded prices in an order, and again the WHERE clause ensures that just the correct order items are included.

Tip

Performing Calculations on Multiple Columns All the aggregate functions can be used to perform calculations on multiple columns using the standard mathematical operators, as shown in the example.


Note

NULL Values Column rows with NULL values in them are ignored by the SUM() function.



Previous Page
Next Page