Bottom Halves

The job of bottom halves is to perform any interrupt-related work not performed by the interrupt handler itself. In an ideal world, this is nearly all the work because you want the interrupt handler to perform as little work (and in turn be as fast) as possible. The goal is simply to have interrupt handlers return as quickly as possible.

Nonetheless, the interrupt handler must perform some of the work. For example, the interrupt handler almost assuredly needs to acknowledge the receipt of the interrupt with the hardware. It might also need to copy data to or from the hardware. This work is timing sensitive, so it makes sense to perform it in the interrupt handler itself.

Almost anything else is fair game for performing in the bottom half. For example, if you copy data from the hardware into memory in the top half, it certainly makes sense to process it in the bottom half. Unfortunately, no hard and fast rules exist about what work to perform wherethe decision is left entirely up to the device driver author. Although no arrangement is wrong, an arrangement can easily be suboptimal. Remember, interrupt handlers run asynchronously, with at least the current interrupt line disabled. Minimizing their duration is important. No strict rules about how to divide the work between the top and bottom half exist, but a couple of useful tips might help:

If the work is time sensitive, perform it in the interrupt handler.
If the work is related to the hardware itself, perform it in the interrupt handler.
If the work needs to ensure that another interrupt (particularly the same interrupt) does not interrupt it, perform it in the interrupt handler.
For everything else, consider performing the work in the bottom half.

When attempting to write your own device driver, looking at other interrupt handlers and their corresponding bottom halves will help. When deciding how to divide your interrupt processing work between the top and bottom half, ask yourself what has to be in the top half and what can be in the bottom half. Generally, the quicker the interrupt handler executes, the better.

Why Bottom Halves?

It is crucial to understand why to defer work, and when exactly to defer it. You want to limit the amount of work you perform in an interrupt handler because interrupt handlers run with the current interrupt line disabled on all processors. Worse, handlers that register with SA_INTERRUPT run with all local interrupts disabled (plus the local interrupt line globally disabled). Minimizing the time spent with interrupts disabled is important to system response and performance. Add to this the fact that interrupt handlers run asynchronously with respect to other codeeven other interrupt handlersand then it is clear that you should work to minimize how long interrupt handlers run. The solution is to defer some of the work until later.

But when is "later"? The important thing to realize is that later is often simply not now. The point of a bottom half is not to do work at some specific point in the future, but simply to defer work until any point in the future when the system is less busy and interrupts are again enabled. Often, bottom halves run immediately after the interrupt returns. The key is that they run with all interrupts enabled.

Not just Linux, but many operating systems separate the processing of hardware interrupts into two parts. The top half is quick and simple and runs with some or all interrupts disabled. The bottom half (however it is implemented) runs later with all interrupts enabled. This design keeps system response time low by running with interrupts disabled for as little time as necessary.

A World of Bottom Halves

Unlike the top half, which is implemented entirely via the interrupt handler, multiple mechanisms are available for implementing a bottom half. These mechanisms are different interfaces and subsystems that enable you to implement bottom halves. Whereas the previous chapter looked at just a single way of implementing interrupt handlers, this chapter looks at multiple methods of implementing bottom halves. In fact, over the course of Linux's history there have been many bottom-half mechanisms. Confusingly, some of these mechanisms have similar or even really dumb names. It requires a special type of programmer to name bottom halves.

This chapter discusses both the design and implementation of the bottom-half mechanisms that exist in 2.6. We also discuss how to use them in the kernel code you write. The old, but long since removed, bottom-half mechanisms are historically significant, and so they are mentioned when relevant.

In the beginning, Linux provided only the "bottom half" for implementing bottom halves. This name was logical because at the time that was the only means available for deferring work. The infrastructure was also known as "BH," which is what we will call it to avoid confusion with the generic term "bottom half." The BH interface was very simple, like most things in those good old days. It provided a statically created list of 32 bottom halves. The top half could mark whether the bottom half would run by setting a bit in a 32-bit integer. Each BH was globally synchronized. No two could run at the same time, even on different processors. This was easy to use, yet inflexible; simple, yet a bottleneck.

Later on, the kernel developers introduced task queues both as a method of deferring work and as a replacement for the BH mechanism. The kernel defined a family of queues. Each queue contained a linked list of functions to call. The queued functions were run at certain times, depending on which queue they were in. Drivers could register their bottom halves in the appropriate queue. This worked fairly well, but it was still too inflexible to replace the BH interface entirely. It also was not lightweight enough for performance-critical subsystems, such as networking.

During the 2.3 development series, the kernel developers introduced softirqs and tasklets. With the exception of compatibility with existing drivers, softirqs and tasklets were capable of completely replacing the BH interface^[1]. Softirqs are a set of 32 statically defined bottom halves that can run simultaneously on any processoreven two of the same type can run concurrently. Tasklets, which have an awful and confusing name^[2], are flexible, dynamically created bottom halves that are built on top of softirqs. Two different tasklets can run concurrently on different processors, but two of the same type of tasklet cannot run simultaneously. Thus, tasklets are a good tradeoff between performance and ease of use. For most bottom-half processing, the tasklet is sufficient. Softirqs are useful when performance is critical, such as with networking. Using softirqs requires more care, however, because two of the same softirq can run at the same time. In addition, softirqs must be registered statically at compile-time. Conversely, code can dynamically register tasklets.

^[1] It is nontrivial to convert BHs to softirqs or tasklets because BHs are globally synchronized and, therefore, assume that no other BH is running during their execution. The conversion did eventually happen, however, in 2.5.

^[2] They have nothing to do with tasks. Think of a tasklet as a simple and easy-to-use softirq.

To further confound the issue, some people refer to all bottom halves as software interrupts or softirqs. In other words, they call both the softirq mechanism and bottom halves in general softirqs. Ignore those people. They run with the same crowd that named the BH and tasklet mechanisms.

While developing the 2.5 kernel, the BH interface was finally tossed to the curb as all BH users were converted to the other bottom-half interfaces. Additionally, the task queue interface was replaced by the work queue interface. Work queues are a very simple yet useful method of queueing work to later be performed in process context. We will get to them later.

Consequently, today 2.6 has three bottom-half mechanisms in the kernel: softirqs, tasklets, and work queues. The kernel used to have the BH and task queue interfaces, but today they are mere memories.

Kernel Timers

Another mechanism for deferring work is kernel timers. Unlike the mechanisms discussed in the chapter thus far, timers defer work for a specified amount of time. That is, although the tools discussed in this chapter are useful to defer work to any time but now, you use timers to defer work until at least a specific time has elapsed.

Therefore, timers have different uses than the general mechanisms discussed in this chapter. A full discussion of timers is given in Chapter 10, "Timers and Time Management."

Bottom Half Confusion

This is some seriously confusing stuff, but really it involves just naming issues. Let's go over it again.

"Bottom half" is a generic operating system term referring to the deferred portion of interrupt processing, so named because it represents the second, or bottom, half of the interrupt processing solution. In Linux, the term currently has this meaning, too. All the kernel's mechanisms for deferring work are "bottom halves." Some people also confusingly call all bottom halves "softirqs," but they are just being annoying.

"Bottom half" also refers to the original deferred work mechanism in Linux. This mechanism is also known as a "BH," so we call it by that name now and leave the former as a generic description. The BH mechanism was deprecated a while back and fully removed in 2.5.

Currently, there are three methods for deferring work: softirqs, tasklets, and work queues. Tasklets are built on softirqs and work queues are entirely different. Table 7.1 presents a history of bottom halves.

Table 7.1. Bottom Half Status
Bottom Half
Status
BH
Removed in 2.5
Task queues
Removed in 2.5
Softirq
Available since 2.3
Tasklet
Available since 2.3
Work queues
Available since 2.5

With this naming confusion settled, let's look at the individual mechanisms.