
Solaris Volume Manager (SVM)

Objectives:

  • Analyze and explain SVM concepts (logical volumes, soft partitions, state databases, hot spares, and hot spare pools).

  • Create the state database, build a mirror, and unmirror the root file system.

SVM, formerly called Solstice DiskSuite, comes bundled with the Solaris 10 operating system and uses virtual disks, called volumes, to manage physical disks and their associated data. A volume is functionally identical to a physical disk from the point of view of an application. You may also hear volumes referred to as virtual or pseudo devices.

A recent feature of SVM is soft partitions, which break the traditional eight-slices-per-disk barrier by allowing disks, or logical volumes, to be subdivided into many more partitions. One reason for doing this might be to create more manageable file systems, given the ever-increasing capacity of disks.
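For example, soft partitions are created with the -p option of the metainit command. The following is a minimal sketch; the slice c1t1d0s0 and the volume d100 are hypothetical names used only for illustration:

  # Create a 4-Gbyte soft partition (d20) within the physical slice c1t1d0s0
  metainit d20 -p c1t1d0s0 4g

  # A soft partition can also be created on top of an existing volume (d100)
  metainit d21 -p d100 2g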

Note

SVM Terminology If you are familiar with Solstice DiskSuite, you'll remember that virtual disks were called metadevices. SVM uses a special driver, called the metadisk driver, to coordinate I/O to and from physical devices and volumes, enabling applications to treat a volume like a physical device. This type of driver is also called a logical, or pseudo driver.


In SVM, volumes are built from standard disk slices that have been created using the format utility. Using either the SVM command-line utilities or the graphical user interface of the Solaris Management Console (SMC), the system administrator creates each device by executing commands or dragging slices onto one of four types of SVM objects: volumes, disk sets, state database replicas, and hot spare pools. These objects, together with soft partitions and the hot spares that make up a hot spare pool, are described in Table 10.2.

Table 10.2. SVM Elements

Volume: A volume, or metadevice, is a group of physical slices that appears to the system as a single, logical device. A volume is used to increase storage capacity and data availability. Solaris 10 SVM can support up to 8,192 logical volumes per disk set (described later in this table), but the default is 128 logical volumes, named d0 through d127. The various types of volumes are described in the next section of this chapter.

State database: A state database stores information about the state of the SVM configuration. Each state database is a collection of multiple, replicated database copies, and each copy is referred to as a state database replica. SVM cannot operate until you have created the state database and its replicas. You should create at least three state database replicas, because the validation process requires a majority (half + 1) of the replicas to be consistent with each other before the system will start up correctly. Each replica should ideally be located on a separate physical disk (and preferably on a separate disk controller for added resilience). An example of creating the state database follows this table.

Soft partition: A soft partition is a means of dividing a disk or volume into as many partitions as needed, overcoming the traditional limit of eight. This is done by creating logical partitions within physical disk slices or logical volumes.

Disk set: A disk set is a set of disk drives containing state database replicas, volumes, and hot spares that can be shared exclusively, but not at the same time, by multiple hosts. If one host fails, another host can take over the failed host's disk set. This type of failover configuration is referred to as a clustered environment.

Hot spare: A hot spare is a slice that is reserved for use in case of a slice failure in another volume, such as a submirror or a RAID 5 metadevice. It is used to increase data availability.

Hot spare pool: A hot spare pool is a collection of hot spares. A hot spare pool can be used to provide a number of hot spares for specific volumes or metadevices. For example, one pool may provide resilience for the root disk, while another provides resilience for data disks.
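For illustration, the following commands create the initial state database and a simple hot spare pool. This is a minimal sketch; all of the device names (c0t0d0s7, c1t0d0s7, c2t0d0s7, c2t1d0s2, and c2t2d0s2) are hypothetical slices set aside for the purpose:

  # Create the initial state database (-f is required the first time),
  # placing one replica on each of three slices on separate controllers
  metadb -a -f c0t0d0s7 c1t0d0s7 c2t0d0s7

  # Verify the status of the replicas
  metadb -i

  # Create a hot spare pool (hsp001) containing one hot spare slice
  metainit hsp001 c2t1d0s2

  # Add a second hot spare to the same pool
  metahs -a hsp001 c2t2d0s2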


SVM Volumes

The types of SVM volumes you can create using Solaris Management Console or the SVM command-line utilities are concatenations, stripes, concatenated stripes, mirrors, and RAID 5 volumes. All of the SVM volumes are described in the following sections.

Note

No more Transactional Volumes As of Solaris 10, you should note that transactional volumes are no longer available with the Solaris Volume Manager (SVM). Use UFS logging to achieve the same functionality.


Concatenations

Concatenations work much the same way the Unix cat command is used to concatenate two or more files into one larger file. When partitions are concatenated, the component blocks are addressed sequentially: data is written to the first available slice until it is full, and writing then continues on the next available slice. The file system can use the entire concatenation, even though it spreads across multiple disk drives. This type of volume provides no data redundancy, and the entire volume fails if a single slice fails. A concatenation can contain disk slices of different sizes because they are merely joined together.
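As a sketch, a two-slice concatenation is created with metainit; the device names below are hypothetical:

  # Create a concatenation (d25) of two slices: "2 1 ... 1 ..." means
  # two stripes, each consisting of one slice, joined end to end
  metainit d25 2 1 c0t1d0s2 1 c0t2d0s2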

Stripes

A stripe is similar to a concatenation, except that the addressing of the component blocks is interlaced across all of the slices comprising the stripe rather than sequential. In other words, all disks are accessed at the same time, in parallel. Striping is used to gain performance: when data is striped across disks, multiple controllers can access data simultaneously. An interlace refers to a grouped segment of blocks on a particular slice; the default interlace value is 16K. Choosing a different interlace value can increase performance. For example, with a stripe containing five physical disks and the default 16K interlace, a 64K I/O request is serviced as four 16K chunks read simultaneously, because each sequential chunk resides on a separate slice.

The size of the interlace is configured when the stripe is created and cannot be modified afterward without destroying and re-creating the stripe. In determining the interlace size, the specific application must be taken into account. If, for example, most of the I/O requests are for large amounts of data, say 10 megabytes, then an interlace size of 2 megabytes produces a significant performance increase on a five-disk stripe, because each request is spread evenly across all five disks. You should note that, unlike a concatenation, the components making up a stripe must all be the same size.
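The interlace is specified with the -i option of metainit when the stripe is created. A minimal sketch, with hypothetical device names:

  # Create a stripe (d10) across two slices with a 32-Kbyte interlace:
  # "1 2" means one stripe consisting of two slices
  metainit d10 1 2 c1t1d0s2 c1t2d0s2 -i 32k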

Concatenated Stripes

A concatenated stripe is a stripe that has been expanded by concatenating additional striped slices.
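A sketch of creating a concatenated stripe, again with hypothetical device names; note that each stripe within the concatenation can carry its own interlace value:

  # Create a concatenated stripe (d35) of two stripes, each of two slices;
  # the first stripe uses a 16-Kbyte interlace, the second a 32-Kbyte interlace
  metainit d35 2 2 c1t1d0s2 c1t2d0s2 -i 16k 2 c1t3d0s2 c1t4d0s2 -i 32k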

Mirrors

A mirror is composed of one or more stripes or concatenations. The volumes that are mirrored are called submirrors. SVM keeps duplicate copies of the data on multiple physical disks but presents one virtual disk to the application. Every write to the single logical device (the mirror) is replicated to all of the underlying devices (the submirrors), while read operations are distributed across the submirrors. This provides redundancy of data in the event of a disk or hardware failure.
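A minimal sketch of building a two-way mirror from two single-slice concatenations (device names hypothetical):

  # Create two single-slice concatenations to act as submirrors
  metainit d41 1 1 c0t0d0s5
  metainit d42 1 1 c1t0d0s5

  # Create a one-way mirror (d40) using the first submirror
  metainit d40 -m d41

  # Attach the second submirror; SVM then synchronizes the two copies
  metattach d40 d42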

Several mirror options can be defined when the mirror is initially created, or changed following the setup. The options allow, for example, reads to be distributed across the submirror components, improving read performance. Table 10.3 describes the mirror read policies that can be configured.

Table 10.3. Mirror Read Policies

Round Robin: This is the default policy; it distributes the reads across submirrors.

Geometric: Reads are divided between the submirrors based on the logical disk block address.

First: All reads are directed to the first submirror only.


Write performance can also be improved by directing writes to all submirrors simultaneously. The trade-off with this option, however, is that all submirrors will be in an unknown state if a failure occurs. Table 10.4 describes the write policies that can be configured for mirror volumes.

Table 10.4. Mirror Write Policies

Parallel: This is the default policy; it directs the write operation to all submirrors simultaneously.

Serial: Writes to one submirror must complete before writes to the next submirror are started.
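Both policies are set with the metaparam command, either when the mirror is created or afterward. A sketch, assuming the hypothetical mirror d40 built earlier:

  # Change the read policy of mirror d40 to geometric
  metaparam -r geometric d40

  # Change the write policy of mirror d40 to serial
  metaparam -w serial d40

  # Display the current parameters of the mirror
  metaparam d40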


If a submirror goes offline, it must be resynchronized when the fault is resolved and it returns to service.

Exam Alert

Read and Write Policies Make sure you are familiar with both the read and the write policies, as there have been exam questions asking for the valid mirror policies.


RAID 5 Volumes

A RAID 5 volume stripes the data, as described in the "Stripes" section earlier, but in addition to striping, RAID 5 protects the data by using parity information. If data goes missing, it can be regenerated from the remaining data and the parity information. A RAID 5 metadevice is composed of at least three slices. Some space is allocated to parity information, which is distributed across all slices in the RAID 5 metadevice. A striped metadevice performs better than a RAID 5 metadevice because RAID 5 carries a parity overhead, but a plain stripe provides no data protection (redundancy).
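A minimal sketch of creating a RAID 5 metadevice (hypothetical device names); at least three slices are required:

  # Create a RAID 5 volume (d45) across three slices with a 32-Kbyte interlace
  metainit d45 -r c2t1d0s2 c3t1d0s2 c4t1d0s2 -i 32k

  # Monitor the initialization of the new volume
  metastat d45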

