The original version of this article was first published on IBM
developerWorks, and is property of Westtech Information Services. This
document is an updated version of the original article, and contains
various improvements made by the Gentoo Linux Documentation team.
This document is not actively maintained.
Software RAID in the new Linux 2.4 kernel, Part 1
Installation and a general introduction
The wonders of RAID
The 2.4 kernel has a number of nifty features and additions. One of
these is the inclusion of a modern Software RAID implementation -- yay!
Software RAID allows you to dramatically increase Linux disk IO
performance and reliability without buying expensive hardware RAID
controllers or enclosures. Because it's implemented in software, Linux
Software RAID is flexible, fast... and fun!
The concept behind Software RAID is simple -- it allows you to combine
two or more block devices (usually disk partitions) into a single RAID
device. So let's say you have three empty partitions:
hda3, hdb3, and hdc3. Using
Software RAID, you can combine these partitions and address them as a
single RAID device, /dev/md0. md0 can then
be formatted to contain a filesystem and used like any other
partition. There are also a number of different ways to configure a
RAID volume -- some maximize performance, others maximize
availability, while others provide a mixture of both.
In this article we'll look at two forms of RAID: linear mode and
RAID-0. Neither is technically a form of RAID at all, since RAID
stands for "redundant array of inexpensive disks" and neither mode
provides any kind of data redundancy. However, both modes -- especially
RAID-0 -- are very useful. After giving you a quick overview of these
two forms of "AID", I'll step you through the process of getting
Software RAID set up on your system.
Introduction to linear mode
Linear mode is one of the simplest methods of combining two or more
block devices into a RAID volume -- the method of simple
concatenation. If you have three partitions, hda3,
hdb3, and hdc3, and each is about 2GB, they
will create a resultant linear volume of 6GB. The first third of the
linear volume will reside on hda3, the middle third on
hdb3, and the last third on hdc3.
To configure a linear volume, you'll need at least two partitions
that you'd like to join together. They can be different sizes, and
they can even all reside on the same physical disk without
negatively affecting performance.
Linear mode is the best way to combine two or more partitions on the
same disk into a single volume. While doing this with any other RAID
technique will result in a dramatic loss of performance, linear mode
is saved from this problem because it doesn't write to its
constituent partitions in parallel (as all the other RAID modes do).
But for the same reason, linear mode doesn't scale in performance
the way RAID-0, RAID-4, RAID-5, and, to some extent, RAID-1 do.
In general, linear mode doesn't provide any kind of performance
improvement over traditional non-RAID partitions. Actually, if you
spread your linear volume over multiple disks, your volume is more
likely to become unavailable due to a random hard drive failure. The
probability of failure of a linear volume is roughly the sum of
the probabilities of failure of its constituent physical disks and
controllers. If one physical disk dies, the linear volume is
generally unrecoverable. Linear mode does not offer any additional
redundancy over using a single disk.
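For example, if each of three disks has a 2% chance of dying in a
given year, a linear volume spread across all three has a
1 - (0.98 x 0.98 x 0.98), or roughly 5.9%, chance of failing in that
year -- close to the 6% you'd get by simple addition, and about three
times the risk of keeping your data on a single disk.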
But linear mode is a great way to avoid repartitioning a single disk.
For example, say your second IDE drive has two unused partitions,
hdb1 and hdb3. And say you're unable to
repartition the drive due to critical data hanging out at
hdb2. You can still combine hdb1 and
hdb3 into a single, cohesive whole using linear mode.
Linear mode is also a good way to combine partitions of different
sizes on different disks when you just need a single big partition
(and don't really need to increase performance). But for any other
job there are better RAID technologies you can use.
Introduction to RAID-0 mode
RAID-0 is another one of those "RAID" modes that doesn't have any
"R" (redundancy) at all. Nevertheless, RAID-0 is immensely useful.
This is primarily because it offers the highest performance
potential of any form of RAID.
To set up a RAID-0 volume you'll need two or more equally (or
almost equally) sized partitions. The RAID-0 code will evenly
distribute writes (and thus reads) between all constituent
partitions. And by parallelizing reads and writes between all
constituent devices, RAID-0 has the benefit of multiplying IO
performance. Ignoring the complexities of controller and bus
bandwidth, you can expect a RAID-0 volume composed of two
partitions on two separate identical disks to offer nearly double
the performance of a traditional partition. Crank your RAID-0
volume up to three disks, and performance will nearly triple.
This is why a RAID-0 array of IDE disks can outperform the fastest
SCSI or FC-AL drive on the market. For truly blistering
performance, you can set up a bunch of SCSI or FC-AL drives in a
RAID-0 array. That's the beauty of RAID-0.
To create a RAID-0 volume, you'll need two or more equally sized
partitions located on separate disks. The capacity of the volume
will be equal to the combined capacity of the constituent
partitions. As with linear mode, you can combine block devices
from various sources (such as IDE and SCSI drives) into a single
volume with no problems.
If you're creating a RAID-0 volume using IDE disks, you should try
to use UltraDMA compliant disks and controllers for maximum
reliability. And you should use only one drive per IDE channel to
avoid sluggish performance -- a slave device, especially if it's
also part of the RAID-0 array, will slow things down so much as to
nearly eliminate any RAID-0 performance benefit. You may also need
to add an off-board IDE controller so that you have the extra IDE
channels you require.
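Before building the array, it's worth sanity-checking each IDE drive
with the hdparm utility -- a minimal check, assuming hdparm is
installed and hde is one of your RAID disks:
# hdparm -d /dev/hde    (report whether DMA is currently enabled)
# hdparm -d1 /dev/hde   (turn DMA on)
# hdparm -t /dev/hde    (time buffered reads for a rough throughput figure)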
If you're creating a RAID-0 volume out of SCSI devices, be aware
that the combined throughput of all the drives can potentially
exceed the maximum throughput of the SCSI (and potentially PCI) bus.
In such a case, the SCSI bus will be the performance-limiting
factor. If, for example, you have four drives that have a maximum
throughput of 15MB/sec set up on a 40MB/sec 68-pin Ultra Wide bus,
there will be times when the drives will saturate the bus, and
performance will reach an upper maximum of close to 40MB/sec. This
may be fine for your application (after all, 40MB/sec IO ain't bad!),
but you'd probably have identical peak IO performance from a RAID-0
volume that used only three drives.
From a reliability standpoint, RAID-0 has the same characteristics
as linear mode -- the more drives you add to the array, the higher
the probability of volume failure. And, like linear mode, the
death of a single drive will bring down the entire RAID-0 volume
and make it unrecoverable. To estimate the probability of
failure of your RAID-0 volume, simply add together the
probabilities of failure of all constituent drives, just as you
would for a linear volume.
RAID-0 is ideal for applications for which you need maximum IO
performance, since it's the highest-performing RAID mode available.
But remember that RAID-0 should only be used if you can tolerate a
slightly higher risk of volume failure.
If you're putting together a compute farm or web cluster, RAID-0 is
an excellent way to increase disk IO performance. Since in this
case you would already have an existing level of redundancy (lots
of spare machines), your resources would continue to be available
for the rare case that a machine with a failed hard drive needs to
be brought down for a drive replacement and reload.
Setting up Linux 2.4 Software RAID
There are two steps involved in getting your 2.4 system ready for
Software RAID. First, RAID support needs to be enabled in the kernel.
This normally involves recompiling and installing a new kernel,
unless you're already using a 2.4 series kernel with RAID support
enabled. Then the raidtools package needs to be compiled and installed. The
raidtools are the user-level tools that allow you to initialize,
start, stop, and control your RAID volumes. Once these two steps are
complete, you'll be able to create your own RAID volumes, create
filesystems on the volumes, mount them, etc.
I'm using kernel 2.4.0-test10 for this series. I recommend that you
use the most recent 2.4 kernel you can track down, which should at
least be kernel 2.4.0-test10 or later (but not 2.4.0-test11, which
had serious filesystem corruption problems). You can find a recent
kernel over at kernel.org,
and a tutorial showing you how to recompile and install a new kernel
from sources elsewhere on gentoo.org (see the Resources section later in this article).
Configuring the kernel
I recommend that you configure your kernel so that Software RAID
support is compiled in, rather than supported as modules. When you
type make menuconfig or make xconfig, you'll find the
Software RAID settings under the "Multi-device support (RAID and
LVM)" section. I also recommend that you enable everything
RAID-related here, including "Boot support" and "Auto Detect
support". This will allow the kernel to auto-start your RAID volume
at boot-time, as well as allow you to create a root RAID filesystem
if you so desire. Here's a snapshot of make menuconfig. The
last two options (LVM support) are not required, although I
compiled them into the kernel anyway:
Figure 4.1: Configuring the kernel for RAID
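If you'd rather double-check the configuration directly, the
RAID-related entries in the kernel's .config should end up looking
roughly like this (a sketch based on the 2.4-test option names -- they
vary a bit between 2.4 releases, so verify against your own kernel
source):
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_LINEAR=y
CONFIG_MD_RAID0=y
CONFIG_MD_RAID1=y
CONFIG_MD_RAID5=y
CONFIG_MD_BOOT=y
CONFIG_AUTODETECT_RAID=y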
Once the kernel is properly configured, install it and reboot. Now
let's track down the latest version of raidtools.
Before we can install raidtools we need to do a bit of searching to
find the latest version. You can generally find the raidtools
program at kernel.org. Now
track down the most recent "raidtools-0.90" archive (not
"raid0145"!). Currently it's "raidtools-19990824-0.90.tar.gz".
If you like living on the bleeding edge (and if you're using a
2.4.0-test kernel, then you do), you may want to head over to
RedHat (see Resources) and snag the
latest version of raidtools you can find. Once you've unpacked the
archive, installation is quick:
Code Listing 4.1: Installing raidtools
# cd raidtools-0.90
# ./configure
# make
# make install
With the new kernel running and raidtools installed, you can check
that the kernel's RAID support is alive by examining /proc/mdstat:
Code Listing 4.2: Examining RAID
# cat /proc/mdstat
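If RAID support is compiled in, this file will exist even before
you've created any volumes. You should see something like this (the
personalities line reflects whichever RAID modes you compiled in):
Personalities : [linear] [raid0] [raid1] [raid5]
read_ahead not set
unused devices: <none>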
OK, now it's time to prepare some disk partitions, of which you'll
need at least two. If you're using RAID-0, make sure they're on
separate disks and approximately the same size. It goes without
saying that the data on these partitions will be destroyed.
One other important note -- when you create your partitions, give
them the partition type FD. This will allow the Linux
kernel to recognize them as Linux RAID partitions, so they will be
autodetected and started at every boot. If you don't mark your
RAID partitions this way, you'll need to type
raidstart --all after every boot before you can mount your
RAID volumes. That can be annoying, so set the partition type to FD.
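In fdisk, the t command changes a partition's type; a session for
hde1 would look roughly like this (prompts may vary slightly between
fdisk versions):
# fdisk /dev/hde
Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): fd
Command (m for help): w
With the partitions ready, the next step is to describe your arrays
in /etc/raidtab, the configuration file that the raidtools read.
Here's a sample along the lines that the explanation below assumes --
a two-disk RAID-0 volume /dev/md0 built from hde1 and hdg1, plus a
linear volume /dev/md1 (its hdb1 and hdb3 members are just
illustrative placeholders):
raiddev /dev/md0
    raid-level            0
    nr-raid-disks         2
    persistent-superblock 1
    chunk-size            32
    device                /dev/hde1
    raid-disk             0
    device                /dev/hdg1
    raid-disk             1

raiddev /dev/md1
    raid-level            linear
    nr-raid-disks         2
    persistent-superblock 1
    chunk-size            32
    device                /dev/hdb1
    raid-disk             0
    device                /dev/hdb3
    raid-disk             1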
The raidtab syntax is fairly easy to figure out: each block of
directives begins with a raiddev entry specifying the RAID
volume that will be created. When you installed raidtools, the
Makefile created /dev/md0 through md15
for you, so they're available for use.
Next, raid-level selects the RAID mode for the volume (0 for our
striped volume, linear for the other), and nr-raid-disks should
specify the number of disks in your array. Then you set the
persistent-superblock to
1, telling the raid tools that when this volume is
created, a special superblock should be written to each
constituent device describing the configuration of the RAID
array. The Linux kernel uses this information to auto-detect and
start up RAID arrays at boot time, so you should make sure that
every RAID volume you create is configured to do this.
chunk-size specifies the granularity of the chunks used
for RAID-0 in kilobytes. In this example, our RAID-0 volume will
write to its constituent partitions in 32K blocks; that is, the
first 32K of the RAID volume maps to hde1, the
second 32K maps to hdg1, etc. We also specify a
chunk size for our /dev/md1 linear volume -- this is
just a dummy entry and doesn't mean anything.
Finally, you specify the devices that make up the volume. First
you specify the actual block device with a device line,
and then you immediately follow it with a raid-disk entry
that specifies its position in the array, starting with zero.
Once you've created your own /etc/raidtab file,
you're ready to do a one-time initialization of the array.
mkraid and creating the filesystem
OK. Our partitions are created, the raidtab file is in place --
now it's time to initialize our first volume by using the
mkraid command:
Code Listing 4.3: Initializing the partition
# mkraid /dev/md0
After this command completes, /dev/md0 will be
initialized and the md0 array will be started. If you type
cat /proc/mdstat, you should see something like this:
Code Listing 4.4: cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5]
read_ahead 1024 sectors
md0 : active raid0 hdg1 hde1
90069632 blocks 32k chunks
unused devices: <none>
Yay! Our RAID device is up and running. All we need to do now
is create a filesystem on it. To do this, use the
mke2fs command or the mkreiserfs command
(RAID-0 and ReiserFS are a great combination!):
Code Listing 4.5: An ext2 RAID device
# mke2fs /dev/md0
Code Listing 4.6: A ReiserFS RAID device
# mkreiserfs /dev/md0
Now your new filesystem can be mounted:
Code Listing 4.7: Mounting the new RAID device
# mkdir /mnt/raid
# mount /dev/md0 /mnt/raid
Feel free to add a /dev/md0 entry to your fstab.
It goes something like this (substitute ext2 for reiserfs if you
used mke2fs):
Code Listing 4.8: Editing fstab
/dev/md0 /mnt/raid reiserfs defaults 0 0
If you set the partition type correctly to FD, your
RAID volume will be auto-started at boot time. Now all that's
left to do is use and enjoy your new Software RAID volume.
And catch my second Software RAID article, in which we'll
take a look at some more advanced Software RAID functionality.