I hope to turn this into a general, easy-to-follow guide to setting up
RAID-5 and LVM on a modern Linux system. However, for now it’s basically
a collection of my notes as I experimented on my own systems.
Please note that my own experimentation was based on the RAID and LVM
implementations under Fedora Core 3 & 4, as well as Red Hat Enterprise
Linux 4, all of which are based on the 2.6 series of kernels. These
instructions may or may not work with other versions or distros. I’m
not an expert (yet) in either software RAID or LVM, so please
use the comment section
below for corrections and comments.
- What is RAID and LVM
- Initial setup of a RAID-5 array
- Initial setup of LVM on top of RAID
- Handling a Drive Failure
- Common Glitches
- Other Useful Resources
- Expanding an Array/Filesystem
What is RAID and LVM
RAID is usually defined as Redundant Array of Inexpensive Disks.
It is normally used to spread data among several physical hard drives with
enough redundancy that should any drive fail the data will still be
intact. Once created, a RAID array appears to be one device which can
be used pretty much like a regular partition.
There are several kinds of RAID but I will only refer to
the two most common here.
The first is RAID-1, which is also known as mirroring.
RAID-1 is normally built from two essentially
identical drives, each holding a complete copy of the data. The second,
and the one I will mostly refer to in this guide, is RAID-5, which is
set up using three or more drives with the data spread so that any
one drive failing will not result in data loss. The Red Hat website
has a great overview of the
There is one limitation with Linux Software RAID that a
partition can only reside on a RAID-1 array.
Linux supports several hardware RAID devices as well as software
RAID, which allows you to use any IDE or SCSI drives as the physical
devices. In all cases I’ll refer to software RAID.
LVM stands for Logical Volume Manager and
is a way of grouping drives and/or partitions so that,
instead of dealing with hard and fast physical partitions, the data is
managed on a virtual basis where the virtual partitions can be resized.
The Red Hat website has a great overview of the
Logical Volume Manager.
There is one limitation that a LVM cannot be used for the
Initial setup of a RAID-5 array
I recommend you experiment with setting up and
managing RAID and LVM systems before using them on an important
filesystem. One way I was able to do it was to take an old hard drive,
create a bunch of partitions on it (8 or so should be enough)
and try combining them into RAID arrays. In my testing I created two
RAID-5 arrays, each with 3 partitions. You can then manually fail
and hot remove the partitions from the array and then add them
back to see how the recovery process works. You’ll get a warning
about the partitions sharing a physical disk but you can ignore that
since it’s only for experimentation.
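The practice cycle described above might look something like the sketch below. It only prints the commands, so you can review them before running anything as root. The array /dev/md1 and the partition /dev/hdc7 are made-up names for illustration; --fail, --remove and --add are the long forms of mdadm’s -f, -r and -a options.

```shell
# Dry-run sketch: prints the mdadm commands for one fail/remove/re-add
# cycle instead of executing them. practice_cycle is a hypothetical
# helper, and the device names below are placeholders.
practice_cycle() {
    md=$1      # the test array
    part=$2    # one member partition of that array
    echo "mdadm $md --fail $part"     # mark the partition as faulty
    echo "mdadm $md --remove $part"   # hot remove it from the array
    echo "mdadm $md --add $part"      # add it back; a rebuild starts
    echo "mdadm --detail $md"         # check the array status afterwards
}

practice_cycle /dev/md1 /dev/hdc7
```

Once the printed commands look right for your test setup, you can run them one at a time and check the array status between steps.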
In my case I have two systems with RAID arrays, one with two
73G SCSI drives running RAID-1 (mirroring) and my other test
system is configured with three 120G IDE drives running RAID-5.
In most cases I will refer to my RAID-5 configuration as that
will be more typical.
I have an extra IDE controller in my system to allow me to
support the use of more than 4 IDE devices which caused a very
odd drive assignment. The order doesn’t seem to bother the
Linux kernel so it doesn’t bother me.
My basic configuration is as follows:
hda 120G drive
hdb 120G drive
hde 60G boot drive not on RAID array
hdf 120G drive
hdg CD-ROM drive
The first step is to create the physical partitions on each drive
that will be part of the RAID array. In my case I want to use each
120G drive in the array in its entirety. All the drives are
partitioned identically so, for example, this is how
hda is partitioned:
Disk /dev/hda: 120.0 GB, 120034123776 bytes
16 heads, 63 sectors/track, 232581 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1      232581   117220792+  fd  Linux raid autodetect
So now, with all three drives partitioned with id
fd (Linux raid autodetect), you can go
ahead and combine the partitions into a RAID array:
# /sbin/mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/hdb1 /dev/hda1 /dev/hdf1
Wow, that was easy. That created a special device
/dev/md0 which can be used instead of a
physical partition. You can check on the status of that RAID array
with the mdadm command:
# /sbin/mdadm --detail /dev/md0
        Version : 00.90.01
  Creation Time : Wed May 11 20:00:18 2005
     Raid Level : raid5
     Array Size : 234436352 (223.58 GiB 240.06 GB)
    Device Size : 117218176 (111.79 GiB 120.03 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Jun 10 04:13:11 2005
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 36161bdd:a9018a79:60e0757a:e27bb7ca
         Events : 0.10670

    Number   Major   Minor   RaidDevice State
       0       3        1        0      active sync   /dev/hda1
       1       3       65        1      active sync   /dev/hdb1
       2      33       65        2      active sync   /dev/hdf1
The important line to check is the State line, which should
say clean; otherwise there might be a problem.
At the bottom, make sure the State column says
active sync for every device, which means each device is
actively in the array. You could also have a spare device
on hand should any drive fail. If you have a spare
you’ll see it listed as such here.
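If you want to check the State line from a script rather than by eye, a minimal sketch could look like the following. The array_state helper is my own invention, not part of mdadm, and it assumes the "State : ..." layout shown in the output above.

```shell
# array_state: hypothetical helper. Reads `mdadm --detail` output on
# stdin and prints the value of the array-wide "State :" line. The
# per-device table header also contains the word State, but not at the
# start of a line followed by " : ", so it is not matched.
array_state() {
    sed -n 's/^[[:space:]]*State : *//p'
}

# Sample taken from the output above. On a live system you would use:
#   /sbin/mdadm --detail /dev/md0 | array_state
sample='          State : clean
 Active Devices : 3'
state=$(printf '%s\n' "$sample" | array_state)

# Check for "degraded" first, because a degraded array reports
# "clean, degraded", which also starts with "clean".
case "$state" in
    *degraded*)     echo "array is degraded: $state" ;;
    clean*|active*) echo "array looks healthy: $state" ;;
    *)              echo "unexpected state: $state" ;;
esac
```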
One thing you’ll notice above, if you’re paying attention, is that
the size of the array is 240G even though I have three 120G drives
in it. That’s because RAID-5 uses the equivalent of one drive’s capacity
for the parity data needed to survive the failure of one of the drives,
so the usable space is that of two drives, not three.
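The arithmetic can be checked against the numbers mdadm reported: the usable size of a RAID-5 array is the number of devices minus one, multiplied by the size of each device. The raid5_usable_kb function below is just an illustrative helper of mine, not a real tool.

```shell
# raid5_usable_kb: hypothetical helper. One device's worth of space is
# used for parity, so usable = (devices - 1) * per-device size.
raid5_usable_kb() {
    devices=$1
    device_kb=$2
    echo $(( (devices - 1) * device_kb ))
}

# Values from the mdadm --detail output above:
# 3 devices with a Device Size of 117218176 each.
raid5_usable_kb 3 117218176    # prints 234436352, matching Array Size
```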
Initial setup of LVM on top of RAID
Now that we have the /dev/md0 device
we can create a Logical Volume on top of it. Why would you want
to do that?
If I were to build an ext3 filesystem directly on top of the RAID
device and someday wanted to increase its capacity, I wouldn’t
be able to do that without backing up the data, building a new RAID
array and restoring my data. Using LVM allows me to expand
(or contract) the size of the filesystem without disturbing the
Anyway, here are the steps to then add this RAID array to the LVM
system. The first command, pvcreate,
will “initialize a disk or partition for use by LVM”. The second command,
vgcreate, will then create the Volume Group,
which in my case I called lvm-raid:
# pvcreate /dev/md0
# vgcreate lvm-raid /dev/md0
The default value for the physical extent size can be too low
for a large RAID array. In those cases you’ll need to specify the -s
option with a larger than default physical extent size. The default is
only 4MB as of the version in Fedora Core 5. The maximum number of
physical extents is approximately 65k so take your maximum volume size and divide
it by 65k then round it to the next nice round number.
For example, to successfully
create a 550G RAID, figure that’s approximately 550,000 megabytes and divide by
65,000, which gives you roughly 8.46. Round it up to the next nice round
number and use 16M (for 16 megabytes) as the physical extent size and you’ll be fine:
# vgcreate -s 16M
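The rule of thumb above can be sketched as a small calculation. The pe_size_mb helper is hypothetical, and I’m assuming that "nice round number" means the next power of two starting from the 4MB default. The volume group and device names in the printed command are simply the ones used earlier in this guide.

```shell
# pe_size_mb: hypothetical helper. Given a volume size in MB, prints a
# power-of-two physical extent size (in MB, starting at the 4 MB
# default) that keeps the number of extents under roughly 65,000.
pe_size_mb() {
    volume_mb=$1
    max_extents=65000
    # Smallest extent size that fits, rounded up (ceiling division).
    needed=$(( (volume_mb + max_extents - 1) / max_extents ))
    size=4
    while [ "$size" -lt "$needed" ]; do
        size=$(( size * 2 ))
    done
    echo "$size"
}

# 550,000 MB / 65,000 is about 8.46, so the next power of two is 16.
pe=$(pe_size_mb 550000)
# Only print the command; lvm-raid and /dev/md0 are the names used above.
echo "vgcreate -s ${pe}M lvm-raid /dev/md0"
```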
Ok, you’ve created a blank receptacle but now you have to tell how
many Physical Extents from the physical device (/dev/md0 in this case)
will be allocated to this Volume Group. In my case I wanted all the
data from /dev/md0 to be allocated to this Volume Group. If later I
wanted to add additional space I would create a new RAID array and add
that physical device to this Volume Group.
To find out
how many PEs are available, use the vgdisplay
command. You can then create a
Logical Volume using all (or some) of the space in the Volume Group.
In my case I call the Logical Volume lvm0.
# vgdisplay lvm-raid
.
.
  Free PE / Size 57235 / 223.57 GB

# lvcreate -l 57235 lvm-raid -n lvm0
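Rather than copying the Free PE count by hand, you could pull it out of the vgdisplay output with something like the sketch below. Both helpers are hypothetical and assume the "Free PE / Size" line looks like the one shown above; the 4MB extent size is the default mentioned earlier. The lvcreate command is only printed, not run.

```shell
# free_pe: hypothetical helper. Reads vgdisplay output on stdin and
# prints the number of free physical extents (the fifth field of the
# "Free PE / Size" line).
free_pe() {
    awk '/Free +PE/ { print $5 }'
}

# pe_to_gb: hypothetical helper. Converts an extent count to GB for a
# given extent size in MB.
pe_to_gb() {
    awk -v n="$1" -v mb="$2" 'BEGIN { printf "%.2f\n", n * mb / 1024 }'
}

# Sample line from the vgdisplay output above. On a live system:
#   vgdisplay lvm-raid | free_pe
count=$(echo "  Free PE / Size 57235 / 223.57 GB" | free_pe)
echo "free extents: $count ($(pe_to_gb "$count" 4) GB)"
echo "lvcreate -l $count lvm-raid -n lvm0"
```

With the default 4MB extents, 57235 extents works out to the same 223.57 GB that vgdisplay reports, which is a handy sanity check.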
In the end you will have a device you can use very much like
a plain ol’ partition called /dev/lvm-raid/lvm0.
You can now check on the status of the Logical Volume with the
lvdisplay command. The device can then
be used to create a filesystem on.
# lvdisplay /dev/lvm-raid/lvm0
  --- Logical volume ---
  LV Name                /dev/lvm-raid/lvm0
  VG Name                lvm-raid
  LV UUID                FFX673-dGlX-tsEL-6UXl-1hLs-6b3Y-rkO9O2
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                223.57 GB
  Current LE             57235
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:2

# mkfs.ext3 /dev/lvm-raid/lvm0
.
.
# mount /dev/lvm-raid/lvm0 /mnt
# df -h /mnt
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/lvm--raid-lvm0  224G   93M  224G   1% /mnt
Handling a Drive Failure
As everything eventually does break (some sooner than others) a drive
in the array will fail. It is a very good idea to run
smartd on all drives in your array (and probably
ALL drives period) to be notified of a failure or a pending failure as
soon as possible. You can also manually fail a partition,
meaning to take it
out of the RAID array, with the following command:
# /sbin/mdadm /dev/md0 -f /dev/hdb1
mdadm: set /dev/hdb1 faulty in /dev/md0
Once the system has determined a drive has failed or is otherwise missing
(you can shut down, pull out a drive and reboot to simulate a drive
failure, or use the command above to manually fail a drive),
it will show something like this in mdadm:
# /sbin/mdadm --detail /dev/md0
    Update Time : Wed Jun 15 11:30:59 2005
          State : clean, degraded
 Active Devices : 2
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 0
.
.
    Number   Major   Minor   RaidDevice State
       0       3        1        0      active sync   /dev/hda1
       1       0        0        -      removed
       2      33       65        2      active sync   /dev/hdf1
You’ll notice in this case I had /dev/hdb
fail. I replaced it with a new drive with the same capacity and was
able to add it back to the array. The first step is to partition
the new drive just like when first creating the array. Then you
can simply add the partition back to the array and watch the status
as the data is rebuilt onto the newly replaced drive.
# /sbin/mdadm /dev/md0 -a /dev/hdb1
# /sbin/mdadm --detail /dev/md0
    Update Time : Wed Jun 15 12:11:23 2005
          State : clean, degraded, recovering
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 2% complete
.
.
During the rebuild process the system performance may be somewhat
impacted but the data should remain intact.
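To keep an eye on the rebuild without reading the whole status output each time, you could extract just the percentage, roughly as sketched here. The rebuild_percent helper is hypothetical and assumes the "Rebuild Status : N% complete" line shown above. Running cat /proc/mdstat is another way to see the rebuild progress.

```shell
# rebuild_percent: hypothetical helper. Reads `mdadm --detail` output on
# stdin and prints the rebuild percentage, or nothing if the array is
# not rebuilding.
rebuild_percent() {
    sed -n 's/^[[:space:]]*Rebuild Status : *\([0-9][0-9]*\)% complete.*/\1/p'
}

# Sample line from the output above. On a live system:
#   /sbin/mdadm --detail /dev/md0 | rebuild_percent
sample=' Rebuild Status : 2% complete'
pct=$(printf '%s\n' "$sample" | rebuild_percent)

if [ -n "$pct" ]; then
    echo "rebuild in progress: ${pct}% done"
else
    echo "no rebuild in progress"
fi
```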
Expanding an Array/Filesystem
I’m told it’s now possible to expand the size of a RAID array much as you
could on a commercial array such as the NetApp. The link below describes
the procedure. I have yet to try it but it looks promising:
Growing a RAID5 array – http://scotgate.org/?p=107
Common Glitches
None yet. I’ve found the software RAID system to be remarkably stable.
Other Useful Resources
I’ve tried to not just copy other people’s tips so I’ve included a list
of other people’s tips and tricks I’ve found to be useful. There should
be little or no overlap.
Encrypting /home and swap over RAID with dm-crypt –
Do you have important company files on your PC at home, that you can neither afford to lose, nor let fall into the wrong hands?
This page explains how to set up encrypted RAID1 ext3 filesystems with dm-crypt, along with an encrypted RAID0 swap, on RedHat / Fedora Core 5, using the twofish encryption algorithm and dm-crypt’s new ESSIV mode.