Mdadm
Rebuilding and Creating RAID 1 Arrays with mdadm
Creating Arrays
To create a mirrored array with two drives, sda and sdb, on partitions sda1 and sdb1:
mdadm --create --verbose /dev/md0 --level=raid1 --raid-devices=2 /dev/sda1 /dev/sdb1
You can now monitor the progress of the array build with:
cat /proc/mdstat
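If you only want the progress figure rather than the whole status file, the relevant line can be extracted with awk. The following is a minimal sketch, not part of mdadm: the function name md_sync_progress is hypothetical, and it assumes the usual /proc/mdstat layout in which the progress line reads "resync = NN%" or "recovery = NN%".

```shell
# Print "<array> <action> <percent>" for every array that is currently syncing.
# Optional argument: the status file to read (defaults to /proc/mdstat).
# md_sync_progress is a hypothetical helper name, not an mdadm command.
md_sync_progress() {
    awk '
        # Remember which array the following indented lines belong to.
        /^md[^ ]* :/ { array = $1 }
        # Progress lines look like: [==>...]  recovery = 57.7% (...) finish=...
        /(resync|recovery|reshape|check) *=/ {
            for (i = 2; i < NF; i++)
                if ($i == "=") print array, $(i - 1), $(i + 1)
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling md_sync_progress with no argument reads /proc/mdstat. It prints nothing when no array is building or rebuilding.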
Once the build has finished, save your mdadm configuration with:
mdadm --verbose --detail --scan > /etc/mdadm.conf
You may need to edit this file to remove unwanted lines or to add an email address to MAILADDR to be notified if a drive failure occurs:
MAILADDR user1@dom1.com, user2@dom2.com
On some systems, mdadm's configuration file is /etc/mdadm/mdadm.conf. It is very important to put the configuration in the correct location for your system.
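One way to avoid writing the configuration to the wrong place is to pick the path in a small helper. This is only a sketch under an assumption: it treats the existence of an /etc/mdadm directory as the sign that the system uses /etc/mdadm/mdadm.conf, which is a heuristic and not something mdadm guarantees. The function name mdadm_conf_path and its optional root-prefix argument are hypothetical.

```shell
# Print the likely location of mdadm's configuration file.
# Assumption (heuristic): systems that use /etc/mdadm/mdadm.conf ship an
# /etc/mdadm directory; everything else uses /etc/mdadm.conf.
# Optional argument: a root prefix, useful for chroots or for testing.
mdadm_conf_path() {
    root="${1:-}"
    if [ -d "$root/etc/mdadm" ]; then
        echo "$root/etc/mdadm/mdadm.conf"
    else
        echo "$root/etc/mdadm.conf"
    fi
}
```

The scan output could then be saved with mdadm --verbose --detail --scan > "$(mdadm_conf_path)". Check your distribution's documentation if you are unsure which file it reads.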
Rebuilding Arrays
If a drive ever fails, or if the system is booted with a drive removed, you will need to add it back into the array.
Failed Drive
In this example, /dev/sda1 and /dev/sdb1 make up the RAID 1 array /dev/md0. Let us say that /dev/sdb fails.
Determining Failed Drive
Run:
cat /proc/mdstat
[root@lo4 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1](F) sda1[0]
      204736 blocks super 1.0 [2/1] [U_]
unused devices: <none>
When a drive fails or is missing, you will see an underscore in the array output ([U_] instead of [UU]). (F) will be displayed next to the failed drive (sdb1[1](F)).
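The two markers described above can also be checked from a script. The sketch below is illustrative only: md_degraded and md_failed_members are hypothetical helper names, and the parsing assumes the /proc/mdstat layout shown in this article.

```shell
# Print the name of every array whose status field contains an underscore,
# e.g. [U_] instead of [UU]. Optional argument: file to read.
md_degraded() {
    awk '
        /^md[^ ]* :/ { array = $1 }
        # Only the member-status field consists solely of U and _ characters.
        match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_")) print array
        }
    ' "${1:-/proc/mdstat}"
}

# Print "<array> <member>" for every member flagged (F).
md_failed_members() {
    awk '
        /^md[^ ]* :/ {
            for (i = 3; i <= NF; i++)
                if ($i ~ /\(F\)$/) {
                    sub(/\[.*/, "", $i)   # strip "[1](F)" to leave "sdb1"
                    print $1, $i
                }
        }
    ' "${1:-/proc/mdstat}"
}
```

Both functions print nothing for a healthy array, so they can be used in a check such as [ -z "$(md_degraded)" ].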
If no drive is marked (F), running lsblk or fdisk -l may help you determine which drive failed. The following command prints the serial number of /dev/sda, which may help you identify the physical disks as well:
hdparm -I /dev/sda | grep "Serial Number"
Remove Failed Drive
If a drive has failed, it should be removed from the mdadm array before being replaced. First, mark it as faulty:
mdadm --manage /dev/md0 --fail /dev/sdb1
[root@lo4 ~]# mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md0
Now we can remove it from the array:
mdadm --manage /dev/md0 --remove /dev/sdb1
[root@lo4 ~]# mdadm --manage /dev/md0 --remove /dev/sdb1
mdadm: hot removed /dev/sdb1 from /dev/md0
Check /proc/mdstat. There should no longer be an (F) marker, and sda1[0] should be the only member listed.
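Before powering down, it can be reassuring to confirm the member list from a script rather than by eye. A minimal sketch, assuming the /proc/mdstat layout shown earlier; md_members is a hypothetical helper name.

```shell
# Print the member devices of one array, one per line.
# Usage: md_members md0 [status-file]   (status file defaults to /proc/mdstat)
md_members() {
    awk -v want="$1" '
        $1 == want && $2 == ":" {
            for (i = 3; i <= NF; i++)
                # Members look like sda1[0]; words such as "active" do not.
                if ($i ~ /\[[0-9]+\]/) {
                    sub(/\[.*/, "", $i)
                    print $i
                }
        }
    ' "${2:-/proc/mdstat}"
}
```

After the removal in this example, md_members md0 should print only sda1.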
Power down the system.
shutdown -h now
Replace Drive
Now that everything is powered down, remove the failed HDD and replace it with the new one.
Once the drive is replaced, boot the system back up.
Add New Drive to Array
Recreate the partitioning scheme of /dev/sda on the new drive:
sfdisk -d /dev/sda | sfdisk /dev/sdb
Then verify with lsblk or fdisk -l.
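The verification can also be scripted by comparing the two partition tables. The sketch below reduces an sfdisk -d dump to one "start size type" line per partition so that device names and identifiers do not get in the way. It assumes the dump format used by recent util-linux versions, with start=, size= and type= fields; older versions print the type differently. The function name partition_layout is hypothetical.

```shell
# Read an "sfdisk -d" dump on stdin and print "<start> <size> <type>" for
# each partition, dropping device names, label IDs and UUIDs.
# Assumes the start=/size=/type= dump format of recent util-linux.
partition_layout() {
    sed -n 's/^[^ ]* : *start= *\([0-9]*\), *size= *\([0-9]*\), *type= *\([^,]*\).*/\1 \2 \3/p'
}
```

With both drives present, the layouts match if the output of sfdisk -d /dev/sda | partition_layout is identical to that of sfdisk -d /dev/sdb | partition_layout.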
Now add the new partition to the array:
mdadm --manage /dev/md0 --add /dev/sdb1
[root@lo4 ~]# mdadm --manage /dev/md0 --add /dev/sdb1
mdadm: added /dev/sdb1
Finally, check the status of the rebuild with:
cat /proc/mdstat
[root@lo4 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      204736 blocks super 1.0 [2/1] [U_]
      [===========>.........]  recovery = 57.7% (118400/204736) finish=0.0min speed=118400K/sec
unused devices: <none>
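If a script needs to block until the rebuild is done, mdadm's own --wait option (mdadm --wait /dev/md0) is one way to do it. The polling loop below is an alternative sketch for cases where only /proc/mdstat is available. The function name md_wait_sync and its arguments are hypothetical, and it assumes that an active or pending sync always shows up as "resync", "recovery" or "reshape" followed by an equals sign.

```shell
# Block until the status file no longer reports a resync, recovery or reshape.
# Usage: md_wait_sync [status-file] [poll-interval-seconds]
# Defaults: /proc/mdstat and 5 seconds.
md_wait_sync() {
    while grep -Eq '(resync|recovery|reshape) *=' "${1:-/proc/mdstat}"; do
        sleep "${2:-5}"
    done
}
```

Note that the loop keeps waiting for as long as any array on the system is syncing, not just /dev/md0.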
Missing Drive
In this example, /dev/sda1 and /dev/sdb1 make up the RAID 1 array /dev/md0. Let us say that /dev/sdb1 is missing.
Use lsblk to list the drives' partitions and their sizes. Alternatively, you can use fdisk -l or any other utility you prefer.
Now check the status of mdadm with:
cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1](F) sda1[0]
      204736 blocks super 1.0 [2/1] [U_]
unused devices: <none>
When a drive fails or is missing, you will see an underscore in the array output ([U_] instead of [UU]).
Use the output from lsblk and /proc/mdstat to match the partition that is still present in the active mdadm array (/dev/md0) with the corresponding partition on the missing drive. For example, "match" /dev/sda1 with /dev/sdb1 (after verifying their block sizes are the same).
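The size check can be done without reading lsblk output by eye. This is a sketch, not a required step: it compares the sector counts that the kernel exposes under /sys/class/block, and the function name same_size and its optional third argument (an alternative sysfs directory) are hypothetical.

```shell
# Succeed if two block devices have the same size in 512-byte sectors.
# Usage: same_size sda1 sdb1 [sysfs-dir]   (sysfs-dir defaults to /sys/class/block)
# Returns 0 if equal, 1 if different, 2 if a size could not be read.
same_size() {
    sysfs="${3:-/sys/class/block}"
    a=$(cat "$sysfs/$1/size") || return 2
    b=$(cat "$sysfs/$2/size") || return 2
    [ "$a" -eq "$b" ]
}
```

For the example above, same_size sda1 sdb1 && echo "sizes match" confirms the two partitions are the same size before /dev/sdb1 is added back.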
Now add /dev/sdb1 back into the array:
mdadm --manage /dev/md0 --add /dev/sdb1
[root@lo4 ~]# mdadm --manage /dev/md0 --add /dev/sdb1
mdadm: added /dev/sdb1
You can view the status of the rebuilding array with:
cat /proc/mdstat
[root@lo4 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      204736 blocks super 1.0 [2/1] [U_]
      [===========>.........]  recovery = 57.7% (118400/204736) finish=0.0min speed=118400K/sec
unused devices: <none>