I run RAID1 on all of the machines I support. While such hard disk mirroring is not a replacement for having good working backups, it means that a single drive failure is not going to force me to have to spend lots of time rebuilding a machine.
The best possible time to set this up is, of course, when you first install the operating system. The Debian installer will set everything up for you if you choose that option, and Ubuntu has alternate installation CDs which allow you to do the same.
This post documents the steps I followed to retrofit RAID1 into an existing Debian squeeze installation, getting a mirrored setup after the fact.
Overview
Before you start, make sure the following packages are installed:
apt-get install mdadm rsync initramfs-tools
Then go through these steps:
- Partition the new drive.
- Create new degraded RAID arrays.
- Install GRUB2 on both drives.
- Copy existing data onto the new drive.
- Reboot using the RAIDed drive and test system.
- Wipe the original drive by adding it to the RAID array.
- Test booting off of the original drive.
- Resync drives.
- Test booting off of the new drive.
- Reboot with the two drives and resync the array.
(My instructions are mostly based on this old tutorial but also on this more recent one.)
1- Partition the new drive
Once you have connected the new drive (/dev/sdb), boot into your system and use one of cfdisk or fdisk to display the partition information for the existing drive (/dev/sda on my system).
The idea is to create partitions of the same size on the new drive. (If the new drive is bigger, leave the rest of the drive unpartitioned.)
Partition types should all be: fd (or "linux raid autodetect").
2- Create new degraded RAID arrays
The newly partitioned drive, consisting of a root and a swap partition, can be added to new RAID1 arrays using mdadm:
mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 missing /dev/sdb2
and formatted like this:
mkswap /dev/md1
mkfs.ext4 /dev/md0
Specify these devices explicitly in /etc/mdadm/mdadm.conf:
DEVICE /dev/sda* /dev/sdb*
and append the RAID arrays to the end of that file:
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
dpkg-reconfigure mdadm
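The appended lines should look roughly like the following; the exact fields depend on your mdadm version and metadata format, and the UUIDs shown here are only placeholders:
ARRAY /dev/md0 UUID=f4e8b41c:12345678:9abcdef0:13579bdf
ARRAY /dev/md1 UUID=0fedcba9:87654321:02468ace:fdb97531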
You can check the status of your RAID arrays at any time by running this command:
cat /proc/mdstat
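At this stage, each array should show up as degraded, with only the new drive's partition present. The output will look something like this (the block counts are just illustrative):
md1 : active raid1 sdb2[1]
      1951808 blocks [2/1] [_U]

md0 : active raid1 sdb1[1]
      280567040 blocks [2/1] [_U]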
3- Install GRUB2 on both drives
The best way to ensure that GRUB2, the default bootloader in Debian and Ubuntu, is installed on both drives is to reconfigure its package:
dpkg-reconfigure grub-pc
and select both /dev/sda and /dev/sdb (but not /dev/md0) as installation targets.
This should cause the init ramdisk (/boot/initrd.img-2.6.32-5-amd64) and the grub menu (/boot/grub/grub.cfg) to be rebuilt with RAID support.
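If you want to double-check that the regenerated initrd really contains the RAID bits, you can list its contents (the kernel version in the path is from my system; adjust it to match yours):
lsinitramfs /boot/initrd.img-2.6.32-5-amd64 | grep mdadm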
4- Copy existing data onto the new drive
Copy everything that's on the existing drive onto the new one using rsync:
mkdir /tmp/mntroot
mount /dev/md0 /tmp/mntroot
rsync -auHxv --exclude=/proc/* --exclude=/sys/* --exclude=/tmp/* /* /tmp/mntroot/
5- Reboot using the RAIDed drive and test system
Before rebooting, open /tmp/mntroot/etc/fstab and change /dev/sda1 and /dev/sda2 to /dev/md0 and /dev/md1 respectively.
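After the edit, the relevant fstab lines should look roughly like this (the mount options are just an example; keep whatever your system already uses):
/dev/md0  /     ext4  noatime,errors=remount-ro  0  1
/dev/md1  none  swap  sw                         0  0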
Then reboot and from within the GRUB menu, hit "e" to enter edit mode and make sure that you will be booting off of the new disk:
set root='(md/0)'
linux /boot/vmlinuz-2.6.32-5-amd64 root=/dev/md0 ro quiet
Once the system is up, you can check that the root partition is indeed using the RAID array by running mount and looking for something like:
/dev/md0 on / type ext4 (rw,noatime,errors=remount-ro)
6- Wipe the original drive by adding it to the RAID array
Once you have verified that everything is working on /dev/sdb, it's time to change the partition types on /dev/sda to fd and to add the original drive to the degraded RAID array:
mdadm /dev/md0 -a /dev/sda1
mdadm /dev/md1 -a /dev/sda2
You'll have to wait until the two partitions are fully synchronized but you can check the sync status using:
watch -n1 cat /proc/mdstat
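While the rebuild is in progress, the output includes a recovery line along these lines (the percentages, sizes and speeds here are purely illustrative):
md0 : active raid1 sda1[2] sdb1[1]
      280567040 blocks [2/1] [_U]
      [===>.................]  recovery = 18.4% (51700032/280567040) finish=42.3min speed=90512K/sec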
7- Test booting off of the original drive
Once the sync is finished, update the boot loader menu:
update-grub
and shut the system down:
shutdown -h now
before physically disconnecting /dev/sdb and turning the machine back on to test booting with only /dev/sda present.
After a successful boot, shut the machine down and plug the second drive back in before powering it up again.
8- Resync drives
If everything works, you should see the following after running cat /proc/mdstat:
md0 : active raid1 sda1[1]
      280567040 blocks [2/1] [_U]
indicating that the RAID array is incomplete and that the second drive is not part of it.
To add the second drive back in and start the sync again:
mdadm /dev/md0 -a /dev/sdb1
9- Test booting off of the new drive
To complete the testing, shut the machine down, pull /dev/sda out and try booting with /dev/sdb only.
10- Reboot with the two drives and resync the array
Once you are satisfied that it works, reboot with both drives plugged in and re-add the first drive to the array:
mdadm /dev/md0 -a /dev/sda1
Your setup is now complete and fully tested.
Ongoing maintenance
I recommend making sure the two RAIDed drives stay in sync by enabling periodic RAID checks. The easiest way is to enable the checks that are built into the Debian package:
dpkg-reconfigure mdadm
but you can also create a weekly or monthly cronjob which does the following:
echo "check" > /sys/block/md0/md/sync_action
Something else you should seriously consider is to install the smartmontools package and run weekly SMART checks by putting something like this in your /etc/smartd.conf:
/dev/sda -a -d ata -o on -S on -s (S/../.././02|L/../../6/03)
/dev/sdb -a -d ata -o on -S on -s (S/../.././02|L/../../6/03)
These checks, performed by the hard disk controllers directly, could warn you of imminent failures ahead of time. Personally, when I start seeing errors in the SMART log (smartctl -a /dev/sda), I order a new drive straight away.
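For a quick manual check of a drive's overall health status, something like this works:
smartctl -H /dev/sda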
Hello, thanks for this entry. I too run RAID1 on all my Debian systems, so I'll add comments from my experience:
If you don't set the metadata to 0.90 during RAID creation, you'd be better off with 1.* (I use 1.2); then using "fd" as the partition type isn't needed anymore. All my systems run with 1.2 metadata and boot just fine without the "fd" partition type.
If you want "md0"-style "non-partitionable" (not really) RAID devices, you can set --auto=md to avoid messing things up if you assemble the arrays on another system (recovery, etc.).
Lastly, using the --homehost and --name bits to identify the RAID devices comes in handy when juggling many devices and switching systems. They're easier to use than the array's UUID.
All the best.
Great how-to! On Ubuntu I needed to create /etc/sys before it would boot using the degraded RAID, and substitute sd2 with sd5 everywhere. Thanks for the help!
FYI, an easy way to copy the partitions initially is:
sfdisk -d /dev/sda > partition.txt
sfdisk --force /dev/sdb < partition.txt
I was having a bear of a time getting step 5 to work. (Kernel panics and/or /dev/md0 missing on boot and subsequently being dumped to the initramfs prompt.)
It turned out that step 4 was doing nothing RAID-related for me, which I solved with the following two steps:
1) Kept the initrd generated during "dpkg-reconfigure mdadm".
2) "grub-install --modules='raid proc_msdos ext2'" to both drives so grub could find md0. (substituting proc_msdos and ext2 with current insmod entries in you grub's menu items)
If you do that on Natty, make sure to get the grub2 common packages from Oneiric!
grub2 from Natty has a bug; update the grub2-pc and grub2-common packages:
http://packages.ubuntu.com/de/oneiric/grub2-common
http://packages.ubuntu.com/de/oneiric/grub-pc
http://packages.ubuntu.com/de/oneiric/grub-common
and install grub again to /dev/sdx and /dev/sdy.
Great guide, especially how you play it safe and fail-check at the end.
But I had to supplement it with this guide: http://www.howtoforge.com/how-to-set-up-software-raid1-on-a-running-system-incl-grub2-configuration-ubuntu-10.04-p1
And I can add that it's also possible to make the mdadm daemon send email about any failures, if they happen. A nice warning system.
Awesome tutorial, thank you. I used this to migrate an old Ubuntu 10.04 system with minor changes:
ln -s /sbin/MAKEDEV /dev/MAKEDEV
On Ubuntu 12.04:
...This should cause the init ramdisk (/boot/initrd.img-2.6.32-5-amd64) and the grub menu (/boot/grub/grub.cfg) to be rebuilt with RAID support....
IT DOES NOT.
Some steps from the older manual http://wiki.xtronics.com/index.php/Raid helped me:
# mount -o bind /dev /tmp/mntroot/dev
# mount -t proc none /tmp/mntroot/proc
# mount -t sysfs none /tmp/mntroot/sys
# chroot /tmp/mntroot
# /usr/share/mdadm/mkconf > /etc/mdadm/mdadm.conf
# update-initramfs -u -k all
# update-grub
Updated /etc/fstab with the new UUIDs (blkid /dev/md0) and installed grub first only on the RAID disk (sdb) with:
# grub-install /dev/sdb
Disconnected the old disk (sda) and the system booted from the degraded RAID array.
Hope it helps someone.
Hi,
first of all: Thank you for this manual.
In step 5 (Reboot using the RAIDed drive and test system) there needs to be an enhancement: before the "set root=..." line, you have to add two more lines so that the whole change looks like the following.
Cheers
blockdev --flushbufs [device]
for your involved devices. After that everything worked perfectly.
Hi, I found the GRUB reconfig too complex; it did not work well when /boot is on a separate partition, failing into rescue mode.
Instead of fiddling with the grub console, one can fix the issue before rebooting: just chroot into the mounted md partitions (be aware: CHOOSE TO INSTALL GRUB TO THE MD-ENABLED DRIVE ONLY, so as not to touch the "source" drive).
I consider this approach to be much cleaner.
I've followed this tutorial and everything went OK except grub. That part was difficult for two reasons: 1) I use GPT and the error was that the boot partition didn't have the bios_grub flag - don't forget to set it when creating partitions on the new disk; 2) the new 16.04 system has different RAID modules to be loaded at grub start, like dm_raid, megaraid, etc. This part should be updated.
I have followed the guide on Debian 9 and everything works perfectly, thanks! But in the section where you modify /tmp/mntroot/etc/fstab, Debian 9 uses UUIDs instead of sda; everything else was almost the same as the guide.
Hi,
Thanks for the article! I followed it and it was my primary reference. However, I did encounter two issues while following it that I thought I should leave with you here. I'm using Ubuntu 18.04.3 and fdisk does not have 'fd' to convert a Linux partition to a Linux RAID partition; it is now '29' for a Linux RAID partition. Also, after adding the original disk to the RAID array and rebooting, I found myself in grub rescue mode. After researching more I discovered that I was required to install a module 'mdraid1x' in order for grub to be able to read the RAID array and find the kernel in order to boot. Therefore, I had to boot a live USB Ubuntu system to access the RAID array, chroot, and run grub-install --modules='mdraid1x' /dev/sda (on both drives). And finally, I tested each drive separately to see if it would boot (it does).
Anyway, thanks for the article.
I've followed the instructions, but when it comes to the "mdadm --create /dev/md0 --level=1 --raid-disks=2 missing /dev/sdh1" command, I get:
mdadm: partition table exists on /dev/sdh1
mdadm: partition table exists on /dev/sdh1 but will be lost or meaningless after creating array
mdadm: Note: this array has metadata at the start and may not be suitable as a boot device. If you plan to store '/boot' on this device please ensure that your boot-loader understands md/v1.x metadata, or use --metadata=0.90
Continue creating array?
...I answer no at that point, but what should I do?
My OS is AlmaLinux, but I would think the instructions would apply. It uses an LVM partition system, but other instructions I've found that include LVM have the same steps.
Thanks for any help.