RAID 1 Boot Drive Best Practices

Using RAID 1 as the Boot Drive

Hello! I was curious if anyone else had experience using two disks (SSDs) in a RAID 1 for a node’s boot drive. I know that when you deploy Ubuntu Server manually, the Subiquity installer makes you create an EFI partition on each of the drives and then create a RAID 1 from each disk’s other partition. MAAS doesn’t seem to let you mark more than one disk as bootable. I wasn’t sure whether, when you create a RAID 1 and one of its disks is selected as the boot disk, MAAS handles that for you, or whether your node won’t boot if the primary disk fails. Thanks!

I also have this same question. I have never found clear recommendations or procedures for using software RAID 1 for boot on an EFI machine that can properly tolerate either drive failing.


On legacy systems we use just a 500 MB /boot mounted as ext4, then a 10 GB / . The rest goes to /var.

On newer systems (e.g. we’re testing with AlmaLinux now) a separate /boot is not necessary, so we switched to only / and /var, using XFS.
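For what it’s worth, here is roughly what that legacy layout looks like as plain partitioning commands. This is only a minimal sketch; /dev/sda, the exact partition boundaries, and the filesystem choices for / and /var are placeholder assumptions on my part.

    # ~500 MB /boot (ext4), ~10 GB /, rest of the disk for /var.
    # /dev/sda and the filesystems for / and /var are placeholders.
    parted -s /dev/sda mklabel msdos
    parted -s /dev/sda mkpart primary ext4 1MiB 501MiB     # /boot
    parted -s /dev/sda mkpart primary 501MiB 10.5GiB       # /
    parted -s /dev/sda mkpart primary 10.5GiB 100%         # /var
    mkfs.ext4 /dev/sda1    # /boot, ext4 as above
    mkfs.xfs  /dev/sda2    # /    (filesystem choice is a placeholder)
    mkfs.xfs  /dev/sda3    # /var (filesystem choice is a placeholder)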


It seems that you still need to create a boot partition, or else the deployment will fail. I just tried deploying a server where both boot SSDs are in a RAID 1, set up to use LVM with an XFS filesystem mounted at /.

Solution:

Here’s what I did to get it to work; a rough shell-command equivalent is sketched after the list.

  1. Create a 1024 MB partition on each of the drives. I formatted sda1 as FAT32 and mounted it at /boot/efi; I left sdb1 unmounted and unformatted.
  2. I then created a second partition on each drive, using up the remaining space but leaving them unformatted.
  3. A RAID 1 volume (md0) was created from sda2 and sdb2, also left unformatted.
  4. A volume group was then created by selecting the md0 volume.
  5. After that, I created the logical volume and mounted it at / as XFS.
  6. The machine was deployed using Ubuntu 22.
  7. After the machine was deployed, I ran dd if=/dev/sda1 of=/dev/sdb1 to copy the boot partition from the primary boot drive to the secondary.
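For reference, the storage layout above translates into roughly the following shell commands. This is only a minimal sketch of what MAAS configured for me through its UI; the device names, the vg0/root volume names, and the exact parted invocations are my own placeholder assumptions.

    # Rough shell equivalent of the layout above (MAAS did most of this for me).
    # Device, VG, and LV names are placeholders.

    # Steps 1-2: a ~1 GiB ESP and a "rest of disk" partition on each drive.
    for d in /dev/sda /dev/sdb; do
        parted -s "$d" mklabel gpt
        parted -s "$d" mkpart ESP fat32 1MiB 1025MiB
        parted -s "$d" set 1 esp on
        parted -s "$d" mkpart raid 1025MiB 100%
    done
    mkfs.vfat -F 32 /dev/sda1           # only sda1 is formatted and mounted at /boot/efi

    # Step 3: RAID 1 across the second partition of each drive.
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2

    # Steps 4-5: LVM on top of the array, with an XFS root.
    pvcreate /dev/md0
    vgcreate vg0 /dev/md0
    lvcreate -l 100%FREE -n root vg0
    mkfs.xfs /dev/vg0/root              # mounted at /

    # Step 7: after deployment, clone the ESP so the second drive is bootable too.
    dd if=/dev/sda1 of=/dev/sdb1 bs=4M conv=fsync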

Testing

  1. I tested the setup by removing the drive sled from the server while it was running. I got a warning letting me know that the drive had been removed.
  2. I installed some additional packages and rebooted the server with /dev/sda still removed. The server was able to boot from sdb; however, once it was booted, it was enumerated as sda.
  3. I reinserted the primary drive and it came up as /dev/sdh. From there, I re-added the device using mdadm --add /dev/md0 /dev/sdh2 and watched the rebuild using watch cat /proc/mdstat.
  4. After a few minutes, the rebuild was successful!
  • Keep in mind that since I did not install a fresh drive, there may be a few more steps to partition and format a new drive before it can be used to replace a failed drive in a live scenario (see the sketch after this list).
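For a genuinely fresh replacement drive, the extra prep would look roughly like this. This is a hedged sketch only, assuming the surviving drive is sda, the new disk shows up as sdb, and the same partition layout as above; using sgdisk to clone the partition table is my own choice, not something MAAS does for you.

    # Sketch: replacing a failed member with a brand-new drive.
    # Assumes the healthy drive is sda and the replacement appears as sdb.

    # Copy the GPT partition table from the healthy drive to the new one,
    # then randomize the new drive's GUIDs so they don't collide.
    sgdisk -R=/dev/sdb /dev/sda
    sgdisk -G /dev/sdb

    # Re-add the RAID member and watch it resync.
    mdadm --manage /dev/md0 --add /dev/sdb2
    watch cat /proc/mdstat

    # Clone the ESP so the new drive is also bootable.
    dd if=/dev/sda1 of=/dev/sdb1 bs=4M conv=fsync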

I’m still not sure if this is the best practice for deploying software RAID 1 as the boot drive. Hardware RAID may be more robust, but I’m using other drives on the same HBA for a Ceph deployment, so my RAID card had to be configured in HBA mode. I would still like to hear what everyone else is using. Thanks for the read, and I hope you found it useful!

