Basic mdadm configuration and commands


Seeing information on running array

To see information on the currently configured and running raid arrays, and the status of the devices in each array, use:

more /proc/mdstat
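
For more detail on a single array 'mdadm --detail' can also be used. A minimal example, assuming the array /dev/md0 exists on the system:

mdadm --detail /dev/md0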


Creating new raid array

To create a new array we can use the '-C' option. We also need to specify the raid level using '-l' as raid 0, raid 1, raid 5 or raid 6, and the number of devices using '-n'. If we are going to add only one device then we have to specify '--force' before specifying '-n 1'. At the end we have to list the devices to include in the array. Example commands are:

mdadm -C /dev/md3 -n2 -l1 /dev/xvdc1 /dev/xvdd1
mdadm -C /dev/md3 --force -n1 -l1 /dev/xvdc1
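
Similarly, a raid 5 array over three devices could be created as follows; this is only a sketch and the names /dev/md4, /dev/xvdc1, /dev/xvdd1 and /dev/xvde1 are illustrative:

mdadm -C /dev/md4 -n3 -l5 /dev/xvdc1 /dev/xvdd1 /dev/xvde1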


Adding device to existing array

To add a device to an existing array we can use the '-a' option. Example command is

mdadm -a /dev/md3 /dev/xvdd1

Many times adding a new device requires creating partitions on the new device similar to those on the existing devices. For this one can refer to the Fdisk_or_parted wiki page. While adding devices to an array it is recommended to first add the new partition that looks smaller than the existing array partitions. That way, if the partition is too small to be acceptable, mdadm reports that the new device is not big enough to be added, and the partition sizes can be adjusted for the next try.
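
If the existing and new disks use MBR partition tables, one possible shortcut is to replicate the partition layout with sfdisk. This is only a sketch; /dev/xvdc (existing disk) and /dev/xvdd (new disk) are example names:

sfdisk -d /dev/xvdc | sfdisk /dev/xvdd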


Stopping running array

To stop a running array we can use the '-S' option. Example command is

mdadm -S /dev/md3


Creating start-up configuration from running array

To create start-up configuration from a running array we can use the command 'mdadm --detail -sv'. To store it in the start-up configuration file we can use shell redirection as shown in the example below:

mdadm --detail -sv > /etc/mdadm.conf
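
Note that on some distributions (for example Debian or Ubuntu) the start-up configuration is expected at '/etc/mdadm/mdadm.conf' instead, in which case the same redirection can be used with that path:

mdadm --detail -sv > /etc/mdadm/mdadm.conf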


Using RAID device

Once the device /dev/md<n> is created we can use it like a normal disk device. We can either format the entire device with a filesystem such as ext3, or partition the device and then format the individual partitions. For some reason the kernel does not read partitions of a RAID device on using 'partprobe' or even after a reboot. Hence it is best to format the entire device and use it as a single partition.
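
For example, to format the whole RAID device with ext3 and mount it (a sketch; /dev/md3 and the mount point /mnt/data are only examples):

mkfs.ext3 /dev/md3
mount /dev/md3 /mnt/data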


Forcing re-sync of existing array

To re-sync existing arrays we can call the script in '/etc/cron.weekly' which re-syncs all arrays. If we are interested in re-syncing only one array then we can use:

echo repair > /sys/block/md<n>/md/sync_action

Here <n> should be replaced by 0, 1, 2, etc. based on which device we want to re-sync. We can check what is being done on the array using either 'more /proc/mdstat' or 'cat /sys/block/md<n>/md/sync_action'.
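
The same sysfs interface also accepts a 'check' action, which only looks for mismatches without rewriting anything:

echo check > /sys/block/md<n>/md/sync_action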

After the array is re-synced the value of 'mismatch_cnt' displayed by 'cat /sys/block/md1/md/mismatch_cnt' should be zero. If it is not zero then there is some problem with the re-sync. The cron script also warns if mismatch_cnt is non-zero after re-syncing. When this happens it could be a kernel bug where the repair action only checks the arrays instead of repairing them. Please update the kernel and reboot the machine to avoid data loss.


Configuring raid on server during installation

  1. Install the OS on hard-disks with software raid partitions.
  2. Boot using 'Fedora rescue cd' or 'System rescue cd' and mark the first partition (that is '/boot', or '/') on all hard-disks as bootable. (One can use fdisk and its 'a' option; see the sketch after this list.) Hence it is important to have the first partition in raid 1 on all hard-disks, so that we can boot properly. Other partitions can be in raid 5 or 6 if required.
  3. Reboot the system and it should boot properly.
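
A rough sketch of marking the first partition bootable with fdisk, assuming the disk is /dev/sda:

fdisk /dev/sda

Inside fdisk press 'a', select partition 1 to toggle its bootable flag and then 'w' to write the partition table and exit. Repeat this for each hard-disk in the array.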



Booting after removing one hard-disk

When we try to boot after removing one hard-disk, in some cases the system may boot properly. In other cases it gives an error that partition or device '/dev/md<n>' does not exist. In such cases we have to boot using the linux installation cd in rescue mode and follow these steps:

  1. chroot /mnt/sysimage
  2. more /proc/mdstat
  3. mdadm --auto-detect
  4. more /proc/mdstat
  5. mdadm --detail -sv
  6. mdadm --detail -sv > /etc/mdadm.conf
  7. sync
  8. exit
  9. exit

After this the system should boot. In case you have added a new spare hard-disk then you must partition it appropriately. Use 'partprobe' to detect the new partitions and then use 'mdadm' as explained above to add these new partitions to the existing raid array.
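
For example, after replacing a disk the new partitions could be added back as follows; the names /dev/xvdd and /dev/md3 are only illustrative:

partprobe /dev/xvdd
mdadm -a /dev/md3 /dev/xvdd1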



Reinstalling grub on raid partition

Sometimes just using auto-detect may not work and the system may not boot when one of the raid hard-disks is removed. In such cases we need to re-install grub on the raid partition. To do this, boot using the installation CD and go to rescue mode. Then use 'chroot /mnt/sysimage' followed by something like 'grub-install /dev/md<n>', where md<n> is the raid 1 partition for boot. Use 'sync' to ensure cached blocks from RAM are flushed to hard-disk. Then use 'reboot' and now the system should boot properly even with one of the raid devices removed.
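
The whole sequence in rescue mode would look roughly like the following, where md<n> must be replaced by the actual raid 1 boot array:

chroot /mnt/sysimage
grub-install /dev/md<n>
sync
exit
reboot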