Technical Diary - With Andrii Grytsenko

Software RAID level 1 behavior

Here I will try to describe the behavior of software RAID level 1 when one of its member devices fails.

There will be four scenarios:

  1. Delete partition by fdisk and restart the computer.
  2. Delete and create empty partition by fdisk and restart the computer.
  3. Delete the whole storage device from the system and restart the computer.
  4. Delete and re-create the whole storage device and restart the computer.

1. Delete partition by fdisk and restart the computer.

First, I create the RAID device:

[root@node1 ~]# mdadm -C /dev/md0 -n 2 -l 1 /dev/hdd1 /dev/hdb8
mdadm: largest drive (/dev/hdb8) exceed size (476160K) by more than 1%
Continue creating array? y
mdadm: array /dev/md0 started.

and build an ext3 file system on the device:

[root@node1 ~]# mke2fs -j /dev/md0

Edit /etc/fstab according to the changes:

/dev/md0                /test                   ext3    defaults        0 0

and mount everything listed in /etc/fstab:

[root@node1 ~]# mount -a

Copy some files to /test (to be able to check file integrity later):

[root@node1 ~]# cp -rfv /var/* /test/

Now, delete /dev/hdd1 through fdisk:

[root@node1 ~]# fdisk  /dev/hdd

The number of cylinders for this disk is set to 8322.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): d
Selected partition 1

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

and restart the computer:

[root@node1 ~]# reboot

During boot I got the following error (the same one I got when I tried to mount md0 manually):

mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or other error
       (could this be the IDE device where you in fact use
       ide-scsi so that sr0 or sda or so is needed?)
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

/proc/mdstat doesn’t show any active RAID devices:

[root@node1 ~]# cat /proc/mdstat
Personalities :
unused devices:

Let’s create a new mdadm.conf:

[root@node1 ~]#  mdadm --examine --scan  /dev/hdb8 > /etc/mdadm.conf

[root@node1 ~]# cat /etc/mdadm.conf
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=c1ce0c10:035aa2a3:829450b6:84b7a236

And activate everything from mdadm.conf:

[root@node1 ~]#  mdadm -A -s
mdadm: /dev/md0 has been started with 1 drive (out of 2).

md0 was activated, but with only one disk:

[root@node1 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hdb8[1]
      476160 blocks [2/1] [_U]

Conclusion: if this had been the root partition, the system would not have booted at all, and you would have had to dance with a boot disk to bring the system back to life.
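
To spot this degraded state from a script instead of by eye, something like the following could be used. It is only a sketch: the function name degraded_arrays is mine, and it assumes the [expected/active] counter format seen in the /proc/mdstat output above.

```shell
# List md arrays that run with fewer active members than expected.
# Reads mdstat-formatted text from the file given as $1
# (default: /proc/mdstat).
degraded_arrays() {
    awk '
        # Remember the array name from lines like "md0 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # The status line carries a counter such as [2/1]
        /blocks/ && match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0) print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling degraded_arrays with no argument reads the live /proc/mdstat; passing a file name makes it easy to try on saved output.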

2. Delete and create empty partition by fdisk and restart the computer.

The same actions as above, plus creating a new empty partition /dev/hdd1 in fdisk before rebooting.
After the system booted, none of the RAID devices were active:

[root@node1 ~]# cat /proc/mdstat
Personalities :
unused devices:

But after the restore procedure the system works in full-fledged mode (with 2 disks):

[root@node1 ~]#  mdadm --examine --scan  /dev/hdb8 > /etc/mdadm.conf
[root@node1 ~]# mdadm -A -s
mdadm: /dev/md0 has been started with 2 drives.
[root@node1 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hdd1[0] hdb8[1]
      475136 blocks [2/2] [UU]

Recreating the partition has no influence on boot behavior at all: the result is similar to the previous one, and the array still has to be assembled by hand.
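
Since the same two restore commands were needed in both scenarios, they could be wrapped in a small helper. This is a sketch only: reassemble_from and the DRY_RUN switch are names I made up, and the real branch overwrites /etc/mdadm.conf exactly as the session above does.

```shell
# Rebuild /etc/mdadm.conf from one surviving member and assemble the arrays.
# $1: surviving member device, e.g. /dev/hdb8
# By default the commands are only printed; set DRY_RUN=0 to run them.
reassemble_from() {
    member=$1
    if [ "${DRY_RUN:-1}" = "0" ]; then
        # Same two commands as in the session above (overwrites mdadm.conf).
        mdadm --examine --scan "$member" > /etc/mdadm.conf && mdadm -A -s
    else
        echo "mdadm --examine --scan $member > /etc/mdadm.conf"
        echo "mdadm -A -s"
    fi
}
```

Running reassemble_from /dev/hdb8 prints the two commands for review; DRY_RUN=0 reassemble_from /dev/hdb8 runs them and needs root.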

3. Delete the whole storage device from the system and restart the computer.

During this test md0 was assembled and mounted at boot without any problems or delays. The following notice was also generated and logged via the syslog kernel facility:

kernel: raid1: raid set md0 active with 1 out of 2 mirrors
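
Because the array comes up silently here, the log message is the only hint that a mirror is missing. A filter like the one below could pull the numbers out of it. Treat it as a sketch: raid1_mirror_counts is a name I made up, and the pattern matches the message wording of this kernel (2.6.18); other kernel versions may word it differently.

```shell
# Read log lines on stdin and print "<array> <active> <total>" for every
# raid1 activation message of the form shown above.
raid1_mirror_counts() {
    sed -n 's/.*raid1: raid set \(md[0-9]*\) active with \([0-9]*\) out of \([0-9]*\) mirrors.*/\1 \2 \3/p'
}
```

Typical use would be dmesg | raid1_mirror_counts, or feeding it the syslog file where kernel messages end up.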

4. Delete and re-create the whole storage device and restart the computer.

During boot the RAID was not rebuilt automatically:

[root@node1 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hdb8[1]
      475136 blocks [2/1] [_U]

It has to be rebuilt manually:

[root@node1 ~]# mdadm -a /dev/md0 /dev/hdd1
mdadm: re-added /dev/hdd1

Still recovering :) :

[root@node1 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hdd1[0] hdb8[1]
      475136 blocks [2/1] [_U]
      [====>................]  recovery = 21.5% (103296/475136) finish=0.4min speed=12912K/sec
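
To follow the rebuild without re-reading the whole status by hand, the percentage can be extracted from the recovery line. A minimal sketch, assuming the line format shown above; the function name recovery_progress is my own.

```shell
# Print "<array> <percent>" for every array that is currently recovering.
# Reads mdstat-formatted text from the file given as $1
# (default: /proc/mdstat).
recovery_progress() {
    awk '
        # Remember the array name from lines like "md0 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # Progress line, e.g. "[====>....]  recovery = 21.5% (...)"
        /recovery *=/ && match($0, /[0-9]+(\.[0-9]+)?%/) {
            print name, substr($0, RSTART, RLENGTH - 1)
        }
    ' "${1:-/proc/mdstat}"
}
```

An array that has finished rebuilding has no recovery line, so the function prints nothing for it.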

NOTE: the partition type of the new partition should be set to ‘Linux raid autodetect’ (fd in hex). Frankly speaking, I didn’t test whether it works without that, so it may or may _not_ be a useless step. My advice is: just do it!

[root@node1 ~]# fdisk -l /dev/hdd

Disk /dev/hdd: 4294 MB, 4294967296 bytes
16 heads, 63 sectors/track, 8322 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hdd1               1         943      475240+  fd  Linux raid autodetect
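
The partition type can also be checked in a script before adding the partition back to the array. The helper below is a sketch that assumes the column layout of this fdisk version (Device, Boot, Start, End, Blocks, Id, System); newer fdisk releases print different columns, and the name partition_id is mine.

```shell
# Read `fdisk -l` output on stdin and print the Id column for device $1.
partition_id() {
    awk -v dev="$1" '
        $1 == dev {
            # A "*" in the Boot column shifts the remaining fields by one.
            id = ($2 == "*") ? $6 : $5
            print id
        }
    '
}
```

For example, fdisk -l /dev/hdd | partition_id /dev/hdd1 should print fd for a partition marked as Linux raid autodetect.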

P.S. System information:

Tests were performed in a virtual machine:
virtualbox-2.1.4
virtualbox-ose-guest-modules-2.6.26-1-686

Software installed inside VM:
CentOS(kernel-2.6.18-128.el5)
mdadm-2.6.9-2.el5
