I was attempting to recover an array of discs from an (inherited) SAN that had failed. Unfortunately there were no backups available, so I was on my own! The inner workings of the SAN were locked down, so I knew little about the data structures on the discs themselves, but I did at least know that the SAN ran on an old Linux kernel.
The array consisted of 4x 500GB drives in a RAID5 setup.
After plugging the drives into a server and booting a live Debian system, I first attempted to scan for the RAID devices:
sudo apt-get update && sudo apt-get install mdadm -y
mdadm --assemble --scan
This failed to assemble the array.
I proceeded by querying the discs for their SMART data with smartctl, in case any of them had failures:
sudo apt-get install smartmontools
for d in /dev/sd[abcd]; do sudo smartctl -a "$d"; done
Unfortunately the last disc failed SMART:
SMART overall-health self-assessment test result: FAILED!
Time was clearly against me. I proceeded by querying the proc filesystem to retrieve data about the RAID devices:
cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0]
md0 : inactive sdc4[1](S) sdd4[3](S) sda4[2](S) sdb4[0](S)
1949109760 blocks super 0.91
md101 : inactive sdb5[0](S) sdc5[3](S) sda5[2](S) sdd5[1](S)
1092096 blocks super 0.91
md100 : inactive sda2[3](S) sdd2[2](S) sdb2[0](S) sdc2[1](S)
2184448 blocks super 0.91
Here we can see the data array (made up of sda4, sdb4, sdc4 and sdd4). Also note the numbers in square brackets: these indicate the order of the discs in the array.
The output also indicates the arrays are 'inactive', i.e. not started.
We can also collect additional information about the RAID discs with:
mdadm -E /dev/sda
/dev/sda:
MBR Magic : aa55
Partition[0] : 32083 sectors at 47 (type 83)
Partition[1] : 1092420 sectors at 32130 (type 83)
Partition[2] : 1092420 sectors at 1124550 (type 05)
Partition[3] : 974555127 sectors at 2216970 (type 83)
I was specifically interested in Partition 3, the large data partition (/dev/sda4), so we can do:
mdadm -E /dev/sda4
/dev/sda4:
Magic : a92b4efc
Version : 0.91.00
UUID : 6210c5a6:a386fad4:4714843b:49d8ab79
Creation Time : Mon Jun 18 03:38:13 2007
Raid Level : raid5
Used Dev Size : 487277440 (464.70 GiB 498.97 GB)
Array Size : 1461832320 (1394.11 GiB 1496.92 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Reshape pos'n : 0
New Level : raid0
New Layout : left-asymmetric
New Chunksize : 0
Update Time : Wed Jan 2 01:52:19 2002
State : active
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : df64ea6b - correct
Events : 90
Layout : left-symmetric
Chunk Size : 64K
...
This provides some useful information, such as the RAID level, the member devices, the number of devices in the array (including how many are active) and the reshape position.
After some reading I discovered you can force the assembly through with a bogus backup file - e.g.:
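The command looks roughly like the following (the backup file path here is just a throwaway placeholder); mdadm's --backup-file / --invalid-backup options let the assembly proceed even though no real reshape backup exists:
mdadm --assemble --force --verbose /dev/md0 /dev/sd[abcd]4 --backup-file=/tmp/md0-backup.bak --invalid-backup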
After checking dmesg I noticed the following error message:
[60967.198812] md/raid:md0: not clean -- starting background reconstruction
[60967.198819] md/raid:md0: unsupported reshape required - aborting.
Since there was zero information about this error on the Internet, I ended up looking through the kernel's md/raid5 source code (drivers/md/raid5.c).
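If you want to find where the message comes from yourself, a quick grep over a kernel source tree does the job (assuming you have one checked out):
grep -rn "unsupported reshape required" drivers/md/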
The error message gets triggered when 'mddev->new_level != mddev->level'. mddev is a struct that holds information about an md RAID device. So basically it's telling us that if the existing RAID level does not equal the 'new' proposed level, the driver refuses to start the array.
This prompted me to go back over the earlier 'mdadm -E' (examine) output again and, lo and behold, I noticed that although the existing RAID level was set to RAID5 (as expected), the 'New Level' was set to RAID0!
So clearly it was failing because the conversion from RAID5 to RAID0 was not possible. But more importantly, I was concerned about why this was happening in the first place!
I ended up recreating the array with the following (note that this command does not delete the data on the array itself):
sudo mdadm --create /dev/md0 --metadata=0.91 --assume-clean --verbose --level=5 --raid-devices=4 /dev/sd[abcd]4 --chunk=64KB
Note: Ensure that the metadata version, device order, RAID level and chunk size match the original array! You can get this information from the examine switch, e.g. mdadm -E /dev/sda4.
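Something along these lines pulls the relevant fields (including each member's slot in the array, shown on the 'this' row) out of the examine output for all four members in one go:
for d in /dev/sd[abcd]4; do
  echo "== $d =="
  mdadm -E "$d" | grep -E 'Version|Raid Level|Chunk Size|Raid Devices|this'
done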
This command successfully started the RAID device:
mdadm: layout defaults to left-symmetric
mdadm: /dev/sda4 appears to be part of a raid array:
level=raid5 devices=4 ctime=Mon Jun 18 03:38:13 2007
mdadm: /dev/sdb4 appears to be part of a raid array:
level=raid5 devices=4 ctime=Mon Jun 18 03:38:13 2007
mdadm: /dev/sdc4 appears to be part of a raid array:
level=raid5 devices=4 ctime=Mon Jun 18 03:38:13 2007
mdadm: /dev/sdd4 appears to be part of a raid array:
level=raid5 devices=4 ctime=Mon Jun 18 03:38:13 2007
mdadm: size set to 487277440K
mdadm: automatically enabling write-intent bitmap on large array
Continue creating array? (y/n) y
mdadm: array /dev/md0 started.
cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0]
md0 : active raid5 sdd4[3] sdc4[2] sdb4[1] sda4[0]
1461832320 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
bitmap: 0/4 pages [0KB], 65536KB chunk
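Before going any further it's worth sanity-checking the re-created array's parameters (level, chunk size, device order) against the values from the earlier examine output:
mdadm -D /dev/md0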
I then checked the partition table:
parted /dev/md0
However, with no luck:
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Error: /dev/md0: unrecognised disk label
Model: Linux Software RAID Array (md)
Disk /dev/md0: 1497GB
Sector size (logical/physical): 512B/512B
Partition Table: unknown
As I was confident that there were partitions on this disk I used a tool called 'testdisk' to help me identify lost partitions:
apt-get install testdisk
As it's an interactive application I have described the process flow below:
testdisk >> Create >> 'Disk /dev/md0' >> 'Intel / PC' >> 'Analyze' >> 'Quick Search' >> Select the partition and then hit 'Write'.
The above process identified that there was an XFS partition.
Disk /dev/md126 - 1496GB / 1394 GiB - CHS 22841130 32 4
Partition Start End Size in sectors
Linux 4099 0 1 31 4 2922905600
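If your version of testdisk supports it, you can also get a non-interactive listing of whatever partitions it detects, which is handy as a quick second look:
sudo testdisk /list /dev/md0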
Verify the partition with parted:
parted /dev/md0
GNU Parted 3.2
Using /dev/md0
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: Linux Software RAID Array (md)
Disk /dev/md0: 1497GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number Start End Size Type File system Flags
1 269MB 1497GB 1497GB primary xfs boot
Check the filesystem for errors:
apt-get install xfsprogs
xfs_repair -n /dev/md0p1
Phase 1 - find and verify superblock...
xfs_repair: V1 inodes unsupported. Please try an older xfsprogs.
Darn. So I ended up downloading a really old CentOS live DVD, one with an xfsprogs version old enough to still support V1 inodes.
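As an aside, you can confirm which xfsprogs version an environment ships with before committing to it:
xfs_repair -V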
First, identify the RAID device (as its name might have changed):
cat /proc/mdstat
and we'll then attempt to repair the filesystem again (note: we are going to perform a read-only check first!)
xfs_repair -n /dev/md0p1
This time xfs_repair was failing to find the superblock, so I suspected that the partition's start / end sectors were incorrect - mostly because testdisk wasn't sure about the disk geometry (heads / sectors per cylinder), and changing that shifts the calculated start and end sectors. I ended up running another utility, 'UFS Explorer RAID Recovery', and it identified a different start sector (524672).
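Before rewriting the partition table, a candidate start sector can be sanity-checked by looking for the XFS superblock magic at that offset; for example, for the start sector 524672 reported above:
dd if=/dev/md0 bs=512 skip=524672 count=1 2>/dev/null | hexdump -C | head -n 1
# a valid XFS filesystem starts with the magic bytes "XFSB"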
So I decided to manually re-create the partition table:
fdisk /dev/md0
# delete the existing partition
d
# create a new primary partition, number 1
n
p
1
# first sector: 524672
524672
# last sector: +2922905600 (this is a size in sectors - NOT the actual end sector)
+2922905600
# write changes and exit
w
and again we attempt to run xfs_repair:
xfs_repair -n /dev/md0p1
Phase 1 - find and verify superblock... success!
Phase 2:
...
It looks much better now - however from the output I'd clearly lost some files.
Warning: At this point you should have a block-level backup of all of the discs in your array (ideally you should do this before doing anything at all), as from this point on you can really screw up and easily lose all of your data.
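If you don't already have one, a backup can be taken with ddrescue (or plain dd); a rough sketch, assuming a destination with enough free space is mounted at /mnt/backup:
sudo apt-get install gddrescue
for d in a b c d; do
  sudo ddrescue -d /dev/sd${d} /mnt/backup/sd${d}.img /mnt/backup/sd${d}.map
done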
We'll now try a read/write repair on the filesystem:
xfs_repair /dev/md0p1
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
So, as instructed, I tried mounting the filesystem to replay the log:
mkdir -p /mount/recovery
mount -t auto /dev/md0p1 /mount/recovery
mount: Structure needs cleaning
So it looks like we are going to have to discard the logs this time:
xfs_repair -L /dev/md0p1
Finally mount it with:
mount -t auto /dev/md0p1 /mount/recovery