I was attempting to recover an array of discs from an (inherited) SAN that had failed. Unfortunately there were no backups available, so I was on my own! The inner workings of the SAN were locked down, so I knew little about the data structures on the discs themselves, but I did at least know that the SAN ran on an old Linux kernel.
The array consisted of four 500GB drives in a RAID5 setup.
After plugging the drives into a server and booting a live Debian system, I first attempted to scan for the RAID devices:
sudo apt-get update && sudo apt-get install mdadm -y
mdadm --assemble --scan
This failed to assemble the arrays.
I proceeded by querying the discs' SMART data with smartctl, in case any of them had failures:
sudo apt-get install smartmontools
for disc in /dev/sd[abcd]; do smartctl -a "$disc"; done
Unfortunately the last disc failed its SMART self-assessment:
SMART overall-health self-assessment test result: FAILED!
Time was clearly against me... I proceeded by querying the proc filesystem to retrieve data about the RAID devices:
cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0]
md0 : inactive sdc4[1](S) sdd4[3](S) sda4[2](S) sdb4[0](S)
1949109760 blocks super 0.91
md101 : inactive sdb5[0](S) sdc5[3](S) sda5[2](S) sdd5[1](S)
1092096 blocks super 0.91
md100 : inactive sda2[3](S) sdd2[2](S) sdb2[0](S) sdc2[1](S)
2184448 blocks super 0.91
Here we can see the data array (made up of sda4, sdb4, sdc4 and sdd4). Also note the numbers in square brackets - these indicate the order of the discs in the array.
The output shows the arrays are 'inactive' and each member is currently flagged as a spare (S).
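mdadm's examine switch (used in more detail just below) can confirm this ordering directly - each 0.90/0.91 superblock records the slot its device occupies on the 'this' line. The grep here is just a convenience filter for that:
mdadm -E /dev/sd[abcd]4 | grep -E '^/dev|^this'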
We can also collect additional information about the RAID discs with:
mdadm -E /dev/sda
/dev/sda:
MBR Magic : aa55
Partition[0] : 32083 sectors at 47 (type 83)
Partition[1] : 1092420 sectors at 32130 (type 83)
Partition[2] : 1092420 sectors at 1124550 (type 05)
Partition[3] : 974555127 sectors at 2216970 (type 83)
I was specifically interested in Partition 3 (xfs) - so we can do:
mdadm -E /dev/sda4
/dev/sda4:
Magic : a92b4efc
Version : 0.91.00
UUID : 6210c5a6:a386fad4:4714843b:49d8ab79
Creation Time : Mon Jun 18 03:38:13 2007
Raid Level : raid5
Used Dev Size : 487277440 (464.70 GiB 498.97 GB)
Array Size : 1461832320 (1394.11 GiB 1496.92 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Reshape pos'n : 0
New Level : raid0
New Layout : left-asymmetric
New Chunksize : 0
Update Time : Wed Jan 2 01:52:19 2002
State : active
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : df64ea6b - correct
Events : 90
Layout : left-symmetric
Chunk Size : 64K
...
This provides some useful information such as the RAID level, the member devices, the number of devices in the array (including how many are active) and the reshape position.
After some reading I discovered you can try to force the assembly through with a bogus backup file.
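Something along these lines should do it (a sketch - the backup file path is just a placeholder, and --invalid-backup tells mdadm to carry on even though the file contains no usable reshape data):
mdadm --assemble --verbose /dev/md0 /dev/sd[abcd]4 --backup-file=/tmp/bogus-backup --invalid-backup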
This still failed, and after checking dmesg I noticed the following error message:
[60967.198812] md/raid:md0: not clean -- starting background reconstruction
[60967.198819] md/raid:md0: unsupported reshape required - aborting.
Since there was zero information about this error on the Internet, I ended up looking through the kernel's md/raid5 source code (drivers/md/raid5.c), where the message is generated.
The error gets triggered when 'mddev->new_level != mddev->level'. mddev is a struct that holds information about an md (RAID) device, so essentially the driver refuses to assemble the array when the existing RAID level does not match the 'new' (target) level recorded in the superblock.
This prompted me to go back over the earlier 'mdadm -E' (examine) output again and, lo and behold, I noticed that although the existing RAID level was set to RAID5 (as expected), the 'New Level' was set to RAID0!
So clearly assembly was failing because an in-place conversion from RAID5 to RAID0 is not supported. More importantly, I was concerned about why this reshape had been recorded in the first place!
I ended up recreating the array. Note that with --assume-clean and the correct parameters this command only rewrites the RAID superblocks - it does not touch the data on the array itself:
sudo mdadm --create /dev/md0 --metadata=0.91 --assume-clean --verbose --level=5 --raid-devices=4 /dev/sd[abcd]4 --chunk=64KB
Note: Ensure that the metadata version, device order, RAID level and chunk size all match the original array! You can get this information from the examine switch, e.g. mdadm -E /dev/sda4.
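A quick way to pull those values out of the existing superblocks before recreating (just a convenience filter - adjust the pattern to taste):
mdadm -E /dev/sd[abcd]4 | grep -E 'Version|Raid Level|Raid Devices|Chunk Size'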
Running this command recreated and started the RAID device:
mdadm: layout defaults to left-symmetric
mdadm: /dev/sda4 appears to be part of a raid array:
level=raid5 devices=4 ctime=Mon Jun 18 03:38:13 2007
mdadm: /dev/sdb4 appears to be part of a raid array:
level=raid5 devices=4 ctime=Mon Jun 18 03:38:13 2007
mdadm: /dev/sdc4 appears to be part of a raid array:
level=raid5 devices=4 ctime=Mon Jun 18 03:38:13 2007
mdadm: /dev/sdd4 appears to be part of a raid array:
level=raid5 devices=4 ctime=Mon Jun 18 03:38:13 2007
mdadm: size set to 487277440K
mdadm: automatically enabling write-intent bitmap on large array
Continue creating array? (y/n) y
mdadm: array /dev/md0 started.
cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0]
md0 : active raid5 sdd4[3] sdc4[2] sdb4[1] sda4[0]
1461832320 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
bitmap: 0/4 pages [0KB], 65536KB chunk
I then checked the partition table with parted:
parted /dev/md0
However with no luck:
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Error: /dev/md0: unrecognised disk label
Model: Linux Software RAID Array (md)
Disk /dev/md0: 1497GB
Sector size (logical/physical): 512B/512B
Partition Table: unknown
As I was confident that there were partitions on this disk I used a tool called 'testdisk' to help me identify lost partitions:
apt-get install testdisk
As it's an interactive application I have described the process flow below:
testdisk >> Create >> 'Disk /dev/md0' >> 'Intel / PC' >> 'Analyze' >> 'Quick Search' >> Select the partition and then hit 'Write'.
The above process identified that there was an XFS partition.
Disk /dev/md126 - 1496GB / 1394 GiB - CHS 22841130 32 4
Partition Start End Size in sectors
Linux 4099 0 1 31 4 2922905600
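Incidentally, testdisk can also list what it finds non-interactively, which is handy for a quick sanity check (assuming your testdisk build supports the /list switch):
testdisk /list /dev/md0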
Verify the partition with parted:
parted /dev/md0
GNU Parted 3.2
Using /dev/md0
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: Linux Software RAID Array (md)
Disk /dev/md0: 1497GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number Start End Size Type File system Flags
1 269MB 1497GB 1497GB primary xfs boot
Check the filesystem for errors:
apt-get install xfsprogs
xfs_repair -n /dev/md0p1
Phase 1 - find and verify superblock...
xfs_repair: V1 inodes unsupported. Please try an older xfsprogs.
Darn. So I ended up downloading and booting a really old CentOS live DVD, which ships an xfsprogs old enough to still understand V1 inodes.
Firstly, identify the RAID device again (as the name may have changed on the new system):
cat /proc/mdstat
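If the array did not auto-assemble on the new live environment, mdadm can show what the on-disk superblocks describe and then reassemble it (device names may well differ here):
mdadm --examine --scan
mdadm --assemble --scan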
We'll then attempt to repair the filesystem again (note: we are going to perform a read-only check first!):
xfs_repair -n /dev/md0p1
This time xfs_repair was failing to find the superblock, so I suspected that the start / end sectors might be incorrect - mostly because testdisk wasn't sure about the number of tracks per cylinder, and changing that affects the start and end sectors. I ended up running another utility, 'UFS Explorer RAID Recovery', and it identified a different start sector.
So I decided to manually re-create the partition table:
fdisk /dev/md0
# delete the existing partition
d
# create a new primary partition
n
p
1
# first sector:
524672
# last sector (this is the size in sectors - NOT the actual end sector):
+2922905600
# write the changes
w
and again we attempt to run xfs_repair:
xfs_repair -n /dev/md0p1
Phase 1 - find and verify superblock... success!
Phase 2:
...
It looks much better now - however from the output I'd clearly lost some files.
Warning: At this point you should have a block-level backup of every disk in the array (ideally taken before touching anything at all), as from here on it is very easy to screw up and lose all of your data.
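For reference, one way to take those images is with GNU ddrescue, which handles failing drives more gracefully than plain dd (the destination paths below are just placeholders and must live on separate, healthy storage):
apt-get install gddrescue
ddrescue -n /dev/sda /mnt/backup/sda.img /mnt/backup/sda.map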
We'll now try a read/write repair on the filesystem:
xfs_repair /dev/md0p1
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
As suggested, I tried mounting the filesystem so the log could be replayed:
mkdir -p /mount/recovery
mount -t auto /dev/md0p1 /mount/recovery
mount: Structure needs cleaning
So it looks like we are going to have to discard the log this time:
xfs_repair -L /dev/md0p1
Finally mount it with:
mount -t auto /dev/md0p1 /mount/recovery
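With the filesystem finally mounted, the obvious next step is to copy everything off to known-good storage straight away (the destination path below is just a placeholder):
rsync -a /mount/recovery/ /mnt/safe-storage/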