mysterious error messages, part 1

This may or may not be the first in a series of posts in which a strange, unknown error turns up and a non-obvious solution is found.

This particular error message came after creating a software RAID device:


# mdadm --create /dev/md7 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
mdadm: /dev/sdc1 is too small: 0K

I had just partitioned the disks with fdisk and set the partition type; sfdisk -l on the disk gave the correct output. Nobody else appeared to have a solution to this: a couple of posts with the same query had gone unanswered.

It turns out that, for the first time ever for me, the kernel didn’t reread the partition table properly when fdisk wrote out the new one, just as the perpetual fdisk warning says can happen. This only happened on sdc, not on sdd.

I figured this out with mke2fs’s much more explanatory error message:


# mke2fs /dev/sdc1
mke2fs 1.35 (28-Feb-2004)
mke2fs: Device size reported to be zero. Invalid partition specified, or
partition table wasn't reread after running fdisk, due to
a modified partition being busy and in use. You may need to reboot
to re-read your partition table.

The fix was to run fdisk one more time and just say ‘w’ to write out the partition table again, which (more importantly) makes the ioctl() call again so the kernel rereads the partition table, this time properly. If that hadn’t worked, the next step would have been to reboot, but I didn’t want to. (As Saif said in the previous post, rebooting is for adding new hardware.)
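If you'd rather not go back into fdisk just to trigger the reread, something along these lines might do it. This is only a sketch: kernel_part_blocks is a made-up helper name, it assumes the usual four-column /proc/partitions layout (major, minor, #blocks, name), and it assumes blockdev from util-linux is installed. The reread needs root and can still fail if a partition on the disk is in use.

```shell
#!/bin/sh
# Hypothetical helper: print the size (in 1K blocks) the kernel currently
# has on record for a partition, or 0 if the kernel does not list it.
#   $1 = partition name without /dev/ (e.g. sdc1)
#   $2 = table to read (defaults to /proc/partitions)
kernel_part_blocks() {
    awk -v name="$1" '
        $4 == name { print $3; found = 1 }
        END { if (!found) print 0 }
    ' "${2:-/proc/partitions}"
}

# Intended use, as root (device names are the ones from this post):
#   kernel_part_blocks sdc1         # 0 suggests the kernel has a stale table
#   blockdev --rereadpt /dev/sdc    # ask the kernel to re-read the table
#   kernel_part_blocks sdc1         # should now show the real size
```

If the number is still 0 after the reread, that is probably the point where a reboot becomes hard to avoid.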

As I find more of these non-obvious error messages and their solutions, I’ll try to post about them. Hope this helps someone out.

ghetto raid scrubbing with linux

As a follow-up to my adventures with Linux RAID scrubbing (or lack thereof), I decided to poke around a bit more this weekend after a filesystem started throwing some errors.

It appears that someone did fix at least part of the issue I ran into (a memcpy() was left out of the kernel’s repair code), but I’m not planning on installing that kernel for a while. Not without some serious testing, or perhaps not until the fix is applied in a RedHat/CentOS update kernel.

However, I did come up with something that may work as a very ghetto software RAID1 verification technique. (The following keywords should help someone google this post: linux software raid verify scrub oh shit.)

Here’s what you do. First, find the size of the mirror from /proc/mdstat:

md5 : active raid1 hdd1[1] hdb1[0]
      58613056 blocks [2/2] [UU]

Multiply the number of blocks by 1024:

[root@linux] # bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'. 
58613056*1024
60019769344
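The lookup and the multiplication can be rolled into one step without bc. A minimal sketch, assuming the block count is the first field on the line right after the "mdN : active ..." line in /proc/mdstat, as in the output above; md_data_bytes is a hypothetical helper name, not a standard tool.

```shell
#!/bin/sh
# Hypothetical helper: print the size of an md array in bytes.
#   $1 = array name as shown in /proc/mdstat (e.g. md5)
#   $2 = file to read (defaults to /proc/mdstat)
md_data_bytes() {
    blocks=$(awk -v md="$1" '
        $1 == md && $2 == ":" { want = 1; next }  # the "md5 : active ..." line
        want { print $1; exit }                   # next line starts with the block count
    ' "${2:-/proc/mdstat}")
    [ -n "$blocks" ] || return 1                  # array not found
    # mdstat reports 1K blocks, so multiply by 1024 to get bytes.
    echo $(( blocks * 1024 ))
}

# Intended use:
#   md_data_bytes md5
```

The multiplication is done by the shell rather than awk because some awk implementations print large numbers in exponent notation.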

Then, run cmp on the two devices that make up the mirror:

[root@linux] # cmp /dev/hdb1 /dev/hdd1
/dev/hdb1 /dev/hdd1 differ: byte 60019769345, line 246090365

If the byte at which the two devices differ is a higher number than the one you came up with using bc, it means both mirrors contain the same data. (From what I can tell, the first difference then falls in the area at the end of the device where the raid metadata/superblock sits.)
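That comparison is easy to get wrong by eye with eleven-digit numbers, so here is a rough sketch of letting the shell do it. It assumes cmp prints its usual one-line "differ: byte N, line M" message (some older versions say "char" instead of "byte") and prints nothing when the inputs are identical; mirror_verdict is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: decide whether cmp's first difference lies past the
# end of the mirrored data.
#   $1 = data size in bytes (blocks from /proc/mdstat times 1024)
#   $2 = cmp's one-line output (empty if cmp found no difference at all)
mirror_verdict() {
    data_bytes=$1
    # Pull the byte offset out of "... differ: byte 60019769345, line ..."
    differ=$(printf '%s\n' "$2" |
        sed -n 's/.* differ: [a-z]* \([0-9][0-9]*\),.*/\1/p')
    if [ -z "$differ" ] || [ "$differ" -gt "$data_bytes" ]; then
        echo "OK"
    else
        echo "MISMATCH at byte $differ"
    fi
}

# Intended use, as root:
#   mirror_verdict 60019769344 "$(cmp /dev/hdb1 /dev/hdd1)"
```

cmp counts bytes starting from 1, so a first difference at exactly data size plus one is the first byte after the mirrored data.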

If the byte number cmp reports is not higher, you can probably do a more extended test with cmp -l to find out what data differs and how many differences there are. Not sure how to repair at that point; if you feel lucky, you might be able to do some kind of block editing (and guess the value that block should be), but I’m not about to try that part.
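For the extended test, cmp -l prints one line per differing byte (the byte number, then the two byte values in octal), including all the expected differences in the superblock area. A small sketch for counting only the ones that fall inside the mirrored data; count_data_diffs is a hypothetical name, and the limit is the byte count worked out earlier.

```shell
#!/bin/sh
# Hypothetical helper: read "cmp -l" output on stdin and count the
# differing bytes at or below the given byte offset.
#   $1 = data size in bytes
count_data_diffs() {
    awk -v limit="$1" '
        $1 + 0 <= limit + 0 { n++ }   # difference inside the mirrored data
        END { print n + 0 }           # prints 0 when there were none
    '
}

# Intended use, as root:
#   cmp -l /dev/hdb1 /dev/hdd1 | count_data_diffs 60019769344
```

Be aware that cmp -l on a badly out-of-sync mirror can produce a very large amount of output, which is one reason to count rather than print it.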

Part of the point of scrubbing is to read every byte of data from every disk and make sure there aren’t any read errors; if there are, the read should throw a kernel error which shows up in logs or, with IDE, might allow the drive firmware to reallocate a block that has a soft error in it (which will show up in smartd’s output).
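cmp already reads both halves of the mirror, but if all you want is the "read every byte" part on a single device, a plain dd to /dev/null should do. A minimal sketch; read_whole_device is a hypothetical name, and the 1024k block size is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: read a device (or file) from start to end and
# discard the data. Returns non-zero if dd could not read all of it.
#   $1 = device or file to read
read_whole_device() {
    dd if="$1" of=/dev/null bs=1024k 2>/dev/null
}

# Intended use, as root:
#   read_whole_device /dev/hdb1 || echo "read errors on /dev/hdb1, check the logs"
```

This only reads; it never writes to the device, so any repair is left to the kernel and the drive firmware.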

Note that this will only work with RAID1; RAID5 lays out data differently, in stripes of data and parity, so you’d have to do parity calculations as well as figure out where the parity blocks are. It could probably be done with some programming, but that’s left as an exercise for the reader }:>.

So yeah, it’s really ghetto, but it appears to work. And now I don’t feel like I’m flying 100% blind, not knowing whether my mirrors are really mirrors. If I feel industrious, I’ll probably put this into a shell script and start running it weekly or something.