Nov 21 2010
It turned out in the end that one of the disks was passing all SMART tests, but yet an area of the surface was no longer recording correctly – I ended up writing zeros to the entire disk, but then found that an area several hundred megabytes in size and several hundred megabytes in was reading back 0x07fs. Note that even RAID5/6 arrays with distributed parity only use this data for reconstruction if a disk has failed: reads are not parity-checked under usual operating conditions, and so corruption can still creep in… as unfortunately happened in this case.
“colossus” – now up to Mk3 by my counting – is the system in the Yeong Yang YY-0221 chassis, which currently consists of a 3.2GHz Phenom II X4 955 Black Edition with 4GB of ECC DDR3 memory in a Gigabyte GA-MA790FXT-UD5P mainboard.
“alexandria”, first mentioned in this post over four years ago now, is at the top of the pile. It’s a low-power 1.2GHz VIA Nehemiah C7 Mini-ITX board paired with a Marvell-based SuperMicro AOC-SAT2-MV8 connected to a standard 32bit 33MHz (133MB/s) PCI socket. This machine actually boots from compact-flash mirrored into memory in order to kickstart the array. The wooden box is a self-made enclosure (originally from and abortive attempt to build an entire case from first-principles) for the 5x 1TB disks of the array, with the other visible disks being the 5x 250GB drives which made up the original incarnation of the same array. Checking the new content against the old was invaluable… when 1TB drives were the biggest on the market, just how to you backup 4TB of data?