From ThirdMartini

The stock implementation of the Linux kernel raid1 driver does not deal well with UREs (unrecoverable read errors). When a request to a disk in a raid1 set encounters an IO error (typically a URE or bad sector), the entire disk is deemed faulty and ejected from the raid set. This can be problematic if the disks have been running for a long time, as disk drives develop sporadic bad sectors over time. It is a bad idea to eject an entire disk because of a single read error, as the mirror disk may also have several bad sectors, in areas that do not overlap with the bad sectors on the first disk.

This patch modifies the error handling behavior of the raid1 driver by always retrying a failed read on the paired drive. If the retried read on the other drive is successful, we then write the buffer back to the drive that had the IO error. This forces the drive that encountered the IO error to remap the bad sector to a sector from its spare list. We only eject a disk that had an IO error under two conditions: (1) the IO error was encountered during a write, or (2) the retried read also failed.

In the case of a write IO, if the IO failed, the disk is either out of spare sectors or just plain dead (e.g. controller failure or disk unplugged).

In the case of a failed retry IO on the pair disk, we need to eject one of the drives to be able to continue, so we simply handle the error in the same manner as the stock raid1 driver would.
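The policy described above can be sketched as a small userspace simulation. This is not the patch itself and does not use any kernel interfaces: the names `sim_disk`, `sim_read`, `sim_write`, `rr_read` and the `rr_result` values are hypothetical, invented only for illustration. It also assumes that when the retried read fails, the disk that failed first is the one ejected; the text above only says this case is handled the way the stock driver would.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SECTORS 8

/* Hypothetical model of one mirror member; not a kernel structure. */
struct sim_disk {
    bool bad[NUM_SECTORS]; /* true: reading this sector gives an IO error */
    int  spares_left;      /* spare sectors still available for remapping */
    bool ejected;          /* true once the disk is removed from the set  */
};

/* Possible outcomes of one read request under the patched policy. */
enum rr_result {
    RR_READ_OK,     /* first read succeeded, nothing else to do           */
    RR_REPAIRED,    /* read failed, retry succeeded, write-back remapped  */
    RR_EJECT_WRITE, /* write-back failed, so the disk was ejected         */
    RR_EJECT_RETRY  /* retry on the pair disk also failed, disk ejected   */
};

/* A read fails on an ejected disk or on a bad sector. */
static bool sim_read(const struct sim_disk *d, size_t sector)
{
    return !d->ejected && !d->bad[sector];
}

/* A write to a bad sector succeeds only if the drive can remap it. */
static bool sim_write(struct sim_disk *d, size_t sector)
{
    if (d->ejected)
        return false;
    if (d->bad[sector]) {
        if (d->spares_left == 0)
            return false;       /* out of spare sectors */
        d->spares_left--;
        d->bad[sector] = false; /* sector remapped to a spare */
    }
    return true;
}

/* Read one sector from `primary`, falling back to `mirror` on error. */
static enum rr_result rr_read(struct sim_disk *primary,
                              struct sim_disk *mirror, size_t sector)
{
    assert(sector < NUM_SECTORS);

    if (sim_read(primary, sector))
        return RR_READ_OK;

    /* Always retry the failed read on the paired drive. */
    if (!sim_read(mirror, sector)) {
        /* Assumption: eject the disk that failed first. */
        primary->ejected = true;
        return RR_EJECT_RETRY;
    }

    /* Write the good data back to force a remap of the bad sector. */
    if (!sim_write(primary, sector)) {
        primary->ejected = true;
        return RR_EJECT_WRITE;
    }
    return RR_REPAIRED;
}
```

In this model, a disk with one bad sector and at least one spare left is repaired in place and stays in the set, whereas the stock behavior described earlier would have ejected it on the first read error. A disk with no spares left is ejected when the write-back fails.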

Download

    Filename        Version     Description
    raid-rr Patch   2.4.26-1    Raid 1 Read-Error Recovery Patch