Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!ucbvax!agate!saturn!mcvax!tel2.vtt.fi!savela@uunet.UU.NET
From: mcvax!tel2.vtt.fi!savela@uunet.UU.NET (Markku Savela)
Newsgroups: comp.os.research
Subject: Re: References for Fault Tolerent, "safe" file system
Message-ID: <7640@saturn.ucsc.edu>
Date: 23 May 89 20:12:54 GMT
Sender: usenet@saturn.ucsc.edu
Organization: Technical Research Centre of Finland
Lines: 26
Approved: comp-os-research@jupiter.ucsc.edu


In article <7597@saturn.ucsc.edu>, moscom!adp@cs.rochester.edu (Alan Percy) writes:
> 
> We where going to use dual hard disks and controllers.  The system
> would have the dual media and a driver that would write to both,
> but read from only one.  If a media failure was detected the
> backup disk would be read from.  The bad track on the primary would
> be reassigned and rewritten with data from the backup.

    This method was an option in a PDP-11 based multiuser operating
system which we designed in the 70's at my previous employer. One
additional detail has to be noted:

   - if a media failure is detected, no further attempts should be
     made on this disk. The system should revert to backup only.
     All kinds of havoc may result if the failure is transient.
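
    A minimal sketch of that rule, in Python rather than anything we
ran on the PDP-11. The Disk, DiskError and Mirror names are made up
for illustration, and the "disks" are just in-memory dictionaries. The
point is only that the first failure on the primary sets a flag which
is never cleared while the system is running:

```python
class DiskError(Exception):
    """Raised by a (simulated) disk on a media failure."""


class Disk:
    """In-memory stand-in for a disk; `failing` simulates a fault."""

    def __init__(self):
        self.blocks = {}
        self.failing = False

    def read(self, n):
        if self.failing:
            raise DiskError("read failed")
        return self.blocks[n]

    def write(self, n, data):
        if self.failing:
            raise DiskError("write failed")
        self.blocks[n] = data


class Mirror:
    """Writes to both disks, reads from the primary until it fails once."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.primary_dropped = False  # once True, never reset here

    def write(self, n, data):
        if not self.primary_dropped:
            try:
                self.primary.write(n, data)
            except DiskError:
                self.primary_dropped = True  # no further attempts
        self.backup.write(n, data)

    def read(self, n):
        if not self.primary_dropped:
            try:
                return self.primary.read(n)
            except DiskError:
                self.primary_dropped = True  # no further attempts
        return self.backup.read(n)
```

    If the primary recovers by itself after missing a write, it holds
old data. Because the flag is sticky, reads keep going to the backup
instead of returning that old data.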

    The "dual write" option wasn't very popular, although some
sites used it. The trouble was just those transient errors (or
someone hitting "write protect" or "off line" accidentally).
The system reverted fully to backup and nobody noticed anything.
And, naturally, nobody read the error messages on the console,
so the next time the system was booted, users had trashed disks,
because the now out-of-date primary disk was again in use... :-(  I guess the
backup disk should have had some mark that the primary has
been dropped, but we never got to implement that.
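
    Something like the following is what I have in mind for that
mark. It is only a sketch of an idea we never built: the names are
hypothetical, and backup_meta is a plain dictionary standing in for a
reserved block on the backup disk that survives a reboot.

```python
PRIMARY_DROPPED = "primary_dropped"  # hypothetical marker name


def drop_primary(backup_meta):
    """Record on the backup disk that the primary is no longer trusted."""
    backup_meta[PRIMARY_DROPPED] = True


def disks_to_use_at_boot(backup_meta):
    """Decide at boot which disks may be used, based on the mark."""
    if backup_meta.get(PRIMARY_DROPPED):
        return ["backup"]
    return ["primary", "backup"]


def resync_done(backup_meta):
    """Clear the mark after the primary has been rewritten from backup."""
    backup_meta.pop(PRIMARY_DROPPED, None)
```

    With the mark in place, a reboot after a dropped primary would
come up on the backup alone until somebody copied the backup over the
primary and cleared the mark.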