Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!elroy.jpl.nasa.gov!sdd.hp.com!zaphod.mps.ohio-state.edu!unix.cis.pitt.edu!dsinc!netnews.upenn.edu!vax1.cc.lehigh.edu!lusol
From: lusol@vax1.cc.lehigh.edu
Newsgroups: comp.arch
Subject: Re: What you need for crash recovery
Message-ID: <195.27b13234@vax1.cc.lehigh.edu>
Date: 7 Feb 91 15:55:47 GMT
Organization: Lehigh University
Lines: 43

Jim Giles talks about U*X job recovery after a system crash, a line
disconnect, or a program error such as a divide fault. The various
replies indicate that U*X does a poor job in these situations, although
some flavors can handle some of them properly. For instance, Dik T.
Winter mentions that UNICOS has automatic job recovery after a crash,
and Colin Plumb mentions undump and Mach's macho file format for
recovering an aborted job. But it seems there is NO CONSISTENCY in the
U*X world with regard to job recovery in general.

There is an operating system I use that handles all of these situations
very nicely.

1) Job recovery after a system crash

   The operating system supports active job recovery; there is no need
   to periodically write checkpoint files on a job-by-job basis. It
   even recovers after most hardware failures, except a loss of power
   to the machine room. With this feature you can deadstart the machine
   at any time, for whatever reason, and not lose your ANSYS and ADINA
   grinders.

2) Job recovery after a line disconnect

   When you log in, the operating system automatically displays a list
   of your detached jobs. You either select a detached job or continue
   your current session. There is no way another user can gain control
   of your job.

3) Program recovery after a fault

   Run under control of the debugger and you can change the necessary
   variables and restart the job. Simple. No undumping and converting
   core files to executables.

My question: what is the situation in the U*X world with regard to
these three problems? Which flavors can do what?
Is there any flavor that handles these situations as nicely as CDC's
NOS/VE OS (-:?

Steve
Lehigh University Computing Center
Stephen.O.Lidie@CDC1.CC.Lehigh.EDU