Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!ucsd!hub.ucsb.edu!spectrum.CMC.COM!lars
From: lars@spectrum.CMC.COM (Lars Poulsen)
Newsgroups: comp.unix.wizards
Subject: Re: Checkpoint/Restart
Message-ID: <1990Aug22.191258.19072@spectrum.CMC.COM>
Date: 22 Aug 90 19:12:58 GMT
References: <24239@adm.BRL.MIL>
Organization: Rockwell CMC
Lines: 87

In article <24239@adm.BRL.MIL> mike@BRL.MIL (Mike Muuss) writes:
> Checkpoint/restart in any non-trivial I/O environment is *hard*.

True, indeed. It is fairly instructive to look at how (other?)
commercial operating systems have dealt with this issue. As could be
expected, there is a wide variety of checkpoint/restart implementations.

The earliest checkpoint/restart implementations, in the days of
single-user machines, were just memory dumps, with tape drive
repositioning and a way to notify the application that it had been
restarted. IBM70{4,9,40,90,94} type stuff. CDC3600 SCOPE.

When direct-access storage came along, it was originally small, and
used for temporary files; so it was copied to the checkpoint tape.

The checkpoint system that I know best - UNIVAC 1100 EXEC-8 - is of
this type. A checkpoint file is usually a tape file, containing a
memory image, all spool files (input and output) and all temporary
files. File pointers are not an issue, since all permanent disk files
are direct access files (the read/write calls have a file position in
them) so "file pointers" live in user space. Even so, the checkpoints
were complex enough that my installation (an academic computing center)
disabled the checkpoint facility, since ill-structured checkpoint
restarts often crashed the system. (How about restarting from a
checkpoint taken on a different system - or before last week's sysgen?)

Interestingly enough, EXEC-8 retrograded in later releases to provide a
lesser checkpoint (memory image only) known as a "partial checkpoint"
as a cheaper and safer alternative.

> ...
> The real source of difficulty in checkpoint/restart comes from
>interfaces to "stateful" resources, like:

Yes, there is a TON of state information to be preserved. For all but
trivial tasks, this involves many megabytes of file space.

>
>*) Tape drives. Need to get the right reel back, in the right position.

Easy, compared to the other stuff.

>*) Terminals. All the terminal modes should be saved and restored.
>What about other processes that might have come along in the meantime
>and started using the terminal, on restart?

Indeed, the semantics of shared terminal devices are a great source of
implementation problems. This is probably a mis-feature.

>*) Network connections. The system can't keep the connection ...

Agreed. Other than the controlling terminal, network connections should
be banned. And the controlling terminal should be a disconnectable
virtual terminal like VMS' VTAxxx: device.

>*) Temporary files. If the process depends on files in /tmp ...

The biggest problem here is that UNIX does not know the concept of
temporary files. A _real_ temporary file is what you have after

	fd = creat("/tmp/xxxx" ...
	unlink("/tmp/xxxx");

But UNIX would have no way of restoring such a beast, I think.

>Therefore, I assert that it is the state of the I/O system, not the state
>of the UNIX processes, that is hard to checkpoint. Indeed, it is trivial
>to checkpoint file pointers, PID's, and other aspects of the *process*
>state. It isn't too hard to make sure that files have not changed
>between checkpoint and restart times.

But in many cases you DO want to change the file. Sometimes the failure
you are recovering from was caused by bad data in a permanent file. You
want to be able to fix the bad record and then restart from the last
checkpoint before that record was seen.

The biggest can of worms has not even been touched upon here: What
about the state of a large DBMS that the checkpointed process may be
accessing?
Do you want to restore it to the state when the checkpoint was
taken, thus backing out all updates since the large job failed? When
the job failed, were all transactions performed by the job backed out?
If so, the before-and-after-looks need to be part of the checkpoint so
they can be re-installed. What if those records have been updated since
the checkpoint?

The biggest jobs, which need checkpoints the most, provide the biggest
cans of worms.
--
/ Lars Poulsen, SMTS Software Engineer
  CMC Rockwell  lars@CMC.COM
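
A footnote on the "_real_ temporary file" idiom from the /tmp
discussion above, as a minimal sketch rather than a definitive
implementation. The function names and the path template are made up
for illustration, and mkstemp() is swapped in for the creat() call in
the post because it also picks a unique name. The point it shows: once
the file is unlinked, no directory entry names it, so the only handle
is the open descriptor inside the process. A checkpoint would
presumably have to carry the file contents itself, much as the EXEC-8
checkpoint tape carried temporary files.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a scratch file and unlink it at once (hypothetical helper).
 * Returns an open descriptor, or -1 on error.
 */
int open_unlinked_temp(void)
{
    char path[] = "/tmp/ckptXXXXXX";   /* illustrative name template */
    int fd = mkstemp(path);            /* creates and opens the file */

    if (fd < 0)
        return -1;
    if (unlink(path) != 0) {           /* drop the only directory entry */
        close(fd);
        return -1;
    }
    return fd;
}

/* Number of hard links still naming the open file, or -1 on error. */
long link_count(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    return (long)st.st_nlink;
}

/*
 * Write msg to the file and read it back through the same descriptor.
 * Returns 0 if the data round-trips into out, -1 otherwise.
 */
int scratch_round_trip(int fd, const char *msg, char *out, size_t outsz)
{
    size_t len;

    assert(msg != NULL && out != NULL);
    len = strlen(msg);
    if (len + 1 > outsz)
        return -1;
    if (write(fd, msg, len) != (ssize_t)len)
        return -1;
    if (lseek(fd, 0, SEEK_SET) != 0)   /* rewind to the start */
        return -1;
    if (read(fd, out, len) != (ssize_t)len)
        return -1;
    out[len] = '\0';
    return 0;
}
```

On a local file system the link count should read as zero after the
unlink, while reads and writes through the descriptor keep working
until it is closed. At that point the system reclaims the storage, and
there is no name left under which a restart could reopen the file.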