Xref: utzoo comp.unix.questions:24409 comp.sys.sequent:683
Path: utzoo!attcan!uunet!mcsun!ukc!strath-cs!baird!jim
From: jim@cs.strath.ac.uk (Jim Reid)
Newsgroups: comp.unix.questions,comp.sys.sequent
Subject: Re: Checkpoints for large jobs
Message-ID:
Date: 7 Aug 90 14:32:06 GMT
References: <3193@syma.sussex.ac.uk>
Sender: jim@cs.strath.ac.uk
Organization: Computer Science Dept., Strathclyde Univ., Glasgow, Scotland.
Lines: 25
In-reply-to: william@syma.sussex.ac.uk's message of 6 Aug 90 16:06:18 GMT

In article <3193@syma.sussex.ac.uk> william@syma.sussex.ac.uk (William Craven) writes:

   I was wondering whether there is a system which will allow a job to
   start off from where it was last killed, either by means of
   checkpointing or setjmp/longjmp. If there is such a scheme I would
   be grateful for pointers.

Yes. Unix programs have the symbols end, etext and edata, whose
addresses are respectively the ends of the uninitialized data, text and
initialized data "segments" of the process's address space. All that's
needed is to write out the data space and somehow bodge a stack pointer
using setjmp/longjmp. When the process is restarted, it uses malloc to
grow the data space if needed and then reads the file containing the
dumped data. The process then has to re-open the files it had open
before the dump and finally do a longjmp to put the stack back to a
known state before resuming execution. See end(3). This is more or less
what sendmail does to create a frozen configuration file.

On Sequents, all bets are off if the process has used shared/private
memory with lightweight processes created by m_fork(3).

The formats of executable files and core dumps are given by the man
pages for a.out and core, though these files are not nice to poke
around in.

		Jim
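
The dump/reload/longjmp sequence described above can be sketched in C. This is only an illustration under several assumptions, not what sendmail actually does: it assumes a linker that defines etext, edata and end as in end(3) (Linux/ELF, for instance), it saves only the region between edata and end rather than the whole data space, and it reloads the image inside the same process instead of a restarted one, so a changing load address, growing the data space and re-opening files are all left out. The names save_region, restore_region, checkpoint_demo and work_done are made up for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Linker-defined symbols, see end(3). Only their addresses mean anything;
 * declaring them as arrays is an assumption that keeps the arithmetic tidy. */
extern char etext[], edata[], end[];

static jmp_buf resume_point;     /* uninitialized, so it sits in [edata, end) */
static volatile long work_done;  /* hypothetical program state, also there */

/* Number of bytes between the end of initialized data and the end of bss. */
static size_t region_len(void)
{
    assert((uintptr_t)edata <= (uintptr_t)end);
    return (size_t)((uintptr_t)end - (uintptr_t)edata);
}

/* Write [edata, end) to path. Returns 0 on success, -1 on error. */
static int save_region(const char *path)
{
    size_t len = region_len(), done = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    while (done < len) {
        ssize_t n = write(fd, edata + done, len - done);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }
    return close(fd);
}

/* Read the saved image back over [edata, end). Returns 0 or -1. */
static int restore_region(const char *path)
{
    size_t len = region_len(), done = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    while (done < len) {
        ssize_t n = read(fd, edata + done, len - done);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }
    return close(fd);
}

/* Checkpoint, lose some "work", reload the image, then longjmp back.
 * setjmp and longjmp stay in one live stack frame, because the stack
 * itself is never saved. Returns the restored state, or -1 on I/O error. */
long checkpoint_demo(const char *path)
{
    work_done = 3;                    /* state at checkpoint time */
    if (setjmp(resume_point) == 0) {
        if (save_region(path) != 0)
            return -1;
        work_done = 99;               /* later progress, to be discarded */
        if (restore_region(path) != 0)
            return -1;
        longjmp(resume_point, 1);     /* put the stack back to a known state */
    }
    return work_done;                 /* 3 again after the reload */
}
```

Calling checkpoint_demo("/tmp/ckpt_demo.img") from main should return 3, since the store of 99 is overwritten when the image is read back. A real restart in a fresh process would also have to make sure the program is loaded at the same addresses as before, which this sketch does not attempt.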