Path: utzoo!mnetor!tmsoft!torsqnt!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!tut.cis.ohio-state.edu!ucbvax!dog.ee.lbl.gov!elf.ee.lbl.gov!torek
From: torek@elf.ee.lbl.gov (Chris Torek)
Newsgroups: comp.unix.wizards
Subject: Re: Why is restore so slow?
Message-ID: <10003@dog.ee.lbl.gov>
Date: 18 Feb 91 10:29:21 GMT
References: <50235@olivea.atc.olivetti.com> <2880@redstar.cs.qmw.ac.uk>
Reply-To: torek@elf.ee.lbl.gov (Chris Torek)
Organization: Lawrence Berkeley Laboratory, Berkeley
Lines: 47
X-Local-Date: Mon, 18 Feb 91 02:29:22 PST

In article <2880@redstar.cs.qmw.ac.uk> liam@cs.qmw.ac.uk
(William Roberts) writes:
>Restore suffers from the fact that files are stored in inode-number order:
>this is not the ideal order for creating files as it thrashes the namei-cache
>because the files are recreated randomly all over the place.

Well, no and yes.  While the files are indeed in inode order, and the
restore program (as opposed to the old `restor' program) does recreate
them in this order, the Fast File System tends to set things up so that
all the files in any one directory are in the same cylinder group as
that directory.  Depending on cylinder group sizes this may or may not
overload the name cache, since only the directory parts of the names
are cached (each trailing name is unique within its directory, but the
directory must be searched anyway to verify this first).

More important are two other facts:

 - Each directory must be scanned entirely (to make sure the name
   is unique);
 - Directory operations are synchronous.

The latter is usually the performance-killer since the directory blocks
tend to remain in the buffer cache.  Directory writes are done
synchronously to make crash recovery possible.  Ordered (but otherwise
delayed) writes should give the same effect with a much smaller
performance penalty; this is being investigated.

>/usr/spool/news/comp/unix/internals/5342 and this took an incredibly long time
>to restore.
>/usr/mail contains several hundred files but no subdirectories and
>restored in about the same sort of time as it took to dump.

The presence or absence of subdirectories is largely irrelevant: the
problem is the large number of files.  One big file restores much
faster than several dozen small files, even though both take the same
amount of space, because one big file equals one synchronous directory
write (preceded by one synchronous inode write) followed by many
asynchronous data writes.

If you do many full file system restores, it would probably be worth
your effort to make a kernel that does delayed writes for inode and
directory operations, and run it (or enable delayed writes on each file
system in question) each time you do such a restore.  If the system
crashes, you can just start over.
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab EE div (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov
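
The cost argument in the article can be put into a rough counting model.
The sketch below is not kernel or restore code; it only tallies the
writes and directory scans the article describes.  The function names
are hypothetical, and the model makes simplifying assumptions: each file
creation costs exactly one synchronous inode write and one synchronous
directory write, all files land in one initially empty directory, and
the `.' and `..' entries are ignored.

```c
#include <assert.h>

/*
 * Hypothetical counting model (an illustration, not kernel code).
 * Assumption taken from the article: creating a file costs one
 * synchronous inode write followed by one synchronous directory write.
 */
unsigned long sync_writes(unsigned long nfiles)
{
    return 2UL * nfiles;
}

/*
 * Data blocks written asynchronously for total_bytes of file data.
 * Assumption: the data is packed into whole blocks of block_size bytes,
 * so the count depends only on the total size, not on the file count.
 */
unsigned long async_data_writes(unsigned long total_bytes,
                                unsigned long block_size)
{
    assert(block_size > 0);
    return (total_bytes + block_size - 1) / block_size;
}

/*
 * Directory entries examined while creating nfiles files one after
 * another in a single, initially empty directory.  The k-th creation
 * scans the k-1 names already present to check that the new name is
 * unique, giving 0 + 1 + ... + (nfiles - 1) comparisons in total.
 * Assumption: the `.' and `..' entries are not counted.
 */
unsigned long dir_entries_scanned(unsigned long nfiles)
{
    if (nfiles == 0)
        return 0;
    return nfiles * (nfiles - 1) / 2;
}
```

Under these assumptions, 36 files of one 8192-byte block each need
sync_writes(36) = 72 synchronous writes, while a single file of the same
total size needs sync_writes(1) = 2; async_data_writes() returns 36 in
both cases.  Only the synchronous count grows with the number of files,
which is the point made above about small files.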