Path: utzoo!utgpu!news-server.csri.toronto.edu!clyde.concordia.ca!uunet!snorkelwacker!apple!amdcad!sun!snafu!lm
From: lm@snafu.Sun.COM (Larry McVoy)
Newsgroups: comp.arch
Subject: Re: Mixing paging and IO is inefficient (was Re: Compiler partions)
Message-ID: <138208@sun.Eng.Sun.COM>
Date: 2 Jul 90 03:58:51 GMT
References: <499@garth.UUCP> <5660@titcce.cc.titech.ac.jp> <137770@sun.Eng.Sun.COM> <1990Jun26.185232.3565@utzoo.uucp>
Sender: news@sun.Eng.Sun.COM
Reply-To: lm@sun.UUCP (Larry McVoy)
Organization: Sun Microsystems, Mountain View
Lines: 52

I said:
>>... The synchronous
>>nature of certain file system writes are *required* for file system
>>reliability.  Just so you understand: consider what happens when you create
>>a file.  You allocate an inode and add a directory entry.  Think about the
>>steps and the order of operations.  If you do it wrong, and the system
>>crashes, you leave dangling pointers...

And Henry said:
>The requirement here, however, is not that writes be done synchronously,
>but that certain constraints on the *order* of writes be preserved, so
>that the disk is always adequately consistent.

And Henry is correct.  A fairly straightforward way to improve performance
would be to make sure that certain writes were done in the proper order.

Henry goes on:
>Look at the
>Osterhout paper in the latest Usenix, comparing Sprite (delayed write)
>vs. NFS (synchronous write) performance, with performance factors of
>up to 50 in favor of Sprite.  Osterhout commented "almost every file
>you touch is a temporary file", and added that this applies to more
>than just /tmp, so it is well worth postponing most writes in hopes
>that the file will go away before you have to write it.

This is a concocted benchmark.  Ousterhout admitted as much.  This business
about "every file you touch is a temp file" is (A) not true and (B) a red
herring.  I'd like to see some data, taken from whatever is a "normal"
site, that shows the temp file stuff.
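
To make the "hope the file goes away first" argument concrete, here is a
minimal sketch of what a delayed-write policy can save.  Everything in it is
an assumption for illustration: the function name disk_writes, the 30 second
flush delay, and the workload list are hypothetical, not measurements from
Sprite, NFS, or any real site.

```python
def disk_writes(lifetimes, flush_delay=None):
    """Count how many files have their data reach the disk.

    lifetimes:   one entry per created file; seconds the file lived before
                 it was deleted, or None if it was never deleted.
    flush_delay: None models write-through (every file is written).
                 Otherwise dirty data is flushed this many seconds after
                 creation, so a file deleted before then is never written.
    """
    if flush_delay is None:
        return len(lifetimes)
    return sum(1 for life in lifetimes
               if life is None or life >= flush_delay)


# Hypothetical workload, NOT measured data:
# three short-lived temp files and five files that persist.
workload = [2, 5, 10, None, None, None, None, None]

write_through = disk_writes(workload)                 # every file hits disk
delayed = disk_writes(workload, flush_delay=30)       # assumed 30 s delay
print(write_through, delayed)
```

With this made-up mix, the delayed policy only avoids the writes for the
three files that die inside the delay window.  The saving scales with the
fraction of short-lived files, which is exactly the number that needs to be
measured rather than assumed.
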
My data indicates that some percentage of files created are temp, but
certainly not all, not even 50%.  It really depends on what you are doing.

Anyway, it's a moot point.  Only the control information is written
synchronously and that is such a minor part of what is going on that it's
not really worth optimizing.

Don't believe me, huh?  Well, I don't entirely either.  Here's the deal:
things like compiles are not bound by the file system.  A coworker was
writing up a paper on tmpfs recently and benchmarked a kernel build with
/tmp over tmpfs vs /tmp over ufs.  It came out to 30 seconds difference.
Over 45 minutes.  That is about 1%.  Big deal.

On the other hand, there are times when delaying makes a lot of sense.
"rm *", for example, results in a lot of repeated writes to the same
directory block.  This is pretty stupid and should be fixed.

Just don't believe everything that is implied in Ousterhout's paper.  You
can put in his file system and maybe that will make you feel better, but
your program won't run much faster, if at all.
---
Larry McVoy, Sun Microsystems     (415) 336-7627       ...!sun!lm or lm@sun.com