Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxn!ihnp4!houxm!whuxl!whuxlm!akgua!gatech!seismo!brl-smoke!smoke!speck@vlsi.caltech.edu
From: speck@vlsi.caltech.edu
Newsgroups: net.unix-wizards
Subject: rwhod, creat slowness
Message-ID: <789@brl-smoke.ARPA>
Date: Sun, 9-Feb-86 21:21:24 EST
Article-I.D.: brl-smok.789
Posted: Sun Feb 9 21:21:24 1986
Date-Received: Wed, 12-Feb-86 08:02:38 EST
Sender: news@brl-smoke.ARPA
Lines: 46

About a month ago I discovered that 80% of all disk I/O done on our
Suns was the single, simple line (in rwhod.c):

	whod = creat(path, 0666);

where path = "/usr/spool/rwho/rwhod.%s" (%s = hostname).

How could this innocent-looking line be such a hog?

  1) Each machine executed it 18 times per minute (we have 18 rwhod's
     running on one net)
  2) All those directories had to be looked up each time
  3) On Suns, /usr/spool is a symlink to /private/usr/spool, adding
     another 3 directories to be looked up
  4) On Suns, /usr and /usr/spool sit on a Network FileSystem.
     Sun's NFS has no caching in the clients; each lookup requires
     a server transaction over the network
  5) 14 Suns used the /usr network filesystem

As you can see, namei() was getting a LOT of overuse - on filenames
that were very repetitious. I can't help wondering if rwhod wasn't
the major contributor to the statistics that led Berkeley to implement
namei() caching for 4.3bsd. Did they check the contents of the cache
for suspicious correlations?

The fix for this (chdir to the rwho directory) is a lot simpler and
more efficient than namei caching - all of the directory traversals,
symlink lookups, and NFS activity simply GO AWAY.

But there's more:

  6) Once it's got the inode, creat() takes 30ms, mostly I/O time,
     just to truncate the file - taking four times as long as an
     open() - and most of the time the file is going to be written
     to the same size as it was before.
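
The chdir fix mentioned above can be sketched roughly as follows. This is a minimal illustration written in present-day C, not code taken from rwhod.c: the helper names enter_spool_dir() and open_status_file() are hypothetical, and the file name pattern "rwhod.%s" simply follows the path quoted in this article. The idea is that the long path is resolved once at startup, so each later creat() only has to look up a single name in the current directory.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical startup step: resolve the long spool path exactly once.
 * Every later lookup is relative to this directory. */
int enter_spool_dir(const char *dir)
{
    return chdir(dir);
}

/* Hypothetical per-packet step: create the status file by its short,
 * relative name. Assumes enter_spool_dir() has already succeeded.
 * Returns a file descriptor, or -1 on error. */
int open_status_file(const char *hostname)
{
    char name[256];
    int n = snprintf(name, sizeof name, "rwhod.%s", hostname);

    if (n < 0 || (size_t)n >= sizeof name)
        return -1;              /* hostname too long for the buffer */
    return creat(name, 0666);   /* one directory lookup, no symlinks */
}
```

A daemon would call enter_spool_dir("/usr/spool/rwho") once before its main loop and open_status_file() for each received packet. This removes the repeated path traversal only; the truncation cost described in point 6 is still paid on every creat().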

Why is creat(), probably one of the top 10 system calls, so slow on
4.2bsd systems? Why is ftruncate just as slow - and why does it still
take 30ms even if the file is already the correct size? Apparently
these system calls do *synchronous* I/O, ignoring the buffer cache
(even on plain VAX 4.2bsd, without any NFS clouding the issue).

Has Berkeley accomplished nothing in their meddling with the
filesystem?

Don Speck   seismo!cit-vax!speck  or  speck@vlsi.caltech.edu
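
One way a program could sidestep the truncation cost described in this article is to open the existing file without truncating it, overwrite it in place, and call ftruncate() only when the new contents are shorter than the old. The sketch below is an assumption about how that might look in present-day C, not something rwhod is known to do; rewrite_in_place() is a hypothetical name, and whether it actually saves time depends on the system.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite a file from offset 0 without
 * truncating it first. ftruncate() is called only if the file
 * was previously longer than the new contents.
 * Returns 0 on success, -1 on error. */
int rewrite_in_place(const char *name, const void *buf, size_t len)
{
    struct stat st;
    int fd = open(name, O_WRONLY | O_CREAT, 0666);  /* no O_TRUNC */

    if (fd < 0)
        return -1;
    if (write(fd, buf, len) != (ssize_t)len || fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    /* Shrink only when stale bytes remain past the new end. */
    if (st.st_size > (off_t)len && ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

When the file is rewritten to the same size, as the article says is the usual case, this path never truncates at all. The trade-off is that a reader opening the file in mid-update may see a mix of old and new contents.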