Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!cmcl2!rutgers!ames!elroy!mahendo!jplgodo!wlbr!scgvaxd!trwrb!felix!martin
From: martin@felix.UUCP (Martin McKendry)
Newsgroups: comp.arch,comp.os.misc
Subject: Re: Unix File System Performance
Message-ID: <7327@felix.UUCP>
Date: Mon, 14-Sep-87 17:17:54 EDT
Article-I.D.: felix.7327
Posted: Mon Sep 14 17:17:54 1987
Date-Received: Fri, 18-Sep-87 07:06:40 EDT
References: <1384@faline.bellcore.com>
Sender: daemon@felix.UUCP
Reply-To: martin@felix.UUCP (Martin McKendry)
Organization: FileNet Corp., Costa Mesa, CA
Lines: 72
Xref: mnetor comp.arch:2194 comp.os.misc:180

In article <1384@faline.bellcore.com> hammond@faline.UUCP (Rich A. Hammond) writes:
>In article I wrote:
>>
>> ... A
>>very high proportion of programs today are I/O bound -- a proportion
>>that will increase as we get faster processors.  It seems to me that
>>filesystem performance is the next big area for competition.  After
>>all, that's what makes a mainframe a mainframe, right?
>>
>>Comments?
>
>About 98% of the programs run on our systems use <2 secs of 780 CPU time,
>nor do they use very much I/O.  There are only a few I/O or CPU hogs.
>That's based on ~10 million process records using modified 4.2 BSD
>accounting.  Where do you get the idea that a high proportion are I/O bound?
>

From extensive workload analysis.  Try running a make.  Look at your
CPU utilization.  If it's not 100%, you are waiting on I/O when you
could be processing.  Depending on how much you like to wait on I/O,
you are I/O bound.

Looking at a single 780 is hardly representative of the world.  Most
of the world's data processing is production commercial data
processing.  We do image processing.  Don't assume that your load is
everyone's.

In a previous life, I worked with extensive analyses of commercial
customer workloads taken from real customer sites.
Based on simulation results and real benchmarks, we found that you
could change CPU performance by large factors (2-5) in either
direction without seeing anything like the same change in throughput
(total time to run the benchmark).  For example, a CPU 4 or 5 times
faster gave only a factor of 2 change in throughput.  Idle time on the
faster CPU goes up, as expected.  This was on batch processing with no
terminal I/O.  If that's not I/O bound, I don't know what is.  Since
CPU speed/$ is improving at a faster rate than the corresponding
figure for disk, I'd expect the class of problems for which this
occurs to increase.

>On machines with 32+ MB of memory, I'm willing to bet that a large
>proportion of all accesses are satisfied from the in core buffers,
>i.e. my edit compile run edit cycles probably all run out of the in
>core buffers once I've completed a cycle.  If the system were smart
>enough to use all of memory as disk buffer rather than 10% of it, I'm
>certain that my stuff would just stay in core.
>

What if I want to support 400 users from one server, each of whom
wants 50Kb of data every 15 seconds?  Or if I have to process/merge
two or three 60Mb data files?  What if I don't want to ship 32 MB on
all machines?

>I'll agree that file system performance could be improved, but I'm
>inclined to believe that improving the use of main memory as buffers
>would be a bigger general win than any changes to the disk layout.
>

What if I am planning to do both, and the incremental costs are
worth it?

>Does anybody have records for their general use systems that prove
                                     ***********
By whose definition?

>that the systems are I/O bound?  I want at least a continuous month's
>worth of records, no one or two day or "peak" samples.
                                        *****
Why not?  Often it's the peaks I want to handle.  I can already handle
the regular loads.

I don't care for your tone.  I don't think my posting warranted it.

>Rich Hammond	Bell Communications Research	hammond@bellcore.com

--
Martin S. McKendry;  FileNet Corp;  {hplabs,trwrb}!felix!martin
Strictly my opinion; all of it
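
The benchmark figures quoted above (a CPU 4 or 5 times faster giving
only a factor of 2 in throughput) can be turned into a rough I/O share
with a little arithmetic.  The sketch below is illustrative only and
is not the analysis behind those figures.  It assumes that run time is
simply CPU time plus I/O wait with no overlap, and that a faster CPU
shrinks only the CPU part.  The function names are hypothetical.

```python
def io_fraction(cpu_speedup, throughput_gain):
    """Share of the ORIGINAL run time spent waiting on I/O.

    Assumption (not from the post): run time = cpu + io, no overlap,
    and the faster CPU divides only the cpu term by cpu_speedup.
    """
    s, g = float(cpu_speedup), float(throughput_gain)
    if s <= 1.0 or not (1.0 <= g <= s):
        raise ValueError("need cpu_speedup > 1 and 1 <= throughput_gain <= cpu_speedup")
    # Normalise the original run time to 1:   cpu + io     = 1
    # After the CPU upgrade:                  cpu / s + io = 1 / g
    # Eliminating cpu gives io = (s / g - 1) / (s - 1).
    return (s / g - 1.0) / (s - 1.0)


def io_fraction_after(cpu_speedup, throughput_gain):
    """Share of the NEW (shorter) run time spent waiting on I/O."""
    # The I/O wait is unchanged, but total time shrank to 1 / g,
    # so the share grows by a factor of g.
    return io_fraction(cpu_speedup, throughput_gain) * throughput_gain


if __name__ == "__main__":
    for s in (4, 5):
        before = io_fraction(s, 2)
        after = io_fraction_after(s, 2)
        print(f"CPU x{s}, throughput x2: I/O share {before:.3f} before, {after:.3f} after")
```

Under this model the original machine spends roughly a third of its
time on I/O, and the faster one spends two thirds to three quarters of
its time there.  Real systems overlap CPU and I/O, so treat these
numbers as a rough guide rather than a measurement.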