Path: utzoo!attcan!uunet!snorkelwacker!usc!cs.utexas.edu!sun-barr!newstop!sun!opus!gingell
From: gingell%opus@Sun.COM (Rob Gingell)
Newsgroups: comp.arch
Subject: mmap() vs. read() (Was: Re: the Multics from the black lagoon :-))
Message-ID: <131682@sun.Eng.Sun.COM>
Date: 12 Feb 90 17:35:13 GMT
References: <8859@portia.Stanford.EDU> <20571@watdragon.waterloo.edu> <1990Feb12.053616.11455@Solbourne.COM> <3556@rti.UUCP> <10468@alice.UUCP>
Sender: news@sun.Eng.Sun.COM
Reply-To: gingell@sun.UUCP (Rob Gingell)
Organization: Sun Microsystems, Mountain View
Lines: 122

In article <3556@rti.UUCP> trt@rti.UUCP (Thomas Truscott) writes:
>[someone wrote a replacement for "sum", called "fastsum" that uses mmap().]
>
>You are comparing your efficient "fastsum" that happens to use mmap()
>against a sluggardly "sum" that happens to use read().
>(Actually it uses getchar(), which calls _filbuf(),
>maybe _filbuf() uses mmap()?!)

As it happens, no. This is always a potential change; however, we have
not made it because to date we have not found that stdio would benefit
from it -- the principal advantage would be to save buffer copy time
and memory loading, but we haven't found a large population of programs
where these factors are dominant. Perhaps it is because our stdio is so
otherwise inefficient, perhaps it is because the applications themselves
are inherently not I/O buffer copy limited, or perhaps simply because
those programs that were already so limited long ago converted to direct
read()/write() operations.

>The following would be a more appropriate test:
>Change your fastsum routine so that instead of mmap()ing
>a megabyte at a time, it does a read() of a megabyte at a time.
>Compare the mmap() and read() versions of this program.
>I suspect you will find they take about the same amount of time.

I don't think so.
At the very least, the read() version will be slower than the
mmap() version by the amount of time required to effect the copies from
kernel to program buffers. And this assumes an "optimum" situation in
which the overhead of buffer management in the kernel does not become
significant -- which it will for a large amount of data. And it ignores
the system effects of essentially doubling the memory load on the system
for both the original file pages and the pages used to buffer the copies
in the application.

>On a Sparcstation 1, try timing "cp" vs. the following program:
>
>    main()
>    {
>        char bfr[8192];
>        register int n;
>
>        while ((n = read(0, bfr, sizeof(bfr))) > 0)
>            write(1, bfr, n);
>    }
>
>I did "/bin/time cp /vmunix /tmp/x"
>and "/bin/time a.out < /vmunix > /tmp/x" several times.
>The results were essentially identical.
>(I did not experiment with buffer sizes, I suspect 16k would be faster.)

I'd be astonished if the results did not always show that access through
mmap() is faster (and they do for this program running on my 3/160.) To
be a valid experiment, you should be sure that both /vmunix and /tmp/x
are completely flushed from memory after each test run -- otherwise the
system's buffering of the two files will skew the results. I've never
observed a proper experiment in which mmap() was not faster, though the
difference is not always large.

>There is no inherent reason that read() should be slower
>than mmap() for sequential I/O, since read() is doing precisely
>what is wanted. Indeed read() should be faster since
>it is conceptually simpler.

Not true. read() operates by mmap()ing the file and copying it. And, due
to limitations in the address space available inside the kernel, read()
must often perform more, smaller "mmap()-like" chunk operations than a
single application mmap() could use, using even more CPU time in the
process.
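
The "map it, then copy it" description above can be pictured with a
user-level sketch. This is not kernel code, only an illustration of the
shape of the work: read_by_mapping() and WINDOW_PAGES are hypothetical
names invented here, and the 16-page window is an assumed stand-in for
the limited kernel address space mentioned above. It assumes a system
with sysconf(_SC_PAGESIZE) and an mmap() that accepts a NULL address
hint.

```c
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Illustration only: a user-level stand-in for "read() = map + copy".
 * read_by_mapping and WINDOW_PAGES are hypothetical names.
 */
#define WINDOW_PAGES 16   /* assumed cap on how much is mapped at once */

/*
 * Copy up to len bytes, starting at file offset off, into buf.
 * Returns the number of bytes copied (0 at end of file), or -1 on error.
 */
ssize_t read_by_mapping(int fd, void *buf, size_t len, off_t off)
{
    struct stat st;
    long page = sysconf(_SC_PAGESIZE);
    size_t window, done = 0;
    off_t left;

    if (page <= 0 || off < 0 || fstat(fd, &st) < 0)
        return -1;
    if (off >= st.st_size)
        return 0;                       /* at or past end of file */
    left = st.st_size - off;
    if ((unsigned long long)len > (unsigned long long)left)
        len = (size_t)left;             /* never touch pages past EOF */
    window = (size_t)page * WINDOW_PAGES;

    while (done < len) {
        off_t pos = off + (off_t)done;
        off_t base = pos - pos % page;  /* mmap() offsets must be page aligned */
        size_t skip = (size_t)(pos - base);
        size_t want = len - done;
        char *win;

        if (want > window - skip)
            want = window - skip;       /* one small window per iteration */
        win = mmap(NULL, skip + want, PROT_READ, MAP_SHARED, fd, base);
        if (win == MAP_FAILED)
            return -1;
        memcpy((char *)buf + done, win + skip, want);  /* the extra copy */
        munmap(win, skip + want);
        done += want;
    }
    return (ssize_t)done;
}
```

A program that calls mmap() itself can map the whole range once and use
the data in place; the loop above pays for one map/unmap pair per window
plus a memcpy() of every byte, which is the cost being argued about.
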

>Note that read() can be implemented with memory mapping, in some cases:
>it could map the address of "bfr" to a copy-on-modify kernel page.

This is also not true, though it is a common belief and one that arose
repeatedly during development. read() gives you a copy of the file data
at the time that the call is executed. That copy is immutable save any
action performed by your program. If read() were implemented *as*
mmap(), then while it is possible to deal with side effects introduced
in *your* machine, it is not, in general, possible to deal with side
effects introduced in other machines -- such as file modifications
performed by DOS PCs living in your network. It might be possible to
make such an assumption save for heterogeneous environments. However, it
should be noted that neither MULTICS nor TENEX/TOPS-20 (the latter being
the more direct parent of mmap(), with MULTICS as a more remote
ancestor) attempted such an optimization either.

>As others have pointed out, read() and write() are generally useful
>on streams, and mmap() is not.
>(The SunOS "cp" command falls back to read/write if mmap() fails.
>But since read/write is as fast as mmap(),
>why bother with mmap() in the first place?!)
>
>So what is mmap() good for? Plenty.
>But it is NOT a replacement for read/write.

Nor is it advertised as such. Though Mr. Truscott has not done so, those
deprecating mmap() for not being "device independent" or lacking other
attributes of read()/write() miss the point -- which was never that
mmap() replace read() or write() or otherwise represent some "grail" in
the search for computing enlightenment. Rather it was to provide an
abstraction of operations in which the system was already engaged
(namely file buffering and physical store multiplexing) in a way that
was accessible to applications and which can increase their flexibility.
A good test of the sufficiency of such an abstraction is that it is
capable of becoming a primitive which you can use to replace older and
various implementations with a common framework -- and in this we
believe mmap() to have been a success.

We also believe it to be an effective abstraction for those requiring
its properties. But we do not believe that everyone requires them, for
mmap() is certainly a "lower-level" abstraction than read()/write(), a
primitive out of which the latter can be constructed on memory objects
in the same way device drivers provide a primitive for transfer
operations.

Because mmap() is *more* primitive than read()/write(), it can be (as
Dennis Ritchie points out) more cumbersome to use than the equivalent
sequence of read() or write() -- but so would access to raw devices. If
you're programming around it, it's probably an indication that operating
at this level of the system isn't suitable for your needs, and you
should use the higher abstractions. The fact that the system supplies an
abstraction that isn't suitable for your use does not lessen the fact
that it is an effective abstraction for others, as well as an effective
one for the system to use in the implementation of abstractions that
*are* appropriate for your use.

It's been my experience that most frustrations in the use of memory
mapping techniques in MULTICS, TENEX/TOPS-20, and now with mmap() have
come from the expectation that somehow mmap() was a higher-level
operation than it really is.
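
As a concrete picture of using the lower-level primitive where it fits
and the higher abstraction where it doesn't, here is a minimal sketch of
a "fastsum"-style routine that maps a regular file and falls back to
read() for anything else (pipes, terminals, a failed mmap()), much as
the quoted remark says the SunOS "cp" does. The names sum_fd() and
add_bytes() are hypothetical, the plain byte sum is a toy and not the
algorithm used by sum(1), and nothing here is taken from the actual
fastsum or cp sources.

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Toy checksum: add up the bytes. NOT the algorithm used by sum(1). */
static unsigned long add_bytes(unsigned long acc, const unsigned char *p,
                               size_t n)
{
    while (n-- > 0)
        acc += *p++;
    return acc;
}

/*
 * Sum the bytes behind fd. A non-empty regular file is mapped and summed
 * in place, from offset 0; anything else is read() from the current
 * position through a buffer. *used_mmap reports which path ran.
 * Returns 0 on success, -1 on a read error.
 */
int sum_fd(int fd, unsigned long *sum, int *used_mmap)
{
    struct stat st;
    unsigned char buf[8192];
    ssize_t n;

    *sum = 0;
    *used_mmap = 0;
    if (fstat(fd, &st) == 0 && S_ISREG(st.st_mode) && st.st_size > 0) {
        void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED,
                       fd, 0);
        if (p != MAP_FAILED) {
            *sum = add_bytes(0, p, (size_t)st.st_size);  /* no copy made */
            munmap(p, (size_t)st.st_size);
            *used_mmap = 1;
            return 0;
        }
    }
    /* Fallback for streams, empty files, or a failed mmap(). */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        *sum = add_bytes(*sum, buf, (size_t)n);
    return n < 0 ? -1 : 0;
}
```

The caller never needs to know which path ran; used_mmap exists only so
the two paths can be told apart when timing them.
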