Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!apple!sun-barr!texsun!texbell!killer!elg
From: elg@killer.DALLAS.TX.US (Eric Green)
Newsgroups: comp.arch
Subject: Re: DMA on RISC-based systems
Message-ID: <8327@killer.DALLAS.TX.US>
Date: 10 Jun 89 00:18:27 GMT
References: <26636@ames.arc.nasa.gov>
Organization: The Unix(R) Connection, Dallas, Texas
Lines: 62

in article <26636@ames.arc.nasa.gov>, lamaster@ames.arc.nasa.gov (Hugh LaMaster) says:
>>I have performed a small test on a DECstation3100 with a RZ55-230 Mb disk.
>
>>Write 15x2 Mb: 113 s, 265 kb/s
>>Read 15x2 Mb: 117 s, 256 kb/s
>>Read 15x2 + write 15x2 Mb (new and in parallel): 281 s, 213 kb/s
>>Mean value: 234 kb/s

Note that this is probably not an accurate account of disk drive
bandwidth at all. Unix (at least older AT&T versions) DMAs the data
into the disk cache, then has the CPU manually copy it into the user's
own buffer. With a plain-jane ST157N and a non-DMA SCSI controller
pushed by a plain old 8 MHz 68000, I get 550K/second (at least until my
disk gets fragmented). And there are still visible pauses where the
68000 takes a while to digest the data. Another (DMA) disk controller
gets 650K/second out of the same disk drive (of course, a 68020 or
faster processor wouldn't have run out of steam like my 68000, so this
isn't really an argument that DMA is better than CPU-driven I/O).

Strangely enough, I have never seen anything on preferential caching
schemes for file systems. You'd want to cache small I/O requests, as is
currently done... but what about the scientific types who want to
stream in a few megabytes of data, crunch on it, then stream it back
out -- as fast as possible? That'd blow any reasonable cache to pieces.
You'd want to DMA it straight into the user's memory. Or even use
CPU-driven I/O straight into the user's memory... you'd still come out
at least as well as the traditional DMA-it-to-cache-then-copy-it.

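To make the preferential caching idea concrete, here is a rough Python
sketch of the read path. It is only a toy model, not how any existing
Unix is known to work. Everything named in it is made up for the
example: BufferCache, read_request, BLOCK_SIZE, and BYPASS_THRESHOLD
are hypothetical, the 64K cutoff is arbitrary, and the "disk" is just a
list of blocks. It models reads only and ignores writes and dirty
blocks entirely.

```python
from collections import OrderedDict

BLOCK_SIZE = 512               # bytes per disk block (assumed for the sketch)
BYPASS_THRESHOLD = 64 * 1024   # requests this big or bigger skip the cache (arbitrary)


class BufferCache:
    """Tiny LRU block cache standing in for the kernel's disk cache."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # block number -> block contents

    def get(self, blkno):
        data = self.blocks.get(blkno)
        if data is not None:
            self.blocks.move_to_end(blkno)  # mark as most recently used
        return data

    def put(self, blkno, data):
        self.blocks[blkno] = data
        self.blocks.move_to_end(blkno)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used


def read_request(disk, cache, start_blk, nblocks, user_buf):
    """Fill user_buf from disk; return which path was taken."""
    if nblocks * BLOCK_SIZE >= BYPASS_THRESHOLD:
        # Large transfer: move data straight into the user's buffer and
        # leave the cache alone, so small hot files are not evicted.
        for i in range(nblocks):
            user_buf[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE] = disk[start_blk + i]
        return "direct"

    # Small transfer: go through the cache, then copy to the user's buffer.
    for i in range(nblocks):
        blkno = start_blk + i
        data = cache.get(blkno)
        if data is None:
            data = disk[blkno]
            cache.put(blkno, data)
        user_buf[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE] = data
    return "cached"
```

With a 16-block cache, a 2-block read goes through the cache, while a
256-block (128K) read goes direct and leaves the two cached blocks
where they were. A real implementation would also have to deal with
blocks that are dirty in the cache when a large request bypasses it.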
Thinking on it a bit, seems you'd want to cache only small I/O requests
that don't overwhelm the amount of cache you have, while DMA'ing large
I/O requests straight into the user's memory ASAP. That way crontab,
whotab, and other small files hit fairly often would stay cached
longer.

An interesting problem... I suppose it irritates the designers of these
disk subsystems that all their beautiful bandwidth is chewed to shreds
by OS overhead.

> On
> mainframes, I have seen single applications which *averaged* 3 MB/sec on
> 4.5 MB/sec channels on 8 simultaneous data streams.

Which particular mainframes? Sounds like something a Cray could do...
very little overhead there at all (don't have to cope with memory
protection, can DMA straight into the user's data space without
worrying about how "real" memory maps into the user's "virtual" memory,
etc.). Sounds to me like another speed reason for Crays to not have
virtual memory :-) (for the old veterans of past comp.arch
discussions). Have to consider all aspects of the architecture,
including disk subsystem performance, not just what it looks like from
a user or CPU point of view.

> So, the ratios quoted seem reasonable to me.

Yes, seems reasonable to me too. But somewhat sad, considering the
performance that the hardware is capable of.

--
Eric Lee Green              P.O. Box 92191, Lafayette, LA 70509
..!{ames,decwrl,mit-eddie,osu-cis}!killer!elg     (318)989-9849
"I have seen or heard 'designer of the 68000' attached to so many names that
 I can only guess that the 68000 was produced by Cecil B. DeMille." -- Bcase