Path: utzoo!utgpu!watmath!watdragon!rose!ccplumb
From: ccplumb@rose.waterloo.edu (Colin Plumb)
Newsgroups: comp.arch
Subject: Slow SCSI
Keywords: Peripheral Controllers, Gather, Scatter, I/O Architecture
Message-ID: <18292@watdragon.waterloo.edu>
Date: 17 Nov 89 17:35:54 GMT
References: <35985@ames.arc.nasa.gov>
Sender: daemon@watdragon.waterloo.edu
Reply-To: ccplumb@rose.waterloo.edu (Colin Plumb)
Organization: U. of Waterloo, Ontario
Lines: 72

In article <35985@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes:
> 1) Why are SCSI disk subsystems *so* slow.  On sequential reads, ~300 KB/sec.
> is typical.  The filesystems and all hardware interfaces should be able to
> easily sustain 3X as much on sequential reads of large, unfragmented files.
> (Which SCSI disk subsystems you ask?  I would rather turn it around, and say,
> are there any exceptions to the above sweeping generalization?  :-)  I use
> SMD disks as a baseline...)

It's a common question.  I worked on the design of one SCSI subsystem where
we benchmarked a few local machines (VAX, Sequent) and got quite awful
performance figures.  250K/sec read/write average or so.

Of course, all the Amiga fans out there get to repeat what they said a month
or two ago and point out that a 7.14 MHz 68000 somehow manages to leave a
sun 4/xxx in its dust, in this one area at least.  800 KB/sec is typical for
contiguous files with decent hardware.

One common reason is that the file systems are bloody awful.  The BSD "Fast
File System" will get, on a good day, 25% of the available disk bandwidth.
A lot of it is just fragmentation, doing reads a block at a time (so you
have to turn around the SCSI bus and issue a new command to get the next 8K,
during which time, you might just miss the next sector), and copying in the
buffer cache.  Copying large expanses of bytes around is just stupid if it
can be avoided.

Did you have a look at the system time for copying a 1 meg file to /dev/null?
On the 4.3 BSD microvax I'm on, it's 0.7 seconds.  (5 seconds real, but the
load average is 2.7 and I shouldn't be reading news.)  Why on earth does it
take 700,000 instructions to find 256 4K blocks?  I assume "cp" and /dev/null
aren't doing anything grotesquely stupid, but I may be wrong... anyway,
*something* takes far too long.

> 2) Is the reason lack of gather/scatter on (inexpensive) SCSI controllers?

It's complicated.  Requiring that blocks be above a certain size helps, and
most systems just want the page size anyway, but gather/scatter adds
complexity to the incrementer usually hooked up to the address bus.  I like
it, however.  It saves the system architect from having to subvert the page
allocator to get contiguous physical pages for a long data transfer (tricky
on writes when you don't know beforehand that the user will be writing from
that memory), do lots of data copying, or issue I/O requests a page at a
time.

> 3) Are there implementation limitations of the new RISC-based systems which
> make I/O cost more than on older systems?  (e.g. VAX or 68K).  Is there
> something about SCSI in particular that is a problem?

No, it's just that the processors cost *less*, so I/O cost gets squeezed the
same amount, but the high-volume chips that enable performance at low cost
aren't as easy to find.  Also, people have discussed the microcomputer
origins of a lot of RISC designs, while good I/O subsystems are the realm of
Big Iron and all things IBM-ish.

Asynchronous SCSI isn't blindingly fast, but it seems to be faster than the
bit clocks on a lot of small (300 MB and under - isn't it wonderful how
terms change?) 5.25" hard drives.  It's true that SCSI hardware also has a
microcomputer heritage, so people rave about how much faster it is than
ST506 and don't bother to notice how much slower it is than an IBM channel.

> 5) (Not really a comp.arch question, but related to the above - aside:
> Does anyone make synchronous SCSI disks which really perform?)

I don't know.
If you find one, tell me.  I want something that will feed me a track at
2.5MB/sec.  This is related to the above in that a lot of hard drives don't
run their bit clocks at over 15 MHz, making it tricky to get 2MB/sec out of
them...

Otherwise, I have to play with RAID.  An array of cheap disks is one nice
idea, like the CM's data vault.  Want 100 MB/sec?  Get 100 SCSI drives and
run them in parallel.
--
	-Colin
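
P.S.  The bit-clock arithmetic above can be checked with a few lines of
Python.  This is only a rough sketch under stated assumptions: one data bit
per bit-clock cycle, no allowance for sector headers, gaps or ECC (so real
drives will do worse), and 1 MB taken as 10**6 bytes.  The function names are
made up for illustration, and the 1 MB/sec-per-drive figure is merely what
"100 drives for 100 MB/sec" implies, not a measurement.

```python
import math


def media_rate_mb_per_sec(bit_clock_mhz, efficiency=1.0):
    """Upper bound on the rate off the platter, in MB/sec (1 MB = 10**6 bytes).

    Assumes one data bit per bit-clock cycle; 'efficiency' is a hypothetical
    fudge factor for formatting overhead (1.0 means no overhead at all).
    """
    return bit_clock_mhz * efficiency / 8


def bit_clock_needed_mhz(target_mb_per_sec, efficiency=1.0):
    """Bit clock (MHz) needed to sustain the target rate off the platter."""
    return target_mb_per_sec * 8 / efficiency


def drives_needed(target_mb_per_sec, per_drive_mb_per_sec):
    """Number of drives to run in parallel, assuming perfect striping."""
    return math.ceil(target_mb_per_sec / per_drive_mb_per_sec)


if __name__ == "__main__":
    # A 15 MHz bit clock tops out just under 2 MB/sec.
    print(media_rate_mb_per_sec(15))
    # Bit clock needed to feed a track at 2.5 MB/sec.
    print(bit_clock_needed_mhz(2.5))
    # Drives needed for 100 MB/sec at an assumed 1 MB/sec each.
    print(drives_needed(100, 1.0))
```

With those assumptions a 15 MHz bit clock gives at most 1.875 MB/sec, which
is why 2 MB/sec is tricky and 2.5 MB/sec wants something like a 20 MHz clock.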