Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!julius.cs.uiuc.edu!rpi!uwm.edu!rutgers!cbmvax!daveh
From: daveh@cbmvax.commodore.com (Dave Haynie)
Newsgroups: comp.sys.amiga
Subject: Re: Re: A3000UX Seems Fated (Kill file alert!)
Message-ID: <17189@cbmvax.commodore.com>
Date: 7 Jan 91 20:03:19 GMT
References: <139982@pyramid.pyramid.com>
Reply-To: daveh@cbmvax.commodore.com (Dave Haynie)
Organization: Commodore, West Chester, PA
Lines: 117

In article <139982@pyramid.pyramid.com> telam@pyrps5.pyramid.com (Thomas Elam) writes:
>In article <17084@cbmvax.commodore.com> daveh@cbmvax.commodore.com (Dave Haynie) writes:

>>That's hardly the only issue, though, if you're also considering running
>>something like UNIX on this system. Assume a basic 1.5 MB/s SCSI drive. On
>>the ISA bus, any PC is going to spend at least 40% of any given transfer
>>period fetching data from the SCSI bus. That's assuming a fully buffered,
>>interrupt driven SCSI controller that funnels the 8 bit SCSI data into 16 bit
>>data for the ISA bus transfer.

>Why is this, Dave? Because you are talking about a non-DMA design?
>Could the ISA bus-based PC spend less than the 40% by using a DMA design?

That's a rough figure based on what most PCs actually do these days. They
use the host CPU, which is typically between 8MHz and 33MHz and has fast
memory to talk to. At the high end, there is no real difference between DMA
and non-DMA as long as the ISA bus is run at the standard 4MHz.

You calculate the effective bandwidth in terms of bus crossings. For a true
DMA device, you have one bus crossing per word, so that's an effective 4MB/s
over the ISA bus. A 1.5 MB/s SCSI device therefore takes 37.5% of the
available bus bandwidth (1.5 / 4), plus you add in a little CPU time anyway
to set up and service the DMA, so I rounded it up to 40%.

Real DMA on the ISA bus is weird, and apparently there are lots of problems
with it.
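
As a rough check of the bus bandwidth arithmetic, here is a minimal sketch in Python. The function name is made up for illustration, and the model assumes ideal DMA (one bus crossing per word, no wait states); the 1.5 MB/s and 4 MB/s figures are the ones quoted in the text, and the extra CPU time for DMA setup is deliberately left out.

```python
def bus_fraction(device_mb_per_s: float, bus_mb_per_s: float) -> float:
    """Fraction of bus bandwidth consumed while the device streams data.

    Assumes ideal DMA: one bus crossing per word and no wait states.
    """
    if bus_mb_per_s <= 0:
        raise ValueError("bus bandwidth must be positive")
    return device_mb_per_s / bus_mb_per_s


if __name__ == "__main__":
    scsi_rate = 1.5  # MB/s, the "basic" SCSI drive from the text
    isa_rate = 4.0   # MB/s, effective ISA bandwidth quoted in the text
    print(f"{bus_fraction(scsi_rate, isa_rate):.1%}")  # prints 37.5%
```

Rounding the result up to 40% then allows for the DMA setup and service time that this sketch ignores.
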
I don't know all the details; some companies use it, some don't, and a few
both use it and warn against using it (Chips and Technologies seems to fall
into this last category).

In any case, a fast 32 bit processor won't do much worse with a programmed
I/O SCSI device, assuming of course the controller buffers and funnels the
8 bit SCSI data into 16 bit ISA bus data. Using a good block move operation,
the '386 will get very close to the ideal of 1 bus crossing per word you get
with DMA. It'll read two 16 bit words over the ISA bus, then write one 32 bit
word to its very fast memory on its private 32 bit bus. You can't get any
better than the 37.5% of the ISA bus with ideal DMA, but you can get close.

Exactly the same principles apply on the Amiga -- DMA devices are all under
50% of the bus bandwidth, but on a 68000 based system, non-DMA devices can
come too close to using all 100% of the bus bandwidth. Replace the 68000
with a fast 68030 with 32 bit memory, and all of a sudden, the difference
between true Zorro II DMA and programmed I/O drops to near insignificance.
That is, until you add in real 32 bit DMA on a faster bus, which is just
what we did in the A3000. The same thing happens when you use EISA, MCA, or
a private 32 bit bus for DMA on a PC.

You can grab DiskSpeed 3.1, set it up for CPU intensive operation or run a
CPU time monitor as a separate task, and see the effect of this, spelled out
in black and white (or color) for you, on different Amiga systems. And
you'll also notice the difference in common use.

Unfortunately, under MS-DOS, as long as you keep up with the disk drive, you
aren't going to see any difference between a good and bad disk interface.
In fact, it's possible to make a bad one that goes faster than a good one,
simply by running pure polled I/O rather than using interrupts and hardware
buffering (some companies do this on the Amiga too -- I call these
"parasitic" controllers).
So most companies don't supply you with a good hard disk interface on a
PClone, simply because the benefit of the additional cost and effort doesn't
show up until you add UNIX or OS/2 or some real OS.

>>If you were just running MS-DOS or something, the two approaches would give
>>you the same performance,

>Because MS-DOS (or at least the non-interrupt-driven processing) just
>initiates the transfer then just spins in a loop waiting for the
>transfer to complete? Is that the way the PC's BIOS works?

I don't know, but that's not the reason. The reason is that you're single
tasking. On the Amiga, even if a single I/O operation is done synchronously,
causing that particular task to block for the duration of the I/O operation,
other tasks are free to run, and you'll generally notice the overall
performance increase. If all your other tasks froze for a second or two
whenever a disk operation happened, you'd notice it, believe me.

On a single tasking system, you don't have the option of asynchronous I/O,
of course, and since the one task must block until the data it needs is
loaded, the perceived performance of the whole system, same as that of the
one task, is based on the disk speed, not the transfer efficiency. Macs, at
least up until the IIfx, work much the same way, concentrating on single
task efficiency.

>If so, I guess a getc() (get character) function would have to wait for a
>whole block to be read before returning a single character, correct?

I would expect so.

>> but the former is going to
>>drastically bog down under UNIX, especially if you get into much paging
>>activity.

>Because it couldn't do its normal thing of scheduling another process,
>right?

Well, in any multitasking OS, one task blocking for I/O should, if anything,
free up CPU time for the other competing tasks. If your whole system blocks,
then for the time it takes the I/O operation to complete, you are
effectively a single tasking system.
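
To put the blocking argument in rough numbers, here is a minimal Python sketch. The function and parameter names are hypothetical. The model simply treats the transfer as costing a fixed share of CPU time, and the 6% overhead used for the DMA case is an assumed figure chosen for illustration.

```python
def instructions_during_io(mips: float, io_seconds: float,
                           cpu_share_used_by_io: float) -> int:
    """Instructions left for other tasks while one I/O operation runs.

    cpu_share_used_by_io is the fraction of CPU time the transfer itself
    costs: 1.0 models a fully blocking (polled) system, and a small value
    models DMA with a little setup and interrupt overhead.
    """
    if not 0.0 <= cpu_share_used_by_io <= 1.0:
        raise ValueError("cpu_share_used_by_io must be between 0 and 1")
    free_share = 1.0 - cpu_share_used_by_io
    return round(mips * 1_000_000 * io_seconds * free_share)


if __name__ == "__main__":
    # Fully blocking system: nothing else runs during the transfer.
    print(instructions_during_io(6, 1.0, 1.0))
    # DMA system, assuming the transfer costs 6% of the CPU.
    print(instructions_during_io(6, 1.0, 0.06))
```

Under these assumptions the blocking system gets nothing else done during the transfer, while the DMA system keeps almost all of its CPU time for other tasks.
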
In simple terms, a 6 MIPS fully blocking '386 or Mac system would get no
work done during a 1 second I/O operation, while a 6 MIPS A3000 would get
about 5,640,000 average instructions executed.

I mention UNIX specifically, since [a] it pages, so system throughput can be
much more diskbound than on nonpaging systems, and [b] it's large, especially
in its latest V.4 incarnation with all the bells and whistles running (X,
Open Look, NFS, etc), so it tends to page lots unless you have enough real
memory.

>>Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"

>I really appreciate your explanations, Dave. If I could get you to
>answer these questions I have, I would feel almost enlightened!

I like to explain some of these important but more esoteric hardware related
issues, because I rarely see them explained correctly in print or much
elsewhere. Since most PCs in use are singletasking, programmed I/O, and
often only 16 bit, these issues are often glossed over.

>Tom Elam

-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
{uunet|pyramid|rutgers}!cbmvax!daveh PLINK: hazy BIX: hazy
"Don't worry, 'bout a thing. 'Cause every little thing,
gonna be alright" -Bob Marley