Xref: utzoo comp.sys.amiga:17505 comp.sys.amiga.tech:248
Path: utzoo!mnetor!uunet!cbmvax!daveh
From: daveh@cbmvax.UUCP (Dave Haynie)
Newsgroups: comp.sys.amiga,comp.sys.amiga.tech
Subject: Re: 68030 Questions
Message-ID: <3609@cbmvax.UUCP>
Date: 11 Apr 88 18:30:23 GMT
References: <4937@videovax.Tek.COM>
Organization: Commodore Technology, West Chester, PA
Lines: 61

in article <4937@videovax.Tek.COM>, stever@videovax.Tek.COM (Steven E. Rice, P.E.) says:

> Another possibility is to block the data into (e.g.) 512 byte blocks and
> then arbitrate for the bus once per block.  This drops the bus bandwidth
> occupation to 20% (since one arbitration is insignificant compared to the
> time to transfer 512 bytes as 128 32-bit words).  But the CPU is still
> denied the bus 20% of the time.

First of all, with a better bus design (e.g., not the current Amiga bus,
but perhaps a future version that's 32 bits wide), there's zero or very
near zero arbitration time; the bus's owner is determined dynamically on
a cycle by cycle basis.  Secondly, since the 68020 with cache running
only wants the bus 50% or so of the time, on average, you take your 20%
figure and immediately reduce it to 10%, on average.  It could be as bad
as 20%, it could be as good as 0%, depending on what the CPU is doing.

Now we add a priority scheme.  If the CPU operation is more important,
it gets the bus for any cycles it needs, and the DMA device gets
whatever it wants from the remaining 50% of the bus.  And that's
assuming that the bus is limited to CPU bus speeds.  It's pretty simple
to make DMA devices run nybble or page mode cycles that the CPU can't
keep up with, and most memory systems can be designed with this in mind
for nearly free.  So with DMA going with a nybble transfer, you're now
down to less than 5% of the bus bandwidth for that transfer.  VME and
non-Apple NuBus both do things like this.
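To put numbers on the above, here's a back-of-the-envelope sketch in
Python.  It's only a rough model: the function names are made up for
illustration, it assumes DMA cycles and CPU bus requests are independent
of each other, and the burst speedup of 4 is an assumed figure (all that
is claimed above is "less than 5%", with no exact factor given).

```python
# Rough model of the bus-occupancy figures discussed above.
# All function names are hypothetical, for illustration only.

def cpu_bus_loss(dma_occupancy, cpu_bus_demand):
    """Average fraction of time the CPU wants the bus while DMA holds it.

    Assumes DMA cycles and CPU bus requests are statistically independent.
    """
    return dma_occupancy * cpu_bus_demand


def burst_occupancy(base_occupancy, burst_speedup):
    """Bus occupancy when DMA moves the same data burst_speedup times faster."""
    return base_occupancy / burst_speedup


def dma_share_with_cpu_priority(cpu_bus_demand):
    """Bus fraction left over for DMA when the CPU always wins arbitration."""
    return 1.0 - cpu_bus_demand


if __name__ == "__main__":
    dma = 0.20   # bus occupation for block transfers (figure from the quote)
    cpu = 0.50   # 68020 with cache wants the bus about half the time

    print(f"average CPU loss:         {cpu_bus_loss(dma, cpu):.1%}")
    print(f"worst case (CPU needs every cycle): {cpu_bus_loss(dma, 1.0):.1%}")
    print(f"best case (CPU runs from cache):    {cpu_bus_loss(dma, 0.0):.1%}")

    # ASSUMPTION: a 4x burst speedup for nybble mode; the real factor
    # depends on the memory design.
    fast = burst_occupancy(dma, 4)
    print(f"burst-mode bus occupancy: {fast:.1%}")
    print(f"left for DMA under CPU priority: {dma_share_with_cpu_priority(cpu):.1%}")
```

The worst and best cases fall straight out of the same product: a CPU
that wants every bus cycle sees the full 20%, and one running entirely
from cache sees none of it.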
> Given just a single hard disk transfer as you have described it, DMA into
> a dual-port buffer avoids losing 20% of the CPU's processing capability.
> That seems worthwhile to me!

But you're still missing the point.  The CPU has to stop what it's doing
to transfer the data by hand.  If it did that JUST as efficiently as the
DMA device, you'd still be losing whatever CPU time you claim is being
eaten by the DMA transfer, 20% or whatever (keep in mind this 20% figure
only applies during an actual transfer).  If the DMA transfer happens
twice as fast as the CPU could transfer the data, then I'm gaining in
CPU speed, even though I'm kicking the CPU off the bus for a while.  DMA
transfers on the Amiga bus with a 68020 go twice as fast as the 68020
could possibly transfer them.  68000 based CPU transfers are more like
1/4 the speed of the DMA device.

My point is that someone has to do the work of the transfer unless you
can live with the data exactly where it's dumped in your shared memory
scheme.  If you know there's no transfer required, share the memory, but
if there is, and especially if the memory can be used as is once it
reaches its destination (like NewFS), DMA wins.

There's actually a test case of this available in the Amiga world.  As
I've already mentioned, the A2090 controller uses a FIFO and DMA to
complete its transfer, and achieves about 625K Bytes/Second.  There's a
new SCSI controller out there, from a company called Great Valley
Peripherals, that uses an I/O chip DMA to shared RAM (4K of static RAM
on-board, so once you're in sync I suspect there will rarely be a
collision between the CPU and the peripheral chip).  I don't have any
benchmarks on this new board, but I guarantee it'll be slower.

> Steve Rice
-- 
Dave Haynie  "The B2000 Guy"     Commodore-Amiga  "The Crew That Never Rests"
   {ihnp4|uunet|rutgers}!cbmvax!daveh      PLINK: D-DAVE H     BIX: hazy
		"I can't relax, 'cause I'm a Boinger!"