Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!wuarchive!ukma!widener!dsinc!bagate!cbmvax!daveh
From: daveh@cbmvax.commodore.com (Dave Haynie)
Newsgroups: comp.sys.amiga.advocacy
Subject: Re: (Video) Hardware Idiots ?
Message-ID: <22392@cbmvax.commodore.com>
Date: 13 Jun 91 05:16:58 GMT
References: <1991Jun10.065629.21255@marlin.jcu.edu.au> <1991Jun10.074421.6782@mintaka.lcs.mit.edu> <22368@cbmvax.commodore.com> <1991Jun12.232718.2373@mintaka.lcs.mit.edu>
Reply-To: daveh@cbmvax.commodore.com (Dave Haynie)
Organization: Commodore, West Chester, PA
Lines: 73

In article <1991Jun12.232718.2373@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:
>In article <22368@cbmvax.commodore.com> daveh@cbmvax.commodore.com (Dave Haynie) writes:

> So the max transfer rate, with 0 wait states is 10mb/sec (theoretical) on
>the fast ram bus at @25mhz.

Well, that's to fast RAM, which actually does have wait states. You can't
run zero wait states on a 25MHz 68030 using DRAM, at least not without
extra magic.

>Can you explain how burst works?

Burst is one kind of extra magic. The 68030 only supports burst reads (the
'040, A3000 Fast RAM, and Zorro III can support burst reads and writes,
though Zorro III burst is considerably different than 680x0 burst). The
burst protocol is actually a cache prefetch: the 68030 can only use one new
longword every 2 clock cycles, but the cache can store one longword every
clock cycle. That's how the 68030 looks at it: you can load four longwords
in as few as 5 clocks, rather than as few as 8.

The A3000, and most memory systems, look at it from a slightly different
perspective. All common modern dynamic memory chips are addressed in
row/column fashion via a multiplexed address bus.
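
To make the row/column idea concrete, here is a minimal Python sketch. The
geometry is an assumption chosen for illustration (a 32-bit-wide bank of
1M-deep DRAMs, so 10 row bits and 10 column bits share 10 address pins),
and the names split_address and same_row are hypothetical. It is not the
actual A3000 or RAMSEY address mapping.

```python
# Hypothetical geometry, for illustration only: a 32-bit-wide bank of
# 1M-deep DRAMs. The 20 cell-address bits are sent in two halves over
# the same 10 pins: first the row, then the column.
BYTE_BITS = 2    # address bits 0-1 select a byte within the longword
COL_BITS = 10    # assumed: low bits of the longword index are the column
ROW_BITS = 10    # assumed: the next bits up are the row


def split_address(addr):
    """Return (row, column) for a byte address under the assumed mapping."""
    longword = addr >> BYTE_BITS
    column = longword & ((1 << COL_BITS) - 1)
    row = (longword >> COL_BITS) & ((1 << ROW_BITS) - 1)
    return row, column


def same_row(a, b):
    """True if both byte addresses fall in the same DRAM row."""
    return split_address(a)[0] == split_address(b)[0]


if __name__ == "__main__":
    base = 0x00012340
    # Four consecutive longwords: the row stays put, only the column moves.
    for offset in range(0, 16, 4):
        row, col = split_address(base + offset)
        print(f"addr {base + offset:#010x}: row {row:#x}, column {col:#x}")
```

With this mapping, consecutive longwords differ only in their column bits
until the address crosses a 4K boundary, at which point the row changes.
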
DRAM designers figured out long ago that it might be useful to remember the
row address across multiple column addresses, thus running memory cycles
dependent only on column cycle time (typically around 1/3rd that of the row
cycle time). Using Static Column DRAMs, the A3000 can run faster cycles.
But it needs some way to determine that cycle N+1 is on the same "page"
(e.g., same row address) as cycle N was. Burst is a perfect case, since the
68030 indicates to the memory system that a burst cycle is taking place. So
the A3000 Fast RAM transfers four longwords in 11 clocks, rather than the
basic 20. There's actually a mode defined in RAMSEY that uses additional
magic to cut that down to 9 clocks in some cases, but it doesn't work at
present (it might possibly be fixed at some future date, though there is
some question about its utility; like most magic memory tricks, it can lose
as well as win, depending on the program activity).

>I remember reading about it a long time ago, but can't recall it, something
>about eliminating wait states on sequential ram accesses?

Yeah, it's four longwords within a quadlongword-aligned quadlongword. The
burst starts on the longword that the 68030 is actually after, and loads
the next three, wrapping around if that first longword isn't at the start
of the quadlongword.

> IMHO, it looks like the blitter wins in the majority of situations like
>large blits, masking and shifting, arbitrary bit boundaries and complex
>logic operations. [feel free to correct me on this Dave].

I suspect it will, too, in many cases. Lots of it depends on what's being
done. Anything that involves a significant amount of reading and writing to
Chip RAM, like image processing, can often be better done by the CPU if you
do the rendering in Fast RAM and transfer it to Chip when done. Real short,
simple, and regular operations in Chip RAM can be done faster by the CPU
too, since there is a blitter setup time involved.
Large operations with lots of bit alignment differences will probably still
go faster when done by the blitter.

>I recently wrote a line plotter on the Commodore 64 that cached the current
>bitmap byte, it sped the plot tremendously.

I'm not that familiar with the line draw stuff, but I don't think it's too
stupid. On the C64, you have the additional problem of a card oriented
bitmap to deal with; things are much more straightforward on a regular
bitmap.

> Yep. I love the Amiga's DMA philosophy, even the hardware multitasked.
>I think Polled I/O should go the way 8bit machines.

No arguments here. Interrupts and DMA are indeed the proper way.

-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
	"This is my mistake.  Let me make it good." -R.E.M.