Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!wuarchive!ukma!widener!dsinc!bagate!cbmvax!daveh
From: daveh@cbmvax.commodore.com (Dave Haynie)
Newsgroups: comp.sys.amiga.advocacy
Subject: Re: (Video) Hardware Idiots ?
Message-ID: <22392@cbmvax.commodore.com>
Date: 13 Jun 91 05:16:58 GMT
References: <1991Jun10.065629.21255@marlin.jcu.edu.au> <1991Jun10.074421.6782@mintaka.lcs.mit.edu> <22368@cbmvax.commodore.com> <1991Jun12.232718.2373@mintaka.lcs.mit.edu>
Reply-To: daveh@cbmvax.commodore.com (Dave Haynie)
Organization: Commodore, West Chester, PA
Lines: 73

In article <1991Jun12.232718.2373@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:
>In article <22368@cbmvax.commodore.com> daveh@cbmvax.commodore.com (Dave Haynie) writes:

>  So the max transfer rate, with 0 wait states is 10mb/sec (theoretical) on
>the fast ram bus at @25mhz. 

Well, that's to fast RAM, which actually does have wait states.  You can't run
zero wait states on a 25MHz 68030 using DRAM, at least not without extra magic.

>Can you explain how burst works? 

Burst is one kind of extra magic.  The 68030 only supports burst reads (the
'040, A3000 Fast RAM, and Zorro III can support burst reads and writes, though
Zorro III burst is considerably different than 680x0 burst).  The burst 
protocol is actually a cache prefetch: the 68030 can only use one new longword
every 2 clock cycles, but cache can store one longword every clock cycle.  
That's how the 68030 looks at it: you can load four longwords in as few as 5 
clocks, rather than as few as 8.
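
To make the 5-versus-8 arithmetic concrete, here is a small sketch in Python. The
function names are mine, and it assumes the best case described above: a
2-clock access for each ordinary longword, and a burst that takes 2 clocks for
the first longword and 1 clock for each of the remaining three.

```python
# Best-case clock counts for a 68030 fetching four longwords,
# using the figures quoted in the text (assumed: no wait states).
LONGWORDS_PER_BURST = 4

def clocks_without_burst(clocks_per_access=2, n=LONGWORDS_PER_BURST):
    # Every longword is a separate bus cycle.
    return clocks_per_access * n

def clocks_with_burst(first_access=2, clocks_per_beat=1, n=LONGWORDS_PER_BURST):
    # One normal access, then one clock for each further longword.
    return first_access + clocks_per_beat * (n - 1)

print(clocks_without_burst(), clocks_with_burst())
```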

The A3000, and most memory systems, look at it from a slightly different
perspective.  All common modern dynamic memory chips are addressed in row/column
fashion via a multiplexed address bus.  DRAM designers figured out long ago
that it might be useful to remember the row address across multiple column
addresses, thus running memory cycles dependent only on column cycle time
(typically around 1/3rd that of the row cycle time).  Using Static Column
DRAMs, the A3000 can run faster cycles.  But it needs some way to determine that
cycle N+1 is on the same "page" (i.e., same row address) as cycle N was.  Burst
is a perfect case, since the 68030 indicates to the memory system that a burst
cycle is taking place.  So the A3000 Fast RAM transfers four longwords in 11
clocks, rather than the basic 20.  There's actually a mode defined in RAMSEY
that uses additional magic to cut that down to 9 clocks in some cases, but it
doesn't work at present (it might possibly be fixed at some future date, 
though there is some question about its utility -- like most magic memory
tricks, it can lose as well as win, depending on the program activity).
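
As a rough illustration of what those clock counts mean for transfer rate, the
sketch below converts "clocks per four longwords" into bytes per second.  It
assumes the 25MHz clock from the question above and back-to-back cycles with
nothing else on the bus, so treat the results as ceilings, not measurements.
The function name and constants are mine.

```python
# Peak transfer rate implied by a clock count for four longwords.
# Assumptions: 25 MHz bus clock, cycles issued back to back.
CLOCK_HZ = 25_000_000
BYTES_PER_BURST = 4 * 4   # four longwords of four bytes each

def bytes_per_second(clocks_per_four_longwords, clock_hz=CLOCK_HZ):
    return BYTES_PER_BURST * clock_hz / clocks_per_four_longwords

for label, clocks in (("basic", 20), ("burst", 11), ("RAMSEY 9-clock mode", 9)):
    print(f"{label}: {bytes_per_second(clocks) / 1e6:.1f} million bytes/sec")
```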

>I remember reading about it a long time ago, but can't recall it, something 
>about eliminating wait states on sequential ram accesses?

Yeah, it's four longwords within a quadlongword-aligned quadlongword.  The
burst starts on the longword that the 68030 is actually after, and loads the
next three, wrapping around if that first longword isn't at the start of 
the quadlongword boundary.
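
The wraparound order can be sketched in a few lines of Python.  This is only an
illustration of the addressing described above, with a function name I made
up: mask the address down to its 16-byte block, then step through the four
longwords starting at the requested one, modulo 4.

```python
def burst_order(address):
    """Return the four longword addresses of a wrapping burst, in fetch order."""
    base = address & ~0xF          # start of the 16-byte-aligned quadlongword
    first = (address >> 2) & 0x3   # index of the requested longword in the block
    return [base + 4 * ((first + i) & 0x3) for i in range(4)]

# A request for the third longword of the block at 0x1000:
print([hex(a) for a in burst_order(0x1008)])
```

A request that already sits at the start of the block just reads the four
longwords in ascending order.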

>  IMHO, it looks like the blitter wins in the majority of situations like
>large blits, masking and shifting, arbitrary bit boundaries and complex
>logic operations. [feel free to correct me on this Dave]. 

I suspect it will, too, in many cases.  Lots of it depends on what's being
done.  Anything that involves a significant amount of reading and writing
to Chip RAM, like image processing, can often be better done by the CPU if
you do the rendering in Fast RAM and transfer it to Chip when done.  Real short,
simple, and regular operations in Chip RAM can be done faster by the CPU too,
since there is a blitter setup time involved.  Large operations with lots of
bit alignment differences will probably still go faster when done by the
blitter.
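
The setup-time tradeoff can be put as a simple break-even model.  This is a toy
sketch, not a measurement: the function and its three parameters are
hypothetical, and real per-word costs depend on the operation, the bus
contention, and the CPU in question.

```python
import math

def break_even_words(setup_cycles, blitter_cycles_per_word, cpu_cycles_per_word):
    """Smallest word count at which the blitter is strictly faster than the CPU.

    Models blitter time as setup + n * blitter_cost and CPU time as n * cpu_cost.
    Returns None if the blitter never catches up under this model.
    """
    saving = cpu_cycles_per_word - blitter_cycles_per_word
    if saving <= 0:
        return None
    return math.floor(setup_cycles / saving) + 1
```

With arbitrary example numbers (100 cycles of setup, 4 cycles per word for the
blitter, 8 for the CPU), the model says the blitter wins from 26 words up, and
anything shorter is better left to the CPU.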

>I recently wrote a line plotter on the Commodore 64 that cached the current 
>bitmap byte, it sped the plot tremendously.

I'm not that familiar with the line draw stuff, but I don't think it's too
stupid.  On the C64, you have the additional problem of a card oriented 
bitmap to deal with; things are much more straightforward on a regular
bitmap.
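
To show what "card oriented" costs you, here is a sketch comparing the byte
offset of a pixel in the C64's 320x200 hires bitmap with the offset in a plain
linear bitmap.  The function names are mine, and the 40 bytes per row in the
linear case is an assumption (one 320-pixel-wide bitplane).

```python
def c64_hires_offset(x, y):
    """Byte offset and bit mask for pixel (x, y) in the C64 hires bitmap.

    The screen is stored as 8x8 cards: 8 consecutive bytes per card,
    40 cards (320 bytes) per card row.
    """
    offset = (y // 8) * 320 + (x // 8) * 8 + (y % 8)
    mask = 0x80 >> (x % 8)   # leftmost pixel is the high bit
    return offset, mask

def linear_offset(x, y, bytes_per_row=40):
    """Byte offset and bit mask for pixel (x, y) in a linear bitmap."""
    offset = y * bytes_per_row + (x // 8)
    mask = 0x80 >> (x % 8)
    return offset, mask
```

Stepping one pixel down adds 1 to the C64 offset except at every eighth line,
where it jumps by 313, while the linear case always adds one row's worth of
bytes.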

>  Yep. I love the Amiga's DMA philosophy, even the hardware multitasked.
>I think  Polled I/O should go the way 8bit machines.

No arguments here.  Interrupts and DMA are indeed the proper way.

-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
	"This is my mistake.  Let me make it good." -R.E.M.