Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!mips!pacbell.com!att!ucbvax!mtxinu!taniwha!paul
From: paul@taniwha.UUCP (Paul Campbell)
Newsgroups: comp.arch
Subject: Re: Dynamic Display Architecture
Message-ID: <821@taniwha.UUCP>
Date: 17 Apr 91 19:24:01 GMT
References: <1991Apr15.200955.3438@waikato.ac.nz> <3340@crdos1.crd.ge.COM> <20670@cbmvax.commodore.com> <1991Apr17.051746.15592@sbcs.sunysb.edu>
Reply-To: paul@taniwha.UUCP (Paul Campbell)
Organization: Taniwha Systems Design, Oakland
Lines: 47

In article <1991Apr17.051746.15592@sbcs.sunysb.edu> jallen@libserv1.ic.sunysb.edu (Joseph Allen) writes:
>I had an idea for a hardware windowing circuit once which was very simple and
>which would eliminate most of the problems windows have today.  All you do is
>break up the screen into small (maybe character sized) blocks.  Then for each
>block you have a pointer to where in memory the actual data is.  It's really as
>simple as normal character refresh memory,  but with no font chip and with
>wider (32-bits instead of 8) refresh memory.

Of course this isn't a new idea. Let's look at why it's hard ....

Let's assume you are using VRAMs (for performance) and you have an 8-bit
display. The VRAM's max clock frequency is ~40MHz (25nS), so to get the
100MHz pixel rate you need for a 1M pixel display @ 75Hz you need a 4:1
interleave on the video side. This means you have a 32-bit (4x8-bit pixels)
data path. OK, so far so good.
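
As a rough check of the arithmetic above, here is a minimal Python sketch.
The constants are the approximate figures from this post. The function name
and the rounding of the interleave up to a power of two are my own
assumptions for illustration, not something a VRAM data sheet dictates.

```python
import math

PIXEL_RATE_MHZ = 100.0     # from the post: ~1M pixels @ 75Hz
VRAM_MAX_CLOCK_MHZ = 40.0  # from the post: approximate VRAM clock limit
BITS_PER_PIXEL = 8         # from the post: 8-bit display

def interleave_factor(pixel_rate_mhz, vram_max_mhz):
    """Smallest power-of-two number of VRAM banks that keeps each
    bank at or below its maximum clock (power of two is an assumption)."""
    ways = math.ceil(pixel_rate_mhz / vram_max_mhz)
    return 1 << (ways - 1).bit_length()

ways = interleave_factor(PIXEL_RATE_MHZ, VRAM_MAX_CLOCK_MHZ)
data_path_bits = ways * BITS_PER_PIXEL       # width of the video data path
word_clock_mhz = PIXEL_RATE_MHZ / ways       # rate at which words are clocked out
word_period_ns = 1000.0 / word_clock_mhz     # time per multi-pixel word

print(ways, data_path_bits, word_clock_mhz, word_period_ns)
```

With the figures above this gives a 4-way interleave, a 32-bit data path and
a 25MHz word clock.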

You are clocking your pixels out 4 at a time @100MHz/4 = 25MHz = 40nS per
4-pixel word. To switch to a new chunk you need to do a VRAM read transfer to
change the memory address, and this cycle takes (minimum) 180nS. Also assume
that you want to be able to do at least one framestore access (or refresh etc
etc) from the host at the same rate (otherwise your rendering will be
TERRIBLE!), so you have to leave another 180nS available each cycle (remember
the read transfer cycles are 'real-time' so you have to schedule all other
cycles in between). This means that you can only do a read transfer every
4*(180+180)/40 = 36 pixels. Assuming you want to do this on power of two
boundaries, your chunks have to be at least 64 pixels wide. Of course you also
have to get the information on where the next chunk's pixels start. If you
fetch it from the same framestore then you get 4*(180+180+180)/40 = 54 pixels
(still within your 64 pixels) - but your graphics performance (rendering
rate) just went down again, this time to 1/3.
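
The same numbers as a small Python sketch, in case anyone wants to plug in
their own VRAM timings. The 180nS cycle, 40nS word time and 4 pixels per word
are the figures used above. The function names, and the simplification that
every cycle in a chunk period takes the same time, are assumptions made just
for this illustration.

```python
import math

def min_chunk_pixels(cycle_ns, cycles_per_chunk, word_ns=40, pixels_per_word=4):
    """Pixels shifted out while cycles_per_chunk memory cycles complete.
    This is the shortest interval between read transfers, in pixels."""
    return pixels_per_word * cycle_ns * cycles_per_chunk / word_ns

def next_power_of_two(n):
    """Round up to a power of two (assumes chunks sit on such boundaries)."""
    n = math.ceil(n)
    return 1 << (n - 1).bit_length()

# Case 1: one read transfer plus one host access per chunk.
two_cycles = min_chunk_pixels(180, 2)
# Case 2: as above, plus one fetch of the chunk pointer from the framestore.
three_cycles = min_chunk_pixels(180, 3)

print(two_cycles, next_power_of_two(two_cycles))      # host gets 1 cycle in 2
print(three_cycles, next_power_of_two(three_cycles))  # host gets 1 cycle in 3
```

Both cases round up to the same 64-pixel chunk width. What changes is the
host's share of the memory cycles, which drops from 1 in 2 to 1 in 3.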

Of course if you are using 1-bit pixels then the numbers are much different
(and more practical) - but not scalable. 

This is not to say you can't build such a system - lots of expensive SRAM
and a big fan initially come to mind and there are other trickier ways to
do it - all of them require throwing lots of expensive silicon at the problem.

Oh BTW, I know the guy who got the patent on your idea :-)


	Paul
-- 
Paul Campbell    UUCP: ..!mtxinu!taniwha!paul     AppleLink: CAMPBELL.P

"But don't we all deserve.
 More than a kinder and gentler fuck" - Two Nice Girls, "For the Inauguration"