Path: utzoo!mnetor!uunet!husc6!mit-eddie!ll-xn!ames!amdcad!tim
From: tim@amdcad.AMD.COM (Tim Olson)
Newsgroups: comp.arch
Subject: Re: target caching
Message-ID: <20482@amdcad.AMD.COM>
Date: 20 Feb 88 19:19:34 GMT
References: <910@PT.CS.CMU.EDU>
Reply-To: tim@amdcad.UUCP (Tim Olson)
Organization: Advanced Micro Devices
Lines: 46
Keywords: page mode

In article <910@PT.CS.CMU.EDU> lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) writes:
| The TF-1 people at IBM intend to use an interesting trick to simplify their
| CPU.
| 
| DRAMs can be purchased that have "page mode" - that is, you can access the
| next-address value much more quickly than a randomly addressed value.  This
| is because each random access can leave a large number of bits in a long
| register (say, 1024 bits, in the case of a 1Mb RAM). A page-mode access just
| shifts the register. 

This sounds more like Video-DRAM (VRAM) to me.  VRAMs have a serial
shift register that can shift out the next sequential bit every cycle
without any subsequent addresses, whereas page-mode or static-column
mode must be supplied with a partial (column) address for every access.
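
As a rough sketch of why page mode helps sequential fetches, the model
below counts cycles for a stream of bit addresses.  The 1024-bit row
comes from the quoted text; the cycle costs (RANDOM_CYCLES, PAGE_CYCLES)
are made-up assumptions for illustration, not figures from any data
sheet.

```python
ROW_SIZE = 1024      # bits held per row, per the quoted 1Mb example
RANDOM_CYCLES = 4    # assumption: cost of a full random (row + column) access
PAGE_CYCLES = 1      # assumption: cost of a page-mode (column-only) access


def page_mode_cycles(addresses):
    """Return total cycles to read a stream of bit addresses in page mode."""
    total = 0
    open_row = None  # row currently latched in the DRAM, if any
    for addr in addresses:
        row = addr // ROW_SIZE
        if row == open_row:
            # Same row: only a column address has to be supplied.
            total += PAGE_CYCLES
        else:
            # Different row: pay for a full random access and latch the row.
            total += RANDOM_CYCLES
            open_row = row
    return total
```

With these assumed costs, a run of sequential addresses pays the random
access once and the page-mode cost thereafter, while a stream that keeps
changing rows pays the random-access cost every time.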

| When the CPU decides to branch, of course, there's trouble. They solve this
| by keeping a cache of the instruction streams at 32 recent branch targets.
| If the target PC hits, then they fetch instructions from the cached stream,
| until the RAMs have done their random access, and are ready to page-mode
| again.

Wow!  Either there is serendipity involved here, or the TF-1 architects
closely studied the Am29000 Manual -- this is the exact method we use to
keep the pipeline fed during branches -- even the number of entries is
the same!
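
A minimal sketch of the branch-target-cache idea, not the actual Am29000
or TF-1 logic.  Only the 32 entries come from the discussion above; the
direct-mapped organization, WORDS_PER_ENTRY, and RANDOM_ACCESS_CYCLES
are assumptions chosen to keep the example short.

```python
NUM_ENTRIES = 32          # from the discussion: 32 recent branch targets
WORDS_PER_ENTRY = 4       # assumption: instructions cached per target
RANDOM_ACCESS_CYCLES = 4  # assumption: cycles for a DRAM random access


class BranchTargetCache:
    """Direct-mapped cache of the first few instructions at branch targets."""

    def __init__(self, memory):
        self.memory = memory                  # list of instruction words
        self.tags = [None] * NUM_ENTRIES      # target PC held by each entry
        self.lines = [[] for _ in range(NUM_ENTRIES)]

    def branch(self, target_pc):
        """Return (instructions supplied by the cache, stall cycles)."""
        i = target_pc % NUM_ENTRIES
        if self.tags[i] == target_pc:
            # Hit: the cached words are issued while the RAM performs its
            # random access, hiding as much of that latency as they cover.
            stall = max(0, RANDOM_ACCESS_CYCLES - WORDS_PER_ENTRY)
            return self.lines[i], stall
        # Miss: the pipeline waits out the random access; remember the
        # first words at this target for the next time it is taken.
        self.tags[i] = target_pc
        self.lines[i] = self.memory[target_pc:target_pc + WORDS_PER_ENTRY]
        return [], RANDOM_ACCESS_CYCLES
```

The first branch to a target stalls for the full random access; a later
branch to the same target is fed from the cache until the RAM is ready
to stream sequentially again.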

| I haven't studied the recent RAM offerings well enough to count the cycles,
| and critique the speed expectations. I guess it sounds fine, and it does
| sound simple. But, there's a major catch: it's a Harvard architecture.  The
| memory is code-only, so that grubby data won't spoil the code's pipelined
| perfection.
|
| I know that some recent RAM chips are dual-ported, supposedly so that a
| processor can write image data through the random port, while a graphics
| screen is being refreshed through the page-mode port. Would these chips
| allow the TF-1 trick to work in non-Harvard designs ? 

That's exactly what a VRAM does.  It effectively has two ports: the
random-access port (for loads/stores and branch addresses), and the
serial port (for sequential instruction fetches).  This allows a
Harvard-architecture machine to have separate buses for performance,
while maintaining a shared instruction/data memory.

	-- Tim Olson
	Advanced Micro Devices
	(tim@amdcad.amd.com)