Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watnot!watmath!clyde!cbatt!ihnp4!houxm!mtuxo!mtune!codas!peora!pesnta!valid!markp
From: markp@valid.UUCP
Newsgroups: comp.arch
Subject: Re: Will caches ever become obsolete?
Message-ID: <1044@valid.UUCP>
Date: Wed, 4-Mar-87 20:33:07 EST
Article-I.D.: valid.1044
Posted: Wed Mar  4 20:33:07 1987
Date-Received: Sat, 7-Mar-87 00:42:56 EST
References: <3182@wateng.UUCP>
Distribution: comp
Organization: Valid Logic, San Jose, CA
Lines: 103

> 
>       I have a question concerning caches.  Currently,
> caches are used, either in uniprocessors, or multiprocessors,
> because the time a microprocessor takes to execute an
> instruction is much smaller than the time to reference
> main memory.  Thus, a cache is used to match the speed
> difference (the cache is usually as fast as the microprocessor).

More precisely, a cache is used to match the throughput of the memory
system to the memory bandwidth demanded by the processor.  Instruction
boundaries are inconsequential, except that RISC machines tend to make
close to one memory reference per instruction (actually 1.2-1.4: the
instruction fetch itself plus 0.2-0.4 for loads/stores, more or less).
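
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python.  The 10 MIPS instruction rate and the 4-byte reference size are
made-up values for illustration, not figures for any particular machine,
and required_bandwidth is just a name I picked.

```python
def required_bandwidth(instr_per_sec, data_refs_per_instr, bytes_per_ref=4):
    """Bytes/sec the processor demands from the memory system.

    Each instruction costs one instruction fetch plus, on average,
    data_refs_per_instr loads/stores.
    """
    refs_per_instr = 1.0 + data_refs_per_instr
    return instr_per_sec * refs_per_instr * bytes_per_ref


# Hypothetical 10 MIPS RISC processor making 4-byte references.
low = required_bandwidth(10e6, 0.2)    # 1.2 references per instruction
high = required_bandwidth(10e6, 0.4)   # 1.4 references per instruction
print(f"demand: {low / 1e6:.0f}-{high / 1e6:.0f} MB/sec")
```

Under those assumptions the demand works out to roughly 48-56 MB/sec,
which is the number the memory system (cache included) has to sustain.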

> 
>       Thus, it is evident that a cache can very dramatically
> improve the throughput of a microprocessor.  For multiprocessors,
> much research is being conducted to find efficient algorithms
> for multi-cache consistency.
> 

No comment. :-)

>       The multi-cache consistency problem arises when, say, a
> block of data resides in the caches of processors A, B, and
> C.  Then, if processor B decides to write to that block,
> it must inform A and C that their copies of the block are
> no longer valid.
> 

Or update the copies in the caches of A and C (broadcast write).
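
The two alternatives can be sketched with a toy model.  This is only an
illustration under heavy simplifying assumptions: every write goes
straight through to memory, each block is a single value, and the names
(Cache, write, demo) are mine.  Real protocols track ownership states
and are considerably more involved.

```python
class Cache:
    """Toy cache: maps block address -> data, no capacity limit."""

    def __init__(self, name):
        self.name = name
        self.lines = {}

    def load(self, addr, memory):
        # On a miss, fetch the block from main memory.
        if addr not in self.lines:
            self.lines[addr] = memory[addr]
        return self.lines[addr]


def write(writer, others, memory, addr, value, policy):
    """Write through to memory and keep the other caches consistent."""
    if policy not in ("invalidate", "update"):
        raise ValueError(f"unknown policy: {policy}")
    writer.lines[addr] = value
    memory[addr] = value
    for cache in others:
        if addr in cache.lines:
            if policy == "invalidate":
                del cache.lines[addr]        # stale copy is dropped
            else:
                cache.lines[addr] = value    # broadcast write


def demo(policy):
    memory = {0x100: 1}
    a, b, c = Cache("A"), Cache("B"), Cache("C")
    for cache in (a, b, c):
        cache.load(0x100, memory)            # block now in all three caches
    write(b, [a, c], memory, 0x100, 2, policy)
    return a.lines, b.lines, c.lines


print(demo("invalidate"))   # A and C lose their copies
print(demo("update"))       # A and C see the new value
```

With invalidation, A and C take a miss on their next reference to the
block; with broadcast write they keep hitting, at the cost of bus
traffic on every write to a shared block.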

>      My question is, then, with the current improvements in
> memory chips (i.e. faster access, and greater densities), does
> anyone foresee a time in the distant future (> 3 or 4 years)
> that the speed of say, a 1Mb chip will be comparable to that
> of say a 1Kb ECL chip used in current caches?
> 

Sure.  But you're comparing apples and oranges: it doesn't make
sense to compare the speeds of tomorrow's memory chips with today's
processors.  Look at it this way: CPUs and memory chips use basically
the same fab processes.  Memory chips usually serve as the testbeds
for new processes (e.g. megabit DRAMs for 1-micron CMOS) and CPUs
follow, but the CPUs ALWAYS FOLLOW!  Therefore your comparison
is irrelevant, since the memory bandwidth required by next-generation
processors will still far exceed the bandwidth of next-generation memories.
In fact, the new processors will cycle so fast that it will be impractical
to go off-chip for cache, and we will see instruction and data caches
integrated onto the CPU chip (just like the 68030, but of truly useful
size).  Furthermore, multi-level caches will become more important, as the
on-chip cache(s) may not provide a good enough hit rate to allow going to
main memory directly, and an on-board cache (say a few megabytes or so)
will be used to further reduce the average time required to satisfy a memory
reference.  Main memory, consisting of 16Mb chips or even denser, will still
serve as a last resort (not counting the disk, of course).
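
A rough sketch of why the second level helps, using the usual
average-access-time arithmetic.  The hit rates and cycle counts below
are invented for illustration only; plug in your own.

```python
def average_access_time(levels, memory_time):
    """Average cycles per memory reference.

    levels is a list of (hit_rate, access_time) pairs ordered from the
    cache closest to the CPU outward.  hit_rate is the local hit rate,
    i.e. the fraction of references reaching that level which hit there.
    A reference pays the access time of every level it reaches.
    """
    total = 0.0
    reach = 1.0                     # fraction of references reaching this level
    for hit_rate, access_time in levels:
        total += reach * access_time
        reach *= 1.0 - hit_rate     # misses go on to the next level
    return total + reach * memory_time


# Hypothetical numbers: on-chip cache 90% hits at 1 cycle, on-board
# cache 95% local hits at 5 cycles, main memory 30 cycles.
on_chip_only = average_access_time([(0.90, 1)], memory_time=30)
two_level = average_access_time([(0.90, 1), (0.95, 5)], memory_time=30)
print(f"on-chip only: {on_chip_only:.2f} cycles")
print(f"two levels:   {two_level:.2f} cycles")
```

With these made-up figures the average drops from about 4 cycles to
about 1.65 cycles per reference once the on-board cache is added.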

>      In other words, will all the research being conducted
> for the cache coherency problems be a waste?  Could the
> research done for multi-cache coherency be applied elsewhere?
> 
> Thanks.
> 
>              Hemi Thaker
> 

Ooh, my blood is boiling now!  Especially since I spent 2 years working
on multiprocessor cache coherence algorithms for my MSEE, and have since spent
the last 2+ years helping the P896 Futurebus committee to define a standard
set of facilities to implement various multi-cache consistency protocols
(of varying complexity and performance).  The answer to your questions is
"no" on both counts, by the way, and it is likely that coherence solutions
partially enforced by software will become more and more important.
In other words, the on-chip MMU may contain bits to designate pages
"potentially sharable," enabling consistency to be enforced on those pages
only.  Otherwise, communication between processor modules, even at >100MB/sec,
is insufficient to support the invalidation and/or update traffic,
particularly in systems with many (say, >10) processors.  Even though I can
imagine a system that uses a 2GB/sec fiber-optic bus to connect its GaAs
processors to a bank of memory boards based on 256Mb CMOS memories, a
hierarchy of communication bandwidths will still exist, and caches will
still be necessary.  Also, don't forget that the speed of light will become
very important in microprocessor-based memory hierarchies in the 1990's, and
this will form yet another driving force behind the effective use of cache
memories.
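
To see how quickly consistency traffic can swamp a bus, and what
restricting it to "potentially sharable" pages buys you, here is a crude
estimate.  All the numbers (16 processors, 2 million writes/sec each,
8 bytes per bus message, 5% of writes landing on sharable pages) are
assumptions chosen for illustration, and the model charges one bus
message for every write to a page under hardware consistency.

```python
def coherence_traffic(n_procs, writes_per_sec, bytes_per_msg,
                      shared_fraction=1.0):
    """Bytes/sec of invalidate/update traffic on the shared bus.

    shared_fraction is the fraction of writes that land on pages for
    which consistency is enforced; 1.0 means every page.
    """
    return n_procs * writes_per_sec * shared_fraction * bytes_per_msg


BUS_BANDWIDTH = 100e6   # the 100 MB/sec figure from the text

all_pages = coherence_traffic(16, 2e6, 8)
sharable_only = coherence_traffic(16, 2e6, 8, shared_fraction=0.05)

for label, traffic in (("all pages", all_pages),
                       ("sharable pages only", sharable_only)):
    verdict = "fits" if traffic < BUS_BANDWIDTH else "exceeds the bus"
    print(f"{label}: {traffic / 1e6:.1f} MB/sec ({verdict})")
```

Under these assumptions, enforcing consistency on every page asks for
about 256 MB/sec, well over the bus, while limiting it to the sharable
pages needs about 12.8 MB/sec.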

Now of course, if you are willing to argue that cache consistency
enforcement can be done completely in software while providing an acceptable
programming model (i.e. reasonably transparent) for parallel programming
and efficient load balancing of processes across multiple processors, then go
ahead and prove it to me by example.  But you still need caches on your
processors for the same reasons as I detailed above, and there still needs
to be research done in more efficient software-enforced cache coherence
schemes.

In other words, the field is still WIDE open, but future research should
proceed with 1990's technology in mind, not 1980's technology.

Flame off.  Phew.

	Mark Papamarcos
	Valid Logic [and P896 Futurebus working group/cache task group]
	{ihnp4,hplabs}!pesnta!valid!markp