Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!apple!bbn!bbn.com!slackey
From: slackey@bbn.com (Stan Lackey)
Newsgroups: comp.arch
Subject: Re: Caches
Message-ID: <41770@bbn.COM>
Date: 21 Jun 89 17:26:24 GMT
References: <799@acorn.co.uk> <95@altos86.Altos.COM>
Sender: news@bbn.COM
Reply-To: slackey@BBN.COM (Stan Lackey)
Organization: Bolt Beranek and Newman Inc., Cambridge MA
Lines: 48

>In article <799@acorn.co.uk>, SFurber@acorn.co.uk writes:
>> At first sight a write buffer looks a lot simpler to build than a write-back
>> cache, because of the flushing issues involved in context switching or
>> ...

Part of the implementation difficulty with writeback caches arises in architectures that support DMA which does not go through the cache. This includes bus-based systems, where the CPU/cache is packaged as one bus device, and the memory and I/O controllers are other bus devices. In this case, DMA cannot simply assume it can read memory, because the cache may contain the up-to-date data.

To solve this problem, two mechanisms over and above a writethrough cache are needed: (1) a bus watcher in the cache that looks for reads on the bus, and accesses the cache on every bus transaction to see if it contains the hot data; and (2) some way for the cache to substitute its data in place of the memory's, or to tell the device to retry the transaction after the cache has written the hot data back to memory, or some similar thing.

Now, any cache must be aware of DMA activity, and must either invalidate or update its copy when a device writes to memory. This mechanism is typically extended to provide (1) above. In fact, some systems, to reduce cache contention, implement the tag store dual-ported (either as a dual-ported RAM, or as two copies of the same data). Mechanism (2) is handled by adding more kinds of bus transactions, and more sequences that the cache must perform.
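Mechanisms (1) and (2), plus the invalidate-on-DMA-write behavior, might be sketched as follows. This is a toy software model for illustration only (real controllers do all of this in hardware, and every name here is made up, not taken from any actual design):

```python
# Toy model of a snooping writeback cache sitting on a shared bus.
LINE = 16  # bytes per cache line (assumed)

class SnoopingCache:
    def __init__(self):
        # tag store: line address -> {'dirty': bool, 'data': bytes}
        self.lines = {}

    def _line(self, addr):
        return addr - (addr % LINE)

    def snoop_read(self, addr):
        """Mechanisms (1)+(2): watch bus reads; on a hit to dirty
        ('hot') data, substitute the cache's data for memory's."""
        entry = self.lines.get(self._line(addr))
        if entry and entry['dirty']:
            return ('intervene', entry['data'])   # cache answers, not memory
        return ('memory', None)                   # let memory respond

    def snoop_write(self, addr, data):
        """Watch DMA writes: drop the now-stale cached copy
        (an invalidate policy; updating in place also works)."""
        self.lines.pop(self._line(addr), None)

    def cpu_write(self, addr, data):
        """CPU store under writeback: mark the line dirty, don't
        send it to memory yet."""
        self.lines[self._line(addr)] = {'dirty': True, 'data': data}

cache = SnoopingCache()
cache.cpu_write(0x100, b'new')           # CPU updates a line; memory is stale
action, data = cache.snoop_read(0x104)   # DMA reads the same line
assert action == 'intervene' and data == b'new'
cache.snoop_write(0x100, b'dma')         # DMA write -> invalidate
assert cache.snoop_read(0x100) == ('memory', None)
```

The dual-ported tag store mentioned above exists so these snoop lookups don't steal tag cycles from the CPU.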
Whether writeback or writethrough is chosen depends on the application of the product, the preferences of the designers, and often on things even less tangible than that, like the way previous generations of the product were done.

It is not safe to assume that writeback is always faster than writethrough. In some cases, when the data set is large (more than a couple of times the size of the cache, with little locality of data usage) and data is usually written only once (as in a matrix transpose), there is an interaction between writeback and a large cache line that makes it slower than writethrough. This happens, for example, with OS's that clear virtual memory before giving it to a process.

Looking at the RISC trend, it seems natural to assume that the next step is a writeback cache with no "snooping" (as it has been called) for either I/O reads OR writes, solving the problem in software instead.
-Stan