Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ames!apple!usc!cs.utexas.edu!uunet!dg!rec
From: rec@dg.dg.com (Robert Cousins)
Newsgroups: comp.arch
Subject: Re: Caches
Message-ID: <195@dg.dg.com>
Date: 26 Jun 89 21:41:25 GMT
References: <799@acorn.co.uk> <95@altos86.Altos.COM>
Reply-To: rec@dg.UUCP (Robert Cousins)
Organization: Data General, Westboro, MA.
Lines: 65

In article <95@altos86.Altos.COM> dtynan@altos86.Altos.COM (Dermot Tynan) writes:
>In article <799@acorn.co.uk>, SFurber@acorn.co.uk writes:
>> At first sight a write buffer looks a lot simpler to build than a write-back
>> cache, because of the flushing issues involved in context switching or
>> paging with the latter. However Jouppi (Proceedings of 16th International
>> Symposium on Computer Architecture, p. 287) states that for similar
>> performance "A write-back cache is a simpler design...".
>> Steve Furber (sfurber@acorn.uucp)
>That's nice.  How about a little proof, for those of us who don't happen to
>have the proceedings near at hand...  What did he base his arguments on?
>							- Der

I can offer a simple hand-waving argument for this.  The logic in a
write-back cache must handle a few cases:

1	write with dirty writeback
2	write without writeback
3	read with miss and dirty writeback
4	read with miss but no writeback
5	read with hit

A write-through cache must handle a smaller number of cases:

6	write with update to cache [write to resident line]
7	write with invalidate [write to non-resident line] 
8	read with miss
9	read with hit
	(note:  the two write cases may be considered identical in many
	implementations.)
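
As a rough illustration of the two lists, here is a minimal Python sketch
that maps an access to one of the numbered cases.  The function names and
the boolean arguments (hit, victim_dirty) are hypothetical, and reading
cases 1 and 2 as "write that does or does not force a dirty line out" is
an assumption about the list above, not something it states outright.

```python
def write_back_case(is_write, hit, victim_dirty):
    """Return the case number (1-5) from the write-back list.

    Assumption: a miss allocates a line, and a writeback happens only
    when the line being replaced is dirty.
    """
    needs_writeback = (not hit) and victim_dirty
    if is_write:
        return 1 if needs_writeback else 2
    if hit:
        return 5
    return 3 if needs_writeback else 4


def write_through_case(is_write, hit):
    """Return the case number (6-9) from the write-through list."""
    if is_write:
        return 6 if hit else 7
    return 9 if hit else 8
```

Note that the write-back classifier needs one more input (the state of the
victim line) than the write-through one, which is where the extra case
comes from.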

At this point, a write-through cache seems simpler.  However, if the
line size is greater than a single word, the number of states increases
substantially.  Specifically, case (7) becomes:

	cache line read,
	word write to cache, (may take place simultaneously with below)
	word write to memory

which is substantially more complex and is essentially a subset of the
operations a write-back cache requires in the same situation.  In fact,
when examined in more detail, the line-length issue makes the complexity
of the two designs approximately equal.
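
The overlap can be sketched by listing the steps each design performs on
a write miss to a multi-word line.  This is only an illustration: the
function names are hypothetical, and it assumes both caches allocate a
line on a write miss.

```python
def write_through_write_miss():
    """Steps for case (7) when a line holds several words.

    Assumption: the rest of the line must be read in before the newly
    written word can sit in a valid line.
    """
    return ["cache line read", "word write to cache", "word write to memory"]


def write_back_write_miss(victim_dirty):
    """Steps for the same miss in a write-back cache."""
    steps = []
    if victim_dirty:
        steps.append("dirty line write to memory")
    steps += ["cache line read", "word write to cache"]
    return steps


# Steps the two designs have in common on a write miss.
shared = [s for s in write_through_write_miss()
          if s in write_back_write_miss(victim_dirty=True)]
```

Under these assumptions the two designs share the line read and the word
write to the cache; they differ only in what gets written to memory and
when.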

A second point to ponder is that a cache can be viewed as a two-level
beast:  a CPU interface and a bus interface.  The CPU interface is
responsible for handling CPU requests, judging whether a hit has occurred,
and similar duties.  The bus interface listens to the
CPU interface and waits to be told to fetch a new line.  In a write-back
cache, the fetch operation involves writing the old line to memory if it
is dirty and then fetching the new line.  This fetch operation takes
place on every miss -- read or write.
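
A minimal Python sketch of this split for a write-back cache follows.  It
assumes a direct-mapped organization, and the class names, the dict-based
line slots, and the bus log are all hypothetical choices made for the
illustration, not a description of any real design.

```python
class BusInterface:
    """Fetches a new line, first writing the old one back if it is dirty."""

    def __init__(self, memory):
        self.memory = memory      # dict: line address -> list of words
        self.log = []             # bus operations, kept for inspection

    def fetch(self, slot, new_addr, words_per_line):
        if slot["valid"] and slot["dirty"]:
            self.memory[slot["addr"]] = list(slot["data"])
            self.log.append(("writeback", slot["addr"]))
        slot["data"] = list(self.memory.get(new_addr, [0] * words_per_line))
        slot.update(valid=True, dirty=False, addr=new_addr)
        self.log.append(("fetch", new_addr))


class CpuInterface:
    """Judges hit or miss; on a miss, asks the bus interface for the line."""

    def __init__(self, bus, num_lines=4, words_per_line=4):
        self.bus = bus
        self.words_per_line = words_per_line
        self.slots = [dict(valid=False, dirty=False, addr=None, data=None)
                      for _ in range(num_lines)]

    def _lookup(self, word_addr):
        line_addr, offset = divmod(word_addr, self.words_per_line)
        slot = self.slots[line_addr % len(self.slots)]
        if not (slot["valid"] and slot["addr"] == line_addr):
            # Same fetch path for read misses and write misses.
            self.bus.fetch(slot, line_addr, self.words_per_line)
        return slot, offset

    def read(self, word_addr):
        slot, offset = self._lookup(word_addr)
        return slot["data"][offset]

    def write(self, word_addr, value):
        slot, offset = self._lookup(word_addr)
        slot["data"][offset] = value
        slot["dirty"] = True
```

The point of the sketch is that the bus interface has exactly one job,
fetch(), and that reads and writes reach it through the same path.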

Compare this with the write-through approach:  the CPU interface appears
to be about the same, but the bus interface is more complex, since
it must handle not only the line fetch but also buffered writes, and
there can potentially be multiple writes outstanding at once.
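
For comparison, here is a sketch of a write-through bus interface with a
write buffer.  The class name, the buffer depth, and the policy of
draining the whole buffer before a line fetch are assumptions made to
keep the example short; real hardware might instead compare addresses
and forward only the matching entries.

```python
from collections import deque


class WriteThroughBus:
    """Bus interface that handles line fetches and buffered word writes."""

    def __init__(self, memory, words_per_line=4, buffer_depth=4):
        self.memory = memory                  # dict: word address -> value
        self.words_per_line = words_per_line
        self.buffer_depth = buffer_depth
        self.pending = deque()                # outstanding (addr, value) writes

    def post_write(self, word_addr, value):
        """Queue a write; retire the oldest first if the buffer is full."""
        if len(self.pending) == self.buffer_depth:
            self.retire_one()
        self.pending.append((word_addr, value))

    def retire_one(self):
        """Perform the oldest buffered write on the bus."""
        word_addr, value = self.pending.popleft()
        self.memory[word_addr] = value

    def fetch_line(self, line_addr):
        """Fetch a line, draining the buffer first so no stale word is read."""
        while self.pending:
            self.retire_one()
        base = line_addr * self.words_per_line
        return [self.memory.get(base + i, 0)
                for i in range(self.words_per_line)]
```

Even in this simplified form the bus interface has three entry points and
an ordering rule between them, where the write-back version needs only a
single fetch operation.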

While I tend to view caches as a necessary evil, a truly optimized caching
scheme is a complex beast.  

Robert Cousins
Dept. Mgr, Workstation Dev't.
Data General Corp.

Speaking for myself alone.