Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!husc6!yale!mfci!colwell
From: colwell@mfci.UUCP (Robert Colwell)
Newsgroups: comp.arch
Subject: Re: memory system design
Message-ID: <483@m3.mfci.UUCP>
Date: 28 Jul 88 10:48:46 GMT
References: <5342@june.cs.washington.edu> <76700040@p.cs.uiuc.edu>
Sender: root@mfci.UUCP
Reply-To: colwell@mfci.UUCP (Robert Colwell)
Organization: Multiflow Computer Inc., Branford Ct. 06405
Lines: 54

In article <76700040@p.cs.uiuc.edu> gillies@p.cs.uiuc.edu writes:
>
>In my undergrad systems course we learned to optimize a multi-level
>memory design for speed, given a constant number of $$$.  We used:
>
>1.  Paper & pencil
>2.  Simple model of caching/paging hit ratio versus cache size (often
>    some given linear function  hitRate = f(memorySize)).
>3.  The price/performance of various kinds of memory, for each kind:
>    a.  $/K
>    b.  access time
>
>For a 2-level memory system (main memory, cache), you could plot a
>2-dimensional curve (main memory size versus cache size), then derive
>the highest performance point on the curve.
>
>Of course, this analysis is impossible if you don't know your
>instruction mix and software paging patterns.  And if the customer
>wants to expand main memory, he should probably expand the cache at
>the same time (I think this is uncommon).  So I doubt many companies
>pay attention to this analysis -- maybe it's mostly academic.
>
>My point is that it's an optimization problem, which if
>oversimplified, can even be handled with paper & pencil.  If not, then
>it can probably be solved by nonlinear optimization methods.

You can solve anything if you oversimplify it enough.  I doubt
you'd EVER get anything useful out of this approach in the real
world.  You have to:

  1) assume you know who your users will be
  2) assume their various workloads (Ha!  No Way!)
  3) assume things about how your compiler will improve (or change)
  4) assume various things about what RAMs will be available in the
     time frame you want (static, dynamic, speed, cycle time, setup
     and hold times, pinouts, package sizes, price, power, and
     availability)
  5) assume things about how you're going to cool this pile of chips

And if you get any of these wrong you've lost the game in a big way.
Usually the best you can do along these lines is apply your suggested
analysis to the machine you're currently shipping, take some
workloads that you think are representative, and see how close to
optimal you got there.  But it's no easy trick, and you're still left
wondering how to extrapolate the answer to your next machine.
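
A minimal sketch of that exercise in Python, assuming a three-level
hierarchy (cache, main memory, paging disk) and the linear hit-rate
model from the quoted article.  Every number in it (prices, access
times, hit-rate slopes, the 32K shipped cache, the two workloads) is
hypothetical and picked only to make the sketch run; none of it comes
from a real machine.

```python
# Hypothetical model: all prices, timings and hit-rate slopes are
# invented for illustration, not measured on any real system.

def linear_hit_rate(slope_per_kb, ceiling):
    """Return f(size_kb) = min(ceiling, slope_per_kb * size_kb)."""
    def f(size_kb):
        return min(ceiling, slope_per_kb * size_kb)
    return f

def access_time_ns(cache_kb, main_kb, cache_hit, main_hit,
                   t_cache=50.0, t_main=400.0, t_disk=30_000_000.0):
    """Average access time for cache -> main memory -> disk."""
    hc = cache_hit(cache_kb)
    hm = main_hit(main_kb)
    # Cost of a cache miss: served by main memory or by a page fault.
    miss_cost = hm * t_main + (1.0 - hm) * t_disk
    return hc * t_cache + (1.0 - hc) * miss_cost

def main_kb_for(cache_kb, budget, cost_cache_kb, cost_main_kb):
    """Spend whatever the cache leaves of the budget on main memory."""
    return (budget - cache_kb * cost_cache_kb) / cost_main_kb

def best_split(budget, cost_cache_kb, cost_main_kb,
               cache_hit, main_hit, step_kb=1):
    """Sweep cache sizes at fixed budget; return (time, cache, main)."""
    best = None
    cache_kb = 0
    while cache_kb * cost_cache_kb <= budget:
        main_kb = main_kb_for(cache_kb, budget,
                              cost_cache_kb, cost_main_kb)
        t = access_time_ns(cache_kb, main_kb, cache_hit, main_hit)
        if best is None or t < best[0]:
            best = (t, cache_kb, main_kb)
        cache_kb += step_kb
    return best

BUDGET, COST_CACHE, COST_MAIN = 10_000, 20, 1   # dollars, $/K, $/K
SHIPPED_CACHE_KB = 32                           # hypothetical machine

# Two guesses at "representative" workloads: (cache model, paging model).
WORKLOADS = {
    "small working set": (linear_hit_rate(0.06, 0.95),
                          linear_hit_rate(0.001, 0.9999)),
    "large working set": (linear_hit_rate(0.005, 0.95),
                          linear_hit_rate(0.000125, 0.9999)),
}

def slowdown_vs_optimum(cache_hit, main_hit):
    """Ratio of the shipped machine's access time to the optimum."""
    t_opt, _, _ = best_split(BUDGET, COST_CACHE, COST_MAIN,
                             cache_hit, main_hit)
    main_kb = main_kb_for(SHIPPED_CACHE_KB, BUDGET,
                          COST_CACHE, COST_MAIN)
    t_ship = access_time_ns(SHIPPED_CACHE_KB, main_kb,
                            cache_hit, main_hit)
    return t_ship / t_opt

for name, (cache_hit, main_hit) in WORKLOADS.items():
    _, c_opt, m_opt = best_split(BUDGET, COST_CACHE, COST_MAIN,
                                 cache_hit, main_hit)
    ratio = slowdown_vs_optimum(cache_hit, main_hit)
    print(f"{name}: best cache {c_opt}K, main {m_opt:.0f}K, "
          f"shipped config is {ratio:.2f}x the optimum time")
```

With these made-up numbers the 32K cache matches the optimum for the
small working set but is about 1.7 times slower than the optimum for
the large one, and the two workloads disagree on the best cache size
by a factor of six.  That is the extrapolation problem in miniature.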

Perhaps the Cray-2 is worth pondering in this regard...57-cycle
latency, is it?


Bob Colwell            mfci!colwell@uunet.uucp
Multiflow Computer
175 N. Main St.
Branford, CT 06405     203-488-6090