Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!cs.utexas.edu!uunet!portal!cup.portal.com!mmm
From: mmm@cup.portal.com (Mark Robert Thorson)
Newsgroups: comp.arch
Subject: Re: flexible caches
Message-ID: <22211@cup.portal.com>
Date: 16 Sep 89 18:51:26 GMT
References: <224@qusunr.queensu.CA> <22151@cup.portal.com> <2115@munnari.oz.au>
Organization: The Portal System (TM)
Lines: 47

ok@cs.mu.oz.au (Richard O'Keefe) says:

> This is a joke, right? I mean, neural nets typically require THOUSANDS
> of training runs to learn even the simplest things. In this case, each
> training run corresponds to a complete execution of the program. If you
> don't believe that neural nets learn slowly, get a copy of PDP vol 3 and
> play with the programs in the enclosed floppies. ["slowly" in the sense
> of requiring many trials. Obviously those programs could be a lot faster.]

The speed of learning depends on the model being used. Would you have
agreed with my statement if I had used the term "advanced statistical
techniques" rather than the emotion-laden "neural networks"? It seems
many people are resistant to the use of neural networks whenever they can
see another way to solve the problem, much like the situation when
microprogramming was first introduced. (Many people continued to prefer
hardwired implementations.)

Also, note that many programs provide the opportunity to gather statistics
over thousands of runs. A display list interpreter for computer animation
might get called 60 times a second. 1000 runs would take less than 17
seconds.

I have received criticism by e-mail that cache prediction is a purely
static problem, to be solved entirely by compiler cleverness. I'm not
sure I can agree with this. The compiler can see the instruction space,
but it has no way to anticipate the kind of data which might be thrown
at it. Nor can it anticipate how often the different modes of the
program might get exercised.

For example, if the program maintains some sort of tree data structure,
the order in which data arrives could affect how many times that tree
needs to be restructured. This could in turn have consequences for the
cache. Normally, you might want to cache the tree for speed, but under
conditions of frequent restructuring you might want the tree to be
uncached so that more important stuff won't be pushed out of the cache.
In this case, a statistical algorithm which incorporates a learning
model could anticipate what sort of data is going to come in, and make
preparations for the most efficient way to handle it.

As another example, a CAD program might have several
computation-intensive functions which are called in response to user
actions. Depending upon which functions are frequently called, the
effectiveness of caching different parts of the memory space could vary
widely. If the user is doing something which causes frequent screen
updates, caching the display list might be appropriate. If the user is
performing many simulation runs, caching the description of the model
might be the right thing to do. In this case, a learning model would
recognize what kind of work the user is doing and adjust its predictions
of his memory usage accordingly.
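
To make that last example concrete, here's a minimal sketch in C of what
the bookkeeping might look like. Every name in it is invented for
illustration; prefer_caching_display_list() and prefer_caching_model()
stand in for whatever knobs the memory system actually exposes:

    enum activity { DRAWING, SIMULATING, NACTIVITIES };

    static unsigned long calls[NACTIVITIES];

    extern void prefer_caching_display_list(void);  /* hypothetical hooks */
    extern void prefer_caching_model(void);

    void note_call(enum activity a)   /* wrap each expensive entry point */
    {
        calls[a]++;
    }

    void adjust_policy(void)          /* call periodically */
    {
        if (calls[DRAWING] > calls[SIMULATING])
            prefer_caching_display_list();  /* frequent screen updates */
        else
            prefer_caching_model();         /* many simulation runs */
        calls[DRAWING] = calls[SIMULATING] = 0;
    }

The point is just that a couple of counters per entry point are enough
to classify what the user is doing at the moment.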
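
The tree example could work the same way: count restructurings per
interval, and only keep the tree cacheable while it's relatively stable.
Again, mark_tree_cacheable() is a made-up hook and the threshold is a
guess:

    #define THRESHOLD 32   /* restructures per interval before we give up */

    extern void mark_tree_cacheable(int on);   /* hypothetical hook */

    static unsigned restructures;

    void note_restructure(void)   /* call from the rebalancing code */
    {
        restructures++;
    }

    void end_of_interval(void)    /* call once per run or per time slice */
    {
        mark_tree_cacheable(restructures < THRESHOLD);
        restructures = 0;
    }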
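
And the "advanced statistical techniques" needn't be anything exotic to
start with. A running miss-rate estimate per region of memory, updated
once per run of (say) the display list interpreter, is already a simple
learning model. The per-region counters below are hypothetical hardware,
and the 0.5 cutoff is arbitrary:

    #define NREGIONS 16
    #define ALPHA    0.1   /* weight given to the most recent run */

    /* hypothetical hooks into the memory system */
    extern unsigned region_misses(int region);  /* misses since last call */
    extern unsigned region_refs(int region);    /* references since last call */
    extern void     set_region_cacheable(int region, int on);

    static double miss_rate[NREGIONS];  /* exponentially weighted estimate */

    void update_cache_policy(void)      /* call once per run */
    {
        int r;
        for (r = 0; r < NREGIONS; r++) {
            unsigned refs = region_refs(r);
            double observed;
            if (refs == 0)
                continue;
            observed = (double) region_misses(r) / refs;
            miss_rate[r] = (1.0 - ALPHA) * miss_rate[r] + ALPHA * observed;
            /* regions that keep missing hurt more than they help */
            set_region_cacheable(r, miss_rate[r] < 0.5);
        }
    }

With ALPHA around 0.1 the estimate mostly reflects the last couple dozen
runs, which is well under the thousands of trials Richard is worried
about.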