Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!cs.utexas.edu!uunet!portal!cup.portal.com!mmm
From: mmm@cup.portal.com (Mark Robert Thorson)
Newsgroups: comp.arch
Subject: Re: flexible caches
Message-ID: <22211@cup.portal.com>
Date: 16 Sep 89 18:51:26 GMT
References: <224@qusunr.queensu.CA> <22151@cup.portal.com> <2115@munnari.oz.au>
Organization: The Portal System (TM)
Lines: 47

ok@cs.mu.oz.au (Richard O'Keefe) says:

> This is a joke, right? I mean, neural nets typically require THOUSANDS
> of training runs to learn even the simplest things. In this case, each
> training run corresponds to a complete execution of the program. If you
> don't believe that neural nets learn slowly, get a copy of PDP vol 3 and
> play with the programs in the enclosed floppies. ["slowly" in the sense
> of requiring many trials. Obviously those programs could be a lot faster.]

The speed of learning depends on the model being used. Would you have
agreed with my statement if I had used the term "advanced statistical
techniques" rather than the emotion-laden "neural networks"? It seems
many people are resistant to the use of neural networks whenever they can
see another way to solve the problem, much like the situation when
microprogramming was first introduced. (Many people continued to prefer
hardwired implementations.)

Also, note that many programs provide the opportunity to gather statistics
over thousands of runs. A display list interpreter for computer animation
might get called 60 times a second. 1000 runs would take less than 17
seconds.

I have received criticism by e-mail that cache prediction is a purely
static problem, to be solved entirely by compiler cleverness. I'm not
sure I can agree with this. The compiler can see the instruction space,
but it has no way to anticipate the kind of data which might be thrown
at it. Nor can it anticipate how often the different modes of the
program might get exercised.

For example, if the program maintains some sort of tree data structure,
the order in which data arrives could affect how many times that tree
needs to be restructured. This could in turn have consequences for the
cache. Normally, you might want to cache the tree for speed, but under
conditions of frequent restructuring you might want the tree to be
uncached so that more important stuff won't be pushed out of the cache.
In this case, a statistical algorithm which incorporates a learning
model could anticipate what sort of data is going to come in, and make
preparations for the most efficient way to handle it.

As another example, a CAD program might have several
computation-intensive functions which are called in response to user
actions. Depending upon which functions are frequently called, the
effectiveness of caching different parts of the memory space could vary
widely. If the user is doing something which causes frequent screen
updates, caching the display list might be appropriate. If the user is
performing many simulation runs, caching the description of the model
might be the right thing to do. In this case, a learning model would
recognize what kind of work the user is doing and adjust its predictions
of his memory usage accordingly.
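
To make that last example concrete, here's a minimal sketch in C of what
the bookkeeping might look like. Every name in it is invented for
illustration; prefer_caching_display_list() and prefer_caching_model()
stand in for whatever knobs the memory system actually exposes:

    enum activity { DRAWING, SIMULATING, NACTIVITIES };

    static unsigned long calls[NACTIVITIES];

    extern void prefer_caching_display_list(void);  /* hypothetical hooks */
    extern void prefer_caching_model(void);

    void note_call(enum activity a)   /* wrap each expensive entry point */
    {
        calls[a]++;
    }

    void adjust_policy(void)          /* call periodically */
    {
        if (calls[DRAWING] > calls[SIMULATING])
            prefer_caching_display_list();  /* frequent screen updates */
        else
            prefer_caching_model();         /* many simulation runs */
        calls[DRAWING] = calls[SIMULATING] = 0;
    }

The point is just that a couple of counters per entry point are enough
to classify what the user is doing at the moment.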
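
The tree example could work the same way: count restructurings per
interval, and only keep the tree cacheable while it's relatively stable.
Again, mark_tree_cacheable() is a made-up hook and the threshold is a
guess:

    #define THRESHOLD 32   /* restructures per interval before we give up */

    extern void mark_tree_cacheable(int on);   /* hypothetical hook */

    static unsigned restructures;

    void note_restructure(void)   /* call from the rebalancing code */
    {
        restructures++;
    }

    void end_of_interval(void)    /* call once per run or per time slice */
    {
        mark_tree_cacheable(restructures < THRESHOLD);
        restructures = 0;
    }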
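
And the "advanced statistical techniques" needn't be anything exotic to
start with. A running miss-rate estimate per region of memory, updated
once per run of (say) the display list interpreter, is already a simple
learning model. The per-region counters below are hypothetical hardware,
and the 0.5 cutoff is arbitrary:

    #define NREGIONS 16
    #define ALPHA    0.1   /* weight given to the most recent run */

    /* hypothetical hooks into the memory system */
    extern unsigned region_misses(int region);  /* misses since last call */
    extern unsigned region_refs(int region);    /* references since last call */
    extern void     set_region_cacheable(int region, int on);

    static double miss_rate[NREGIONS];  /* exponentially weighted estimate */

    void update_cache_policy(void)      /* call once per run */
    {
        int r;
        for (r = 0; r < NREGIONS; r++) {
            unsigned refs = region_refs(r);
            double observed;
            if (refs == 0)
                continue;
            observed = (double) region_misses(r) / refs;
            miss_rate[r] = (1.0 - ALPHA) * miss_rate[r] + ALPHA * observed;
            /* regions that keep missing hurt more than they help */
            set_region_cacheable(r, miss_rate[r] < 0.5);
        }
    }

With ALPHA around 0.1 the estimate mostly reflects the last couple dozen
runs, which is well under the thousands of trials Richard is worried
about.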