Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!zaphod.mps.ohio-state.edu!swrinde!elroy.jpl.nasa.gov!sdd.hp.com!hplabs!hpfcso!maf
From: maf@hpfcso.FC.HP.COM (Mark Forsyth)
Newsgroups: comp.arch
Subject: Re: cache pre-load/no-load instructions
Message-ID: <8840020@hpfcso.FC.HP.COM>
Date: 21 Mar 91 21:44:05 GMT
References:
Organization: Hewlett-Packard, Fort Collins, CO, USA
Lines: 48

>From: jonathan@cs.pitt.edu (Jonathan Eunice)
>Message-ID:

>Two of the tweaks of the forthcoming "Snake" (HP-PA 1.1) systems from

These were presented as extensions to the PA-RISC architecture, which
is used in several product lines, NOT features of any particular
products.

>HP are:
>
>1) cache pre-load instructions (the compiler inserts these into the
>
>2) cache no-load hints as a part of store instructions (useful to avoid
>
>How effective are these optimizations likely to be?

Extremely effective at eliminating cache-miss bottlenecks in certain
intended cases.

1) is employed for applications which access very large uniform data
sets and perform a fair amount of manipulation or calculations on
individual data items (allowing enough time to prefetch the next group
of operands). In some cases, cache miss penalties can be COMPLETELY
eliminated from the performance equation.

2) is intended to be used primarily by the OS for page initialization,
block moves, etc.

>(While they aren't going
>to give the same kind of speedup as making the system super-scalar or
>super-pipelined, they strike me as effective tweaks.)

Comparing these features to pipeline implementations is apples to
oranges. Cache hints address classes of applications which are
dominated by memory system performance, whereas superscalar pipelines
improve primarily certain floating point applications dominated by the
pipeline CPI. In applications dominated by cache misses these give far
bigger performance improvements than a superscalar pipeline would.

>
>Does anyone else have them?  I seem to recall a posting to the effect that
>the RS/6000 POWER architecture does not.  What about MIPS, SPARC, etc?  Is
>this a me-too feature?

The features were defined as a result of extensive analysis of
bottlenecks in important customer applications, not imitation. I'm not
aware of any similar features in other architectures.

---
Mark Forsyth                     Hewlett-Packard
maf@hpesmaf.fc.hp.com            Engineering Systems Laboratory
                                 Fort Collins, Colorado
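
For readers who want a concrete picture of the pre-load case in 1), here is a minimal sketch in C. It is not PA-RISC code and not HP's compiler output: it uses the GCC/Clang builtin `__builtin_prefetch` as a stand-in for a cache pre-load instruction. The function name `sum_of_squares` and the constant `PREFETCH_DISTANCE` are hypothetical, and the right distance would depend on the miss latency and on how much work is done per element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead to request.
 * A real value would be chosen from the miss latency and the
 * amount of computation done per element. */
#define PREFETCH_DISTANCE 16

/* Walk a large uniform array, asking the cache for data a fixed
 * distance ahead of the element being worked on, so that the
 * fetch can overlap with the computation. */
double sum_of_squares(const double *data, size_t n)
{
    assert(data != NULL || n == 0);

    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* GCC/Clang builtin: rw = 0 (read), locality = 0
             * (streaming data, little expected reuse). It is only
             * a hint and may compile to nothing on some targets. */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 0);
        }
        total += data[i] * data[i];
    }
    return total;
}
```

The prefetch is only a hint, so the result is the same with or without it; only the time spent waiting on cache misses can change. The benefit depends on there being enough work per element to cover the fetch, which is the condition stated in 1).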