Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!zaphod.mps.ohio-state.edu!swrinde!elroy.jpl.nasa.gov!sdd.hp.com!hplabs!hpfcso!maf
From: maf@hpfcso.FC.HP.COM (Mark Forsyth)
Newsgroups: comp.arch
Subject: Re: cache pre-load/no-load instructions
Message-ID: <8840020@hpfcso.FC.HP.COM>
Date: 21 Mar 91 21:44:05 GMT
References: <JONATHAN.91Mar17034438@speedy.cs.pitt.edu>
Organization: Hewlett-Packard, Fort Collins, CO, USA
Lines: 48


>From: jonathan@cs.pitt.edu (Jonathan Eunice)
>Message-ID: <JONATHAN.91Mar17034438@speedy.cs.pitt.edu>
>Two of the tweaks of the forthcoming "Snake" (HP-PA 1.1) systems from 

These were presented as extensions to the PA-RISC architecture, which is
used in several product lines, NOT as features of any particular product.

>HP are:
>
>1)  cache pre-load instructions (the compiler inserts these into the
>
>2) cache no-load hints as a part of store instructions (useful to avoid
>
>How effective are these optimizations likely to be?  

Extremely effective at eliminating cache-miss bottlenecks in the cases they
were designed for. 1) is employed for applications that access very large,
uniform data sets and perform a fair amount of manipulation or calculation
on each data item (allowing enough time to prefetch the next group of
operands). In some cases, cache-miss penalties can be COMPLETELY eliminated
from the performance equation. 2) is intended to be used primarily by the
OS for page initialization, block moves, etc.
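
To make case 1) concrete, here is a minimal sketch of the loop shape a
compiler might produce. It is NOT PA-RISC code: it assumes a GCC- or
Clang-style compiler and uses the __builtin_prefetch builtin as a portable
stand-in for a pre-load instruction. The names sum_transformed, transform,
and PREFETCH_DISTANCE are hypothetical, and the distance of 16 elements is
an arbitrary illustration, not a tuned value.

```c
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead to request. */
#define PREFETCH_DISTANCE 16

/* Stand-in for the per-item work that gives the prefetch time to finish. */
static double transform(double x)
{
    return x * x + 1.0;
}

/* Walk a large uniform array, requesting a later element while
 * working on the current one. */
double sum_transformed(const double *data, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__) || defined(__clang__)
        /* Assumption: the compiler builtin stands in for a pre-load
         * instruction. Arguments: address, 0 = read, 0 = low reuse. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 0);
#endif
        total += transform(data[i]);
    }
    return total;
}
```

The prefetch is only a hint, so the result is the same with or without it.
Whether it helps depends on the per-item work being long enough to cover
the memory latency.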

>(While they aren't going
>to give the same kind of speedup as making the system super-scalar or 
>super-pipelined, they strike me as effective tweaks.)  

Comparing these features to pipeline implementations is apples to oranges.
Cache hints address classes of applications that are dominated by memory-
system performance, whereas superscalar pipelines primarily improve
certain floating-point applications dominated by pipeline CPI. In
applications dominated by cache misses, these hints give far bigger
performance improvements than a superscalar pipeline would.
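
A rough way to see this is the usual first-order model: effective CPI is
pipeline CPI plus miss stall cycles per instruction. The sketch below just
does that arithmetic. The function names are hypothetical and the figures
in the usage note are made up for illustration; they are not measurements
of any machine.

```c
/* First-order model: effective CPI = pipeline CPI + stall cycles per
 * instruction caused by cache misses. */
double effective_cpi(double pipeline_cpi,
                     double misses_per_instr,
                     double miss_penalty_cycles)
{
    return pipeline_cpi + misses_per_instr * miss_penalty_cycles;
}

/* Speedup of a new design over an old one at the same clock rate. */
double speedup(double old_cpi, double new_cpi)
{
    return old_cpi / new_cpi;
}
```

With assumed values of 1.0 pipeline CPI, 0.05 misses per instruction, and a
20-cycle miss penalty, the baseline is 2.0 CPI. Halving the pipeline CPI
gives 1.5 CPI (about 1.33x), while hiding every miss gives 1.0 CPI (2x).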

>
>Does anyone else have them?  I seem to recall a posting to the effect that
>the RS/6000 POWER architecture does not.  What about MIPS, SPARC, etc?  Is
>this a me-too feature?

The features were defined as a result of extensive analysis of bottlenecks
in important customer applications, not imitation. I'm not aware of any
similar features in other architectures.

---
Mark Forsyth                        Hewlett-Packard
maf@hpesmaf.fc.hp.com               Engineering Systems Laboratory
                                    Fort Collins, Colorado