Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!ncar!hsdndev!rice!ariel.rice.edu!preston
From: preston@ariel.rice.edu (Preston Briggs)
Newsgroups: comp.arch
Subject: Re: Snake
Message-ID: <1991Mar27.175814.1450@rice.edu>
Date: 27 Mar 91 17:58:14 GMT
References: <69465@brunix.UUCP> <32580006@hpcuhe.cup.hp.com>
Sender: news@rice.edu (News)
Organization: Rice University, Houston
Lines: 43

linley@hpcuhe.cup.hp.com (Linley Gwennap) writes:

[good info about the new HP chips]

>eight "shadow" registers were added to provide quick context switching for
>the TLB miss handler.

I'm not sure I understand.  Could you expand slightly?

>conditional branches are
>executed with no delay if their outcome is predicted correctly, and with
>only a single cycle penalty otherwise.  The branch prediction algorithm,
>more advanced than America's, predicts forward branches to be untaken and
>backward branches taken, thus optimizing for loops.

The RS/6000 can rearrange loops so that there are no branch
delays (often with no branch cost at all).  That's hard to beat.
What happens with a fall-through and a forward branch?
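
For reference, the static rule as quoted (backward taken, forward
untaken, one cycle lost on a wrong guess) can be modelled in a few lines.
This is only a sketch built from the quoted description, not a statement
of how the hardware actually behaves; the function names and the
addresses are made up for illustration.

```python
def predict_taken(branch_pc, target_pc):
    """Static rule from the quoted text: backward -> taken, forward -> untaken."""
    return target_pc <= branch_pc


def penalty_cycles(branch_pc, target_pc, outcomes, mispredict_cost=1):
    """Total cycles lost over a sequence of actual outcomes (True = taken).

    mispredict_cost=1 reflects the quoted "single cycle penalty"; a correct
    prediction is assumed to cost nothing.
    """
    prediction = predict_taken(branch_pc, target_pc)
    return sum(mispredict_cost for taken in outcomes if taken != prediction)


# Loop-closing backward branch: taken 9 times, then falls through once.
loop_penalty = penalty_cycles(0x120, 0x100, [True] * 9 + [False])

# Forward branch that is taken on every one of 10 executions.
forward_penalty = penalty_cycles(0x100, 0x120, [True] * 10)

print(loop_penalty, forward_penalty)
```

Under this model the loop pays only on exit, while a forward branch that
is usually taken pays every time it executes.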

>in fact, the ratio of Integer SPECmarks to MHz for
>Snakes (65/66) actually exceeds America's (35/42).

Could you post results for individual SPEC programs (both int and float)?
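
Working out the quoted ratios, taking the figures exactly as given
(integer SPECmarks over MHz) and nothing else:

```python
# Figures as quoted above: (integer SPECmarks, clock in MHz).
snake_specmarks, snake_mhz = 65, 66
america_specmarks, america_mhz = 35, 42

snake_ratio = snake_specmarks / snake_mhz        # about 0.985 per MHz
america_ratio = america_specmarks / america_mhz  # about 0.833 per MHz

# How much higher the Snake figure is, per MHz.
relative = snake_ratio / america_ratio           # about 1.18

print(round(snake_ratio, 3), round(america_ratio, 3), round(relative, 2))
```

That is roughly 18% more integer SPECmarks per MHz, which says nothing
about how the individual programs break down.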

>The external caches are direct mapped and are protected by parity, making
>them slightly less robust than America's ECC cache.

I would have liked some set-associativity too.  (I'm very greedy.)

>will begin processing as soon as the critical word is obtained, reducing
>the miss penalty by as much as seven cycles.

What are the best- and worst-case D-cache miss times (say, without
writeback)?  What is the line length?  Will a cache miss freeze the CPU
or just lock the target register?
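
To show why the line length matters here, a toy model of the saving from
resuming on the critical word. The numbers are assumptions, not facts
about the chip: a line is taken to arrive as 8 transfers, one per cycle,
chosen only because that makes the arithmetic match "as much as seven
cycles". The function name is hypothetical.

```python
def cycles_saved(transfers_per_line, critical_arrival_index=0):
    """Cycles saved by resuming when the critical word arrives.

    Assumes one transfer per cycle (an assumption, not from the post).
    critical_arrival_index is the position of the needed word in the
    arrival order: 0 means it arrives first.
    """
    wait_for_whole_line = transfers_per_line
    wait_for_critical_word = critical_arrival_index + 1
    return wait_for_whole_line - wait_for_critical_word


# Hypothetical 8-transfer line.
best_case = cycles_saved(8, critical_arrival_index=0)   # needed word arrives first
worst_case = cycles_saved(8, critical_arrival_index=7)  # needed word arrives last

print(best_case, worst_case)
```

If the needed word is always fetched first, the saving in this model is
always the best case; if the line arrives in fixed order, it depends on
where in the line the miss falls.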

>The I- and D-TLBs are fully associative

Hooray!

Thanks for the information.  Thanks also for the references.

Preston Briggs