Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!swrinde!cs.utexas.edu!asuvax!ncar!gatech!hubcap!mark
From: mark@hubcap.clemson.edu (Mark Smotherman)
Newsgroups: comp.arch
Subject: Re: SPARC implementation or architecture
Message-ID: <1991Apr18.162205.20529@hubcap.clemson.edu>
Date: 18 Apr 91 16:22:05 GMT
References: <1991Apr17.183822.7681@elroy.jpl.nasa.gov>
Organization: Clemson University
Lines: 51

From article <1991Apr17.183822.7681@elroy.jpl.nasa.gov>, by david@elroy.jpl.nasa.gov (David Robinson):
> Has anyone compared why SPARC tends to run slower at the same clock
> speed as other RISC chips?

As Michael Slater points out in the most recent Microprocessor Report
(p. 12, vol. 5, no. 6, April 3, 1991), the current SPARC implementations
exhibit lower SPECx/MHz in part because they use a unified I/D cache.
The competing implementations from MIPS, HP, and IBM have split caches.
Also, the early SPARCstations used only a single 4-byte write buffer.
The SS2 design seems to address the write buffer problem but not the
unified cache. (Maybe the SPEC configuration parameters should include
#write buffers and presence or absence of a cache refill buffer and
store back buffer.)

One possible explanation for the less aggressive memory system design
seen in SPARC implementations is a reliance on register windows for
performance. John Hennessy, in the Oct. 1989 IEEE video seminar on RISC
processor design, noted that the register window approach was thought to
substantially lower the load/store traffic (for integers), so that SPARC
designs could tolerate simplified (i.e., slower) caches. However,
Hennessy also noted that SPARC register windows do not help FP
load/stores.

An interesting architectural comparison between SPARC and MIPS was given
by Sun folks at ASPLOS-IV: R.F. Cmelik, et al., "An analysis of MIPS and
SPARC instruction set utilization on the SPEC benchmarks," pp. 290-302.
(They concluded that SPARC had the advantage, but the MIPS folks were
quick to point out that they used current Sun compilers and year-old
MIPS compilers. It will be interesting to see how we chew over this
paper in comp.arch!)

The data presented in this paper showed the following MIPS/SPARC ratios
for memory traffic:

    int loads (in int benchmarks)   1.07
    int stores                      1.00
    FP loads (in FP benchmarks)     1.92  (i.e. MIPS did twice as many)
    FP stores                       2.49

They suggested that the integer ratios do not show the true value of
register windows since the dynamic procedure calling frequency of the
SPEC benchmarks is abnormally low (see p. 293). They also noted that
MIPS-I lacks dbl. prec. FP load/store but claimed that even disallowing
DP FP l/s in SPARC would _not_ significantly reduce the memory traffic
ratios; they attributed the large FP ratios to compiler technology.

MIPS has repeated the experiment with current compilers. Let's ask John
Mashey to post the new numbers and new ratios or to publish a follow-up
article in ACM Computer Architecture Newsletter.
-- 
Mark Smotherman, Comp. Sci. Dept., Clemson University, Clemson, SC 29634
INTERNET: mark@hubcap.clemson.edu    UUCP: gatech!hubcap!mark