Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!ll-xn!cit-vax!amdahl!nsc!curry
From: curry@nsc.UUCP (Ray Curry)
Newsgroups: net.arch
Subject: re:Floating point performance
Message-ID: <3833@nsc.UUCP>
Date: Wed, 8-Oct-86 16:53:37 EDT
Article-I.D.: nsc.3833
Posted: Wed Oct 8 16:53:37 1986
Date-Received: Thu, 9-Oct-86 03:33:33 EDT
Reply-To: curry@nsc.UUCP (Ray Curry)
Followup-To: <340@euroies.UUCP>
Distribution: net
Organization: National Semiconductor, Sunnyvale
Lines: 50

>Path: nsc!pyramid!decwrl!decvax!ucbvax!ucbcad!nike!lll-crg!seismo!mcvax!euroies!shepherd
>From: shepherd@euroies.UUCP (Roger Shepherd)
>Newsgroups: net.arch
>Subject: Floating point performance
>Message-ID: <340@euroies.UUCP>

>dislikes the NS 32310 (four chips); they seem to give the
>same MFLOP rating. (Does anyone have Whetstone figures for
>these two?)

>Comparisons against Weiteks (or whatever) are also somewhat
>suspect. To use their peak data rate you have to use them in
>pipelined mode, their scalar mode tends to be somewhat slower --
>Roger Shepherd
>INMOS Limited, 1000 Aztec West, Almondsbury, Bristol, BS12 4SQ, GB
>USENET: ...!euroies!shepherd
>PHONE: +44 454 616616

Just by coincidence, I have been running some floating point
benchmarks on the NS32081 floating point processor, and thought I
should respond with some more up-to-date numbers. I ran the single
precision Whetstone on the NS32032 and NS32081 at 10 MHz on the
DB32000 board, and on the NS32332 and NS32081 at 15 MHz on the DB332
board. I don't know where the posted 32032-32081 number came from,
but I measure better even using our older compilers. Our new
compilers show marked improvement.

    32032-32081 (10MHz)    189 Kwhets (old compiler)
    32032-32081 (10MHz)    390 Kwhets (new compiler)
    32332-32081 (15MHz)    728 Kwhets (new compiler)

I used the 32332-32081 numbers to generate instruction counts and
project worst case performance for the NS32310 and the NS32381. Worst
case here means using the identical math routines and minimizing the
pipelining of the 32310. These counts project performance for the
32332-32381 (15MHz) at approximately 1100-1200 Kwhets, and for the
32332-32310 (15MHz) at approximately 1500-1600 Kwhets. Since both the
32310 and the 32381 will have new instructions that will affect the
math libraries, the real performance could be higher. Just for
interest, preliminary analysis says pipelining should improve
performance by at least 15% overall (30% for the floating point
portion of the instruction mix).

I would like to add my own question about the value of benchmarks:
how do people on the net feel about transcendental functions? The
Whetstone seems to me to place more emphasis on them than real life
does. One of the reasons for not including them directly in the 32081
was that implementing them in math routines instead of hardware was
felt to be more cost effective. Is this true, or are transcendentals
important enough to justify the increased cost of implementing them
in hardware?
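
As a rough check on the figures quoted above, the ratios between them
can be worked out directly. The Python sketch below uses only the
numbers in this post. The variable names are illustrative, and the
per-clock comparison assumes performance scales linearly with clock
rate, which is a simplification rather than anything measured.

```python
# Measured single precision Whetstone results from the post, in Kwhets.
OLD_COMPILER_32032_10MHZ = 189
NEW_COMPILER_32032_10MHZ = 390
NEW_COMPILER_32332_15MHZ = 728

# Projected worst case ranges from the post, in Kwhets (low, high).
PROJECTED_32381 = (1100, 1200)
PROJECTED_32310 = (1500, 1600)

# Gain from the compiler alone: same CPU, FPU, and clock.
compiler_gain = NEW_COMPILER_32032_10MHZ / OLD_COMPILER_32032_10MHZ

# Gain from moving to the 32332 at 15 MHz, same compiler and FPU.
system_gain = NEW_COMPILER_32332_15MHZ / NEW_COMPILER_32032_10MHZ

# Assumption: linear scaling with clock, so divide out the 15/10 ratio
# to estimate how much of the gain is not explained by clock rate.
clock_ratio = 15 / 10
per_clock_gain = system_gain / clock_ratio

def relative_to_measured(projected_range, baseline=NEW_COMPILER_32332_15MHZ):
    """Return the projected (low, high) range as multiples of the baseline."""
    low, high = projected_range
    return (low / baseline, high / baseline)

gain_32381 = relative_to_measured(PROJECTED_32381)
gain_32310 = relative_to_measured(PROJECTED_32310)

if __name__ == "__main__":
    print(f"compiler gain:        {compiler_gain:.2f}x")
    print(f"32332 @ 15 MHz gain:  {system_gain:.2f}x")
    print(f"per-clock gain:       {per_clock_gain:.2f}x")
    print(f"32381 projection:     {gain_32381[0]:.2f}x to {gain_32381[1]:.2f}x")
    print(f"32310 projection:     {gain_32310[0]:.2f}x to {gain_32310[1]:.2f}x")
```

On these numbers the new compiler alone is worth a little over 2x, and
the projected 32310 range is roughly 2.1x to 2.2x the measured
32332-32081 result.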