Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!julius.cs.uiuc.edu!apple!amdcad!mozart.amd.com!cayman!richard
From: richard@cayman.amd.com (Richard Relph)
Newsgroups: comp.arch
Subject: Re: "Rumours" from BYTE - November 1990
Message-ID: <1990Nov15.165330.24387@mozart.amd.com>
Date: 15 Nov 90 16:53:30 GMT
References: <1990Nov13.160952.13856@mozart.amd.com> <6619@ethz.UUCP>
Sender: usenet@mozart.amd.com (Usenet News)
Organization: Advanced Micro Devices, Inc., Austin, Texas
Lines: 47

In article <6619@ethz.UUCP> ruehl@ethz.UUCP (Roland Ruehl) writes:
>In article <1990Nov13.160952.13856@mozart.amd.com>, richard@cayman.amd.com (Richard Relph) writes:
>> In article pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>> >On page 28: "AMD accelerates RISC line with FPU"
>> >------------------------------------------------
>> >The AMD 29050 has an embedded FPU claimed to have a peak speed of 80
>> >MFLOPS, with frequencies from 20Mhz to 40Mhz (two flops per cycle?).
>> Yes, that's right, two flops per cycle.
.........
>
> Even more interesting: when can one get a solid 29050 C compiler
> exploiting all these goodies ?

Both the MetaWare compiler and a GCC compiler have the capability to
generate the instructions that execute 2 flops. Both compilers are
"in testing" and are expected to be generally available this year.

Here's some sample code produced by one of the compilers:

float t1 (float *res1, float *v1, float *v2, float scale, float offset)
{
    int i;
    float accum = 0.0;
    float accum2 = 0.0;

    for (i = 0; i < 100; i++) {
        accum += v1[i] * v2[i];
        accum2 += v1[i] * (- v2[i]);
    }
    *res1 = accum2 * scale + offset;
    return accum;
}

_t1:
        sll     gr119,lr5,0
        const   gr116,0
        mtacc   gr116,1,3
        mtacc   gr116,1,0
        const   gr116,396
        add     gr118,lr4,gr116
L5:
        load    0,0,gr116,lr3
        load    0,0,gr117,lr4
        fmac    0,3,gr116,gr117
        fmac    1,0,gr116,gr117
        add     lr4,lr4,4
        cple    gr116,lr4,gr118
        jmpt    gr116,L5
        add     lr3,lr3,4
        fmsm    gr116,gr119,lr6
        mfacc   gr96,1,3
        jmpi    lr0
        store   0,0,gr116,lr2
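
For anyone who wants to check what t1 is supposed to compute before
comparing it with compiler output, here is a minimal sketch of a host-side
check in plain C. The helper name t1_selftest and the input values are
assumptions made for illustration; they are not part of the posted code or
of either compiler's test suite. The inputs are chosen so that every
intermediate value is exactly representable in single precision, so the
comparison should hold whether or not the target fuses the multiply and add.

```c
#include <assert.h>

/* t1 as posted above, unchanged. */
float t1 (float *res1, float *v1, float *v2, float scale, float offset)
{
    int i;
    float accum = 0.0;
    float accum2 = 0.0;

    for (i = 0; i < 100; i++) {
        accum += v1[i] * v2[i];          /* running dot product     */
        accum2 += v1[i] * (- v2[i]);     /* negated running product */
    }
    *res1 = accum2 * scale + offset;
    return accum;
}

/* Hypothetical helper (not from the post): feeds t1 constant vectors
 * and checks both results against hand-computed values. */
void t1_selftest (void)
{
    float v1[100], v2[100];
    float res1 = 0.0f;
    float accum;
    int i;

    for (i = 0; i < 100; i++) {
        v1[i] = 1.0f;
        v2[i] = 2.0f;
    }

    accum = t1 (&res1, v1, v2, 0.5f, 3.0f);

    /* accum  = 100 * (1 * 2)     =  200
     * accum2 = 100 * (1 * -2)    = -200
     * res1   = -200 * 0.5 + 3    =  -97 */
    assert (accum == 200.0f);
    assert (res1 == -97.0f);
}
```

Calling t1_selftest() from a main() on the host gives a reference to compare
against when the same source is built for the 29050. The two accumulation
statements in the loop appear to correspond to the two fmac instructions in
the listing, but that is a reading of the listing rather than something
confirmed against the instruction set manual.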