Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uwm.edu!linac!mp.cs.niu.edu!ux1.cso.uiuc.edu!news From: mcdonald@aries.scs.uiuc.edu (Doug McDonald) Newsgroups: comp.lang.c Subject: Re: Execution time bottleneck: How Message-ID: <1991Feb22.005402.21864@ux1.cso.uiuc.edu> Date: 21 Feb 91 12:30:51 GMT References: <492@tin.crc.ac.uk> <233522@<1991Feb12> <20900015@inmet> Sender: news@ux1.cso.uiuc.edu (News) Organization: University of Illinois at Urbana Lines: 50 In article <492@tin.crc.ac.uk> dcurtis@crc.ac.uk (Dr. David Curtis) writes: > >>In article <21658@yunexus.YorkU.CA> racine@yunexus.yorku.ca (Jeff Racine) writes: >>> for (j=1; j <= n; j++) { >>> sum = 0.0; >>> for (i=1; i<= n; i++) { >>> sum += exp( con * (x[i]-x[j]) * (x[i]-x[j]) ); >>> } >>> a[j] = k*sum; >>> } >>>This part of the algorithmn is a bottleneck. I want to make this as >>>quick as possible. What are my options? I have heard that assembler >>>would be faster. How much faster would assembler be? Would it be worth >>>the effort? Are there other alternatives that you can see? > > } > >Now what follows I'm less sure of. The main aim is to avoid calculating >array offsets and just increment pointers instead. Can anyone say if >that kind of thing produces significant speed-ups? > > float *ap,*xjp,*jp,*a1p,*x1p; > a1p=&a[1]; > x1p=&x[1]; > for (ap=a1p, xjp=x1p, j=n; j ; --j, ++ap, ++xjp) { > float xj; > xj=*xjp; > sum = 0.0; > for (xip=x1p, i=n; i; --i, ++xip) { > float xjfromi; > xfromj=*xip-xj; > sum += exp( con * xfromj * xfromj); > } > *ap = k*sum; > } > > Sometimes yes, sometimes no. I have tried benchmarking this sort of stuff extensively. However on many compilers an exp (a single one) is sufficiently sluggish to override the benefit. On the other hand --- on MOST compilers using two dimensional arrays (either a[i][j] or a[i+m*j] style) is sufficiently bad that the exp can't swamp it and using pointers is a substantial help. This kind of optimization is more common in Fortran compilers, but even there is not universal. Try it on your compiler and see. Doug McDonald