Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!nstn.ns.ca!news.cs.indiana.edu!samsung!usc!apple!netcom!teda!ditka!mcdchg!tellab5!balr!clrcom!rmartin From: rmartin@clear.com (Bob Martin) Newsgroups: comp.lang.c Subject: Re: Execution time bottleneck: How to speed up execution? Keywords: Execution time, would like to speed up. Alternatives? Message-ID: <1991Feb14.170400.8089@clear.com> Date: 14 Feb 91 17:04:00 GMT References: <21658@yunexus.YorkU.CA> Organization: Clear Communications, Inc. Lines: 83 >Execution time query: > > for (j=1; j <= n; j++) { > sum = 0.0; > for (i=1; i<= n; i++) { > sum += exp( con * (x[i]-x[j]) * (x[i]-x[j]) ); > } > a[j] = k*sum; > } > >This part of the algorithmn is a bottleneck. I want to make this as >quick as possible. What are my options? I have heard that assembler >would be faster. How much faster would assembler be? Would it be worth >the effort? Are there other alternatives that you can see? Any >improvement would be welcome. I am not a math whiz, I couldn't see any way to get the exp out of the inner loop. However there are a few things that can make the program run significantly faster... Assume that x runs from 0 - n-1 instead of 1 - n. double x[MAXSIZE], a[MAXSIZE]; register double *ip, *jp, *ap; double diff; register int j, i; int n = MAXSIZE; /* or whatever size you decide on */ for(j=n, jp=x, ap=a; j; j--, jp++, ap++) { sum = 0.0; for(i=n, ip=x; i; i--, ip++) { diff = *ip - *jp; sum += exp( con * diff * diff ); } *ap = sum; } Array subscripting is often slower than direct pointer dereferences. Also the subtraction of x[i]-x[j] need only be done once. Putting the pointers into registers may also help quite a bit. The changes I have made here are similar to the trade offs that an assembly language programmer would make. Further translating to assembly language may increase your performance yet again, I would guess that it would not be dramatically more effective than the above. ----- Here is another thought which may reduce the O complexity of the algorithm from n^2 to < (n^2+n)/2. (Although it will make the code a bit more complex) The matrix S[i][j] = exp(con * (x[i]-x[j])^2) is symetric about the i==j axis. S[i][j] == S[j][i]!!!. Also, the diagonal always evaluates to 1.!!! This means that you need calculate less than half of the S values. for (j=0; j