Path: utzoo!utgpu!utstat!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!pasteur!columbine.Berkeley.EDU!dean From: dean@columbine.Berkeley.EDU (R. Drew Dean) Newsgroups: comp.arch Subject: Re: Complex Instructions Message-ID: <13396@pasteur.Berkeley.EDU> Date: 7 May 89 06:22:48 GMT References: <57252@yale-celray.yale.UUCP> <4101@tolerant.UUCP> <134@dg.dg.com> <2253@pembina.UUCP> <1287@l.cc.purdue.edu> Sender: news@pasteur.Berkeley.EDU Reply-To: dean@columbine.Berkeley.EDU (R. Drew Dean) Organization: University of California at Berkeley Lines: 43 In article <1287@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >I agree. But semi-portable and close to efficient is not impossible. You >may have to abandon an algorithm completely on some machines. In some cases >the algorithm may have to be entirely abandoned. This is especially the case >in generating random variables, where it is the method which must be correct, >NOT the answer. The CRAY-1 is so bad I consider it a basket case for >vectorization. I would suggest connecting a mini, or even a micro, to >do this part of the job. The algorithm I would use on the 205 differs >from what I would use on the IBM 3090; the 205 algorithm could be implemented >on the 3090, but would be decidedly more costly. Implementing the 3090 >algorithm on the 205 would require breaking up the vector into short pieces, >which is required on vector register machines such as the 3090 and CRAY, but >is not required on vector stream machines such as the 205. But to me, these >considerations are immediate and obvious. > >-- >Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Can you please provide examples of the different algorithms you use on these different machines ? I refer you to the ACM Turing Lecture book, where I paraphrase from (I forget whose lecture this was, the book is in the library where I'm not, but if my memory hasn't dropped too many refresh cycles it's McCarthy's...) Look at generating Fibonacci numbers. The obvious algorithm F(n) = F(n-1) + F(n-2) generates a process with exponential running time, but this can be transformed into a process with linear running time...[new algorithm omitted] I don't care how great your "optimizing" compiler is -- a compiler which sees this transformation (reducing the order of growth), generating not necessarily perfect code, will trounce your compiler.... Note: Ok, this would be a really difficult compiler to write....I don't know of anybody doing it yet. But it's the principle that matters here: It's _not_ how micro-efficient your code is that matters. If it's too slow, rethink the algorithm. That's where the real difference is -- if you reduce the order of growth, wonderful things start happening. If your random variable generator uses an algorithm that executes in the least known time, then please forgive this posting.... Drew Dean Internet: dean@xcssun.berkeley.edu UUCP: ...!ucbvax!xcssun!dean FROM Disclaimers IMPORT StandardDisclaimer;