Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!shadooby!samsung!brutus.cs.uiuc.edu!apple!usc!ucla-cs!oahu!stephen
From: stephen@oahu.cs.ucla.edu (Steve Whitney)
Newsgroups: comp.sys.atari.st
Subject: Re: UNIX -- ATW Speed
Message-ID: <28655@shemp.CS.UCLA.EDU>
Date: 31 Oct 89 00:28:35 GMT
References: <545@nikhefh.nikhef.nl> <5188@cbnewsh.ATT.COM>
Sender: news@CS.UCLA.EDU
Reply-To: stephen@oahu.cs.ucla.edu (Steve Whitney)
Organization: UCLA Computer Science Department
Lines: 61

In article <5188@cbnewsh.ATT.COM> wolf@cbnewsh.ATT.COM (thomas.wolf,ho,) writes:

[performance numbers deleted]
~
~These claims are very misleading in that they imply that a system with 13
~transputers will actually run a program 1/5th as fast as a Cray (I'm assuming
~the above MIPS figures are correct.)  _Each_ T800 has a processing power
~of 10 MIPS (the upper limit at which instructions can be processed.)  So
~if you have a calculation-intensive application, say one that does ray-tracing,
~you can expect that peak to be reached.  If you ran this same program on
~a 13-transputer based ATW, I doubt whether you will see significant increases
~in performance, since that application will be run on a single T800 (unless
~it was specifically written with parallelism in mind

This is somewhat true.  Although Crays perform well on scalar code, their
real speed is only evident on vectorizable problems.  As on the Transputer,
applications must be specifically engineered to perform well.  It is true
that there are Fortran compilers which are _very_ good at vectorizing code,
and C compilers which can do the same are starting to appear, but since C
isn't designed for vectors, it's not as easy to do.

~ - using OCCAM(I guess that
~is the T800 assembly language?) - I don't think there are compilers smart
~enough to take conventional C programs and parallelize them.)

Occam is a high level language, but it is designed to be simple.
The name comes from Occam's razor, which was said to be able to separate
what was necessary from what was unnecessary.  Occam has constructs for
making parallelization easy, so it can use as many processors as are
available.  The application you mentioned (ray-tracing) happens to be an
extremely easily parallelizable problem (I've done it on a four processor
SGI Iris), so it might very well reach the peak performance of the T800s.

~I am not familiar as to how applications have to be written to run on a CRAY,
~but my guess is that you can use conventional C programming.  So if you took
~the above program, recompiled and ran it on a CRAY, you _would_ see significant
~speed improvements (if 100MIPS = 1/5 Cray, I guess the program would run
~at 500 MIPS?)
~

And again, if you compiled said application with a parallelizing compiler,
you would see vast improvement on a 13 processor transputer system.  It's
true that communication overhead will detract from the transputer's
performance and not the Cray's.  Take a look at CMU's iWarp processor.  It's
a good example of a machine designed to be good on its own and _better_ when
networked.  Its link speeds make the transputer look like a 6502.

[philosophical comments on benchmarks deleted]

~Tom
~
~--
~+---------------+-----------------------------+  I don't remember,
~| Tom Wolf      | Phone:  (201) 949-2079      |  I don't recall,
~| Bell Labs, NJ | E-mail: twolf@homxb.att.com |  I have no memory,
~+---------------+-----------------------------+  Of anything at all.  P. Gabriel

But then again, this probably belongs in comp.arch anyway...

Steve Whitney   "It's never _really_ the last minute"    (())_-_(())
UCLA Comp. Sci. Grad. Student                              | (* *) |
Internet: stephen@cs.ucla.edu               UCLA Bruin--> {  \_@_/  }
GEnie:    S.WHITNEY                                         `-----'
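
A minimal sketch of why ray tracing parallelizes so easily, offered as an illustration and not as anything from the original exchange: each pixel (and so each scanline) depends only on the scene and its own coordinates, so rows can be handed to separate workers with no communication between them. The one-sphere scene, the function names, and the worker count below are all hypothetical. Threads are used only to keep the sketch portable; on standard CPython they show the decomposition but not a real speedup, which would need separate processes or processors.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical scene: one sphere of radius 1 centred at (0, 0, 3),
# viewed by a camera at the origin looking down +z.
SPHERE_CENTER = (0.0, 0.0, 3.0)
SPHERE_RADIUS = 1.0


def hit_sphere(dx, dy, dz):
    """Return True if the ray from the origin along (dx, dy, dz) hits the sphere."""
    cx, cy, cz = SPHERE_CENTER
    # Solve |t*d - c|^2 = r^2 for t; a non-negative discriminant means a hit.
    a = dx * dx + dy * dy + dz * dz
    b = -2.0 * (dx * cx + dy * cy + dz * cz)
    c = cx * cx + cy * cy + cz * cz - SPHERE_RADIUS ** 2
    return b * b - 4.0 * a * c >= 0.0


def render_row(y, width, height):
    """Render one scanline. Uses no state shared with any other row."""
    v = 1.0 - (y + 0.5) / height * 2.0
    row = []
    for x in range(width):
        u = (x + 0.5) / width * 2.0 - 1.0
        row.append("#" if hit_sphere(u, v, 1.0) else ".")
    return "".join(row)


def render_serial(width, height):
    """Render every row on a single worker."""
    return [render_row(y, width, height) for y in range(height)]


def render_parallel(width, height, workers=4):
    """Render rows on a pool of workers; map() returns rows in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda y: render_row(y, width, height),
                             range(height)))


if __name__ == "__main__":
    for line in render_parallel(32, 16, workers=4):
        print(line)
```

The only coordination is collecting the finished rows, which is why this kind of workload suffers little from the communication overhead mentioned above.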