Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!usc!apple!ames!uhccux!munnari.oz.au!bruce!goanna!minyos!otto!athos
From: athos@otto.bf.rmit.oz.au (David Burren [Athos])
Newsgroups: comp.graphics
Subject: Mandelbrot/Julia optimizations (was Re: Excluding Mandelbrot set)
Message-ID: <601@otto.bf.rmit.oz.au>
Date: 1 Dec 89 00:52:42 GMT
References: <7106@ficc.uu.net> <3544@quanta.eng.ohio-state.edu> <234@xochitl.UUCP>
Organization: RMIT, Vic., Australia
Lines: 86

In article <234@xochitl.UUCP>, cheeks@edsr.eds.com (Mark Costlow) writes:
>In article <7106@ficc.uu.net>, peter@ficc.uu.net (Peter da Silva) writes:
>> One thing that can be done to speed up calculations of the mandelbrot set
>> is to use memory. After you have determined that a point is in the set,
>> you know that all the other points visited in determining that are in the
>> set as well, so you can mark them black. Similarly, if you determine a
>> point is not in the set you know that all the other points you visited
>> are not in the set as well, so you can mark them white (or whatever
>> colors you're using). This is a pretty safe optimisation.
>
> Hmmm ... maybe I'm missing something here, but that doesn't seem to be a
> valid algorithm. Since the Mandelbrot algorithm is Z(n+1) = Z(n)^2 + C,
> then each iteration for a given C defines a "path" through the plane,
> stopping at several (possibly infinite) discrete points. But, isn't the
> path different for each different C? I mean, just because you saw point P
> when iterating with C1, doesn't mean that if you see point P when
> iterating on C2 that the C2 iteration will behave identically to the C1
> iteration.

The term "path" seems to have misled you a little. The expression
Z(n+1) = Z(n)^2 + C says nothing about Z(n-1): if you reach point X from
any previous point, you WILL proceed to the same next point. Of course,
this is subject, like everything else, to rounding errors.
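That memorylessness is easiest to see in code. Here is a minimal escape-time sketch in C (the function and variable names are mine, not from any of the programs discussed here): each step reads only the current Z and C, never any earlier point of the orbit.

```c
/* Minimal escape-time iteration for Z(n+1) = Z(n)^2 + C.
 * Returns the iteration count at which |Z| exceeded 2, or `limit` if
 * the orbit never escaped (the point is presumed in the set).
 * Each step depends only on the current (zr, zi) and on (cr, ci);
 * the iteration has no memory of how it arrived there. */
static int dwell(double cr, double ci, int limit)
{
    double zr = 0.0, zi = 0.0;
    int n;
    for (n = 0; n < limit; n++) {
        double zr2 = zr * zr, zi2 = zi * zi;
        if (zr2 + zi2 > 4.0)          /* |Z|^2 > 4  <=>  |Z| > 2 */
            return n;
        zi = 2.0 * zr * zi + ci;      /* imaginary part of Z^2 + C */
        zr = zr2 - zi2 + cr;          /* real part of Z^2 + C */
    }
    return limit;
}
```

For example, dwell(0.0, 0.0, 256) returns 256 (the origin never escapes), while a point well outside the set such as C = 2+2i escapes after a single step.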
The many-to-few property of the expression follows from the fact that every
complex number has two square roots. However, the search time needed to
determine whether a point has been visited before makes Peter's scheme sound
rather daunting, though this would depend on the memory architecture.

My own work on blotting (several years ago someone here at RMIT compared a
Mandelbrot image to an ink blot, and the name stuck :-) has largely centred
on using parallel hardware and optimising the code for specific machines.
Things like cache make an enormous difference to performance. I try to avoid
memory accesses of any sort in the inner per-point loop, keeping data in
registers and instructions in cache. Machines with 8 FP registers can keep
all the data on-chip, although brain-dead compilers that refuse to optimise,
or that insist on keeping one register free for intermediate expression
values, often do their best to impede such programs.

People often seem to ignore the performance gains they can achieve with
relatively little effort. Playing with compiler options for a Pascal program
running on a Cyber 170 got the inner loop to sit (just) within the cache,
with a performance jump of close to an order of magnitude. Even algorithms
presented in books such as "The Science of Fractal Images" have (what seem
to me now) glaring inefficiencies, such as calculating parts of expressions
twice.

I've seen many fancy "optimizations" suggested over the years, but although
they reduce the "number of calculations", the "work per calculation" seems
to increase too much to make them worthwhile. Comments, anyone?

Call me a hacker, call me obsessive, but optimising for specific
architectures (which can take some time with some of the weirder ones) is
worth every hour of saved run-time. It takes the exploration of the
Mandelbrot and Julia sets out of the realm of batch jobs and into that of
user interaction.
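To make the redundant-expression point concrete, here is an illustrative sketch (my own, not code from the book or from my routines) of the most common case: the squares zr*zr and zi*zi are needed both for the escape test and for the update, and the naive form computes each of them twice per iteration.

```c
/* Naive inner loop: zr*zr and zi*zi are each computed twice per
 * iteration, once in the loop test and once again in the update. */
static int dwell_naive(double cr, double ci, int limit)
{
    double zr = 0.0, zi = 0.0;
    int n;
    for (n = 0; n < limit && zr * zr + zi * zi <= 4.0; n++) {
        double t = zr * zr - zi * zi + cr;   /* squares recomputed */
        zi = 2.0 * zr * zi + ci;
        zr = t;
    }
    return n;
}

/* Hoisted version: each square is computed exactly once and kept in a
 * local that a decent compiler will map to an FP register. */
static int dwell_hoisted(double cr, double ci, int limit)
{
    double zr = 0.0, zi = 0.0;
    double zr2 = 0.0, zi2 = 0.0;      /* squares of the current zr, zi */
    int n;
    for (n = 0; n < limit && zr2 + zi2 <= 4.0; n++) {
        zi = 2.0 * zr * zi + ci;      /* uses the old zr */
        zr = zr2 - zi2 + cr;          /* uses the saved squares */
        zr2 = zr * zr;                /* each square computed once */
        zi2 = zi * zi;
    }
    return n;
}
```

Both functions perform the same arithmetic on the same values in the same order, so they return identical dwell counts; the hoisted form simply does fewer multiplies per iteration, which matters when the loop runs hundreds of times per pixel.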
Even simple HLL code optimisations can make significant improvements. I
ended up with a set of C routines for Unix machines, with #define's for
various architectural features. The code is well optimised by standard C
compilers for HP-9000/300s, Sun-3s, Cyber-180s, VAXen, Multimaxes, and
Iris 4Ds. There are essentially two versions: one for machines with
two-register instructions, such as the Motorola and NatSemi FPUs, and one
for three-register-instruction beasts such as VAXen and MIPS boxes.

The Cyber-170 work produced a function written in COMPASS (assembly),
callable from Pascal, that calculates single points. The code fits in less
than half of the 170-760's cache and takes advantage of the multiple
functional units for parallelism. It takes just over two minutes to
calculate a 1280x1024 image of the full set (-2,-1.25 .. 0.5,1.25) with a
dwell limit of 256, using 60-bit FP numbers. This compares with several
hours for the older Pascal programs.

The reason the code was written to be optimal for a wide variety of
machines is that I have an unfinished distributed "fractal server" to
exploit the coarse-grained parallelism inherent in the Internet. I intend
to rewrite it to use RPC. I saw a mention in comp.sys.sun recently of an
RPC-based Mandelbrot program on the 1987 SUG tape. Can anyone here tell me
an easy way to get hold of it?

These programs have been untouched for several months and are in dire need
of cleaning up, which I intend to do very soon. Anyone interested in the
code can get it from me, although I'd prefer to work on it a bit more
first.

_______________________________________________________________________________
David Burren (Athos),                          ACSnet: athos@otto.rmit.oz
Survey Computing Consultants, Hawthorn, Vic.