Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!mips!ptimtc!nntp-server.caltech.edu!toddpw
From: toddpw@nntp-server.caltech.edu (Todd P. Whitesel)
Newsgroups: comp.sys.apple2
Subject: Re: Animation
Keywords: Animation
Message-ID: <1991Apr17.095436.10764@nntp-server.caltech.edu>
Date: 17 Apr 91 09:54:36 GMT
References: <1991Apr17.061057.22357@cs.uow.edu.au>
Organization: California Institute of Technology, Pasadena
Lines: 72

u9050728@cs.uow.edu.au (Shane Kelvin Richards) writes:

[ stuff deleted ]

>	My question is, are my basic ideas/technique correct? Am I using
>the wrong method for fast shape manipulation? OR am I using the correct
>method and I should just try to improve upon my code and optimise where
>I can?

Your method is reasonable, but the time-wasters are pretty obvious. Read on.

>	For simplicity I only let shapes move by 2 pixels so that they
>always fall on a byte boundary. Also, I am using the 320x200 resolution.

Time-waster #1: loops. If you are looping through the picture data and the
mask then you are spending a non-trivial amount of time in the loop overhead.
The fix is to unroll the loop. Unrolling consists of coding a long string of
instructions with the offsets hardcoded as the addresses; the index
register(s) are used to hold the low word of the data address. You can do
truly evil things this way if you map the SHR buffer to the stack (better
disable interrupts temporarily though!):

	lda 0,x         ;dp points to object location on screen
	and |0,y        ;DBR/Y points to mask
	ora |$1000,y    ;suppose the image is 4K past the mask
	sta 0,x
	lda 1,x         ;next screen byte (8-bit accumulator assumed)
	and |1,y        ;next mask byte
	ora |$1001,y    ;next image byte
	sta 1,x
	...

Note that the above example does assume the mask and image start at a fixed
distance from each other. It is a speed vs. memory tradeoff.

Time-waster #2: rectangular objects. Depending on the types of objects you
want to animate, it may actually help to pack the image and its mask so that
dead space in the object rectangle is replaced by offset/length values
for each line of the object.
This is almost always a win.

Time-waster #3: the mask itself. If you can afford to let the mask be per
byte and not per pixel, you can get even more speed but at real memory
expense -- you hardcompile each object into code that draws it by simply
storing it (using the index w/ hardcoded offset technique from above). If you
want EVEN MORE speed you can use the stack to push bytes directly onto the
picture (this looks sick but is actually pretty easy to do once you know
what's involved). What's cool about stack-romping is that you can push
arbitrary words with PEAs, repeat values and one-byte values with
pha/phx/phy, and skip bytes with an sbc #xxxx; tcs sequence (if you let A
accumulate the hops, that is -- a simple way to do this would be to pass the
location of the object as the byte address of its last byte, so the object
draw code can start with a tcs).

The major drawback here is that you have hardcompiled code PER OBJECT -- I
haven't tried to do this yet but I suspect that the code is about as large as
the image & mask data, so you are losing a bit of mask resolution but not
much else.

Time-waster #4: the shadowing itself. If you are going to be drawing over
objects a lot then you should turn off shadowing while you are drawing the
scene, then turn it back on and do a single romp copy of the bank 1 SHR
buffer onto itself -- this can be done by remapping memory, the stack & dp,
and issuing a series of

	pei $fe
	pei $fc
	...
	pei $2
	pei $0

and hopping the dp register after each page.

I am not positive but I strongly suspect that both #3 and #4 are used by the
FTA Space Harrier demo.

Todd Whitesel
toddpw @ tybalt.caltech.edu
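
For anyone who wants to experiment with the offset/length packing from
time-waster #2 before committing to assembly, here is a rough model in
Python rather than 65816 code. It is only a sketch under assumptions that
are not in the post above: the names pack_row, pack_sprite and draw_packed
are made up for illustration, the screen is a flat list with one integer
per byte, and a mask byte of $FF is taken to mean "leave the background
alone" (which matches the and/ora sequence shown for time-waster #1).

```python
# Portable model only; hypothetical names, not a real toolbox or library API.
# Assumption: mask byte 0xFF keeps the background, 0x00 replaces it fully,
# and image bits are zero wherever the mask bits are one.

def pack_row(image_row, mask_row):
    """Trim fully transparent bytes from both ends of one sprite row.

    Returns (offset, image_bytes, mask_bytes); the span length is
    len(image_bytes). An all-transparent row packs to (0, [], []).
    """
    visible = [i for i, m in enumerate(mask_row) if m != 0xFF]
    if not visible:
        return (0, [], [])
    first, last = visible[0], visible[-1]
    return (first, image_row[first:last + 1], mask_row[first:last + 1])


def pack_sprite(image, mask):
    """Pack every row of a rectangular sprite into offset/length spans."""
    return [pack_row(i, m) for i, m in zip(image, mask)]


def draw_packed(screen, stride, x, y, packed):
    """Draw a packed sprite: AND the mask into the screen, then OR the image.

    screen is a flat list of byte values, stride is bytes per scan line,
    and (x, y) is the byte column and scan line of the sprite's corner.
    """
    for row, (offset, image_bytes, mask_bytes) in enumerate(packed):
        base = (y + row) * stride + x + offset
        for i, (img, msk) in enumerate(zip(image_bytes, mask_bytes)):
            screen[base + i] = (screen[base + i] & msk) | img
```

The inner loop touches only the bytes inside each span, so the dead space
at either end of a row costs nothing at draw time. A real routine would
presumably unroll that inner loop per object rather than iterate.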