Path: utzoo!attcan!uunet!mcvax!hp4nl!philmds!leo
From: leo@philmds.UUCP (Leo de Wit)
Newsgroups: comp.sys.atari.st
Subject: Re: ASSEMBLY MOVE/CLEAR/SET/COMPARE ROUTINES (was Clearing memory chain)
Summary: it depends
Message-ID: <609@philmds.UUCP>
Date: 22 Aug 88 06:53:58 GMT
References: <8808160343.AA10248@cory.Berkeley.EDU> <20383@watmath.waterloo.edu>
Reply-To: leo@philmds.UUCP (Leo de Wit)
Organization: Philips I&E DTS Eindhoven
Lines: 62

In article <20383@watmath.waterloo.edu> egisin@watmath.waterloo.edu (Eric Gisin) writes:
>Why do so many people think movem is faster than the
>more straight-forward loop of move.l's?

Because they may be right to do so 8-)

>copying 12 long words with "movem (a0)+,regs; movem regs,(a1); add.l Rn,a1"
> takes 242 cycles on the 68000,
>while 12 successive "move.l (a0)+, (a1)+" takes 240 cycles.
>(timings derived from Motorola's 68000 manual)

I dunno how you got your timings, but I got the following:

	movem.l (a0)+,d1-d7/a2-a6         112 cycles
	movem.l d1-d7/a2-a6,(a1)          108 cycles
	adda.l  d0,a1                       8 cycles
	----------------------------------------
	TOTAL                             228 cycles

	move.l  (a0)+,(a1)+                24 cycles
	----------------------------------------
	TOTAL (12 times)                  288 cycles

so that the movem.l construct gains you 60 cycles here: about 25% faster
(it seems you've done a bit of cycle stealing 8-). These timings were
obtained by ACTUALLY TIMING the instructions (repeating them a lot of
times and measuring the time taken). I've got a nice little program that
times a series of hexadecimal codes, anyone interested?

B.T.W. I use movem.l because movem.w only moves the lower words of the
regs (after having extended them). The adda.l may just as well be an
adda.w because the operand is sign extended anyway.

Note that timings on the Motorola cannot be derived from a manual,
unless you know what timing is involved with memory access in your
special case (an ST, now ain't that special). In this it differs from e.g.
the Z80 where times must be synchronous; the bus does not wait (I hope
my explanation isn't too bad, not being a hardware expert and so). On
the Motorola bus the driver is more polite 8-).

>when copying fewer words, move.l is even better.

Of course, i.e. less worse. The gain in the movem approach is just that
you do much less instruction fetching (see above: 3 instructions, be
they somewhat longer, against 12). No wonder movem.l is faster.

>the move.l approach does not require saving and restoring all
>your registers, and can be coded in C with decent compilers.

The overhead of saving and restoring registers is very little, because
we can use the fast movem.l instructions for that 8-)!! No, seriously,
if you move a large chunk, this overhead is quickly gained back. And of
course, it can be coded in assembler with decent assemblers. As it can
be in C with decent compilers (although this involves either #asm's or
executing data).

	Leo.

P.S. Some people may wonder why my smiley is most of the time a 8-).
This is simply because I wear glasses 8-). And if not so, I forgot to
put 'em on, or I'm just cleaning them :-). *-) Oops, seems I broke'em :-(
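
For anyone who wants to redo the sums on the timings above, here is a
minimal Python sketch. It only repeats the arithmetic: the cycle counts
are the measured figures quoted in this post, the constant and function
names are made up for illustration, and nothing here times real hardware.

```python
# Cycle counts as measured on an Atari ST in the post (not manual figures).
MOVEM_LOAD = 112      # movem.l (a0)+,d1-d7/a2-a6
MOVEM_STORE = 108     # movem.l d1-d7/a2-a6,(a1)
ADDA = 8              # adda.l  d0,a1
MOVE_L = 24           # move.l  (a0)+,(a1)+
LONGS_PER_BLOCK = 12  # twelve registers, i.e. 48 bytes per block


def movem_cycles(blocks=1):
    """Cycles to copy `blocks` 12-longword blocks with the movem.l sequence."""
    return blocks * (MOVEM_LOAD + MOVEM_STORE + ADDA)


def move_cycles(blocks=1):
    """Cycles to copy the same amount with successive move.l instructions."""
    return blocks * LONGS_PER_BLOCK * MOVE_L


def speedup_percent():
    """How much faster movem.l is, relative to the movem.l total."""
    saved = move_cycles() - movem_cycles()
    return 100.0 * saved / movem_cycles()


if __name__ == "__main__":
    print("movem.l:", movem_cycles(), "cycles")
    print("move.l: ", move_cycles(), "cycles")
    print("saved:  ", move_cycles() - movem_cycles(), "cycles per block")
    print("speedup: %.1f%%" % speedup_percent())
```

Taking the movem.l total as the base, the gain works out to a bit over
26%, in line with the "about 25%" quoted above. Register save/restore
overhead is left out on purpose, since the post gives no figures for it.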