Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!unix.cis.pitt.edu!dsinc!bagate!cbmvax!jesup
From: jesup@cbmvax.commodore.com (Randell Jesup)
Newsgroups: comp.sys.amiga.programmer
Subject: Re: New life for MOVEM!
Message-ID: <19519@cbmvax.commodore.com>
Date: 5 Mar 91 07:50:31 GMT
References: <1991Feb11.160212.7749@vax1.tcd.ie> <1991Mar2.042511.7894@vax1.tcd.ie>
Reply-To: jesup@cbmvax.commodore.com (Randell Jesup)
Organization: Commodore, West Chester, PA
Lines: 41

In article <1991Mar2.042511.7894@vax1.tcd.ie> hughesmp@vax1.tcd.ie writes:
>In article , dej@qpoint.amiga.ocunix.on.ca (David Jones) writes:
>> Ya. Save yourself some code. Check out CopyMem() in exec.library
>> (V33 or greater). Disassemble it. Essentially, it is the above code.
>
>Hey cmon man, he doesn't want to hear about supplied software. Often you
>find stuff written by someone else, particularly the OS, sucks. You want
...
>will be able to use it. Not just people with V33 or greater, whatever
>that is.

V33 is 1.2. Anyone who is running anything earlier than 1.2 deserves
10 lashes with a wet noodle (since 1.0 and 1.1 were only available on
A1000's, and they can upgrade in a snap - almost all modern stuff requires
1.2).

>waste your money on a bigger chip in the series? Don't say find out about
>the OS, because it is a heap of it. You want _real_optimisation_ for the
>specific problem, for which some general ideas may help. Movem is one. The
>OS is not. Matt Dillon's program is very nice, coping with non-word
>boundaries and everything, but if you want _everything_ out of the machine,
>forget those checks. Align your data, and use the plain movems. Shove the
>loop in a cupboard, and in-line the code.

Guess what: what you suggest is exactly what's in the OS. There's
CopyMem(), for non-aligned data (ala matt's), and CopyMemQuick(), for
aligned data.
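
As a rough illustration of the split between a general copy and an
aligned-only copy, here is a portable C sketch. It is not the exec.library
code. The names copy_quick and copy_general are hypothetical, the argument
order (source, dest, size) is chosen to match CopyMem(), and having the
general routine hand aligned requests to the quick one is an assumption
made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical aligned-only copy: the caller guarantees that src and dst
   are 4-byte aligned and that size is a multiple of 4.  No checks here;
   skipping them is the whole point of the "quick" variant. */
void copy_quick(const void *src, void *dst, size_t size)
{
    const uint32_t *s = src;
    uint32_t *d = dst;
    size_t words = size / 4;

    while (words--)
        *d++ = *s++;            /* one longword per iteration */
}

/* Hypothetical general copy: accepts any alignment and any size.
   (Assumption: aligned requests are forwarded to copy_quick.) */
void copy_general(const void *src, void *dst, size_t size)
{
    const unsigned char *s = src;
    unsigned char *d = dst;

    /* If both pointers and the size are all multiples of 4, take the
       fast path. */
    if ((((uintptr_t)src | (uintptr_t)dst | (uintptr_t)size) & 3u) == 0) {
        copy_quick(src, dst, size);
        return;
    }

    while (size--)
        *d++ = *s++;            /* byte-at-a-time fallback */
}
```

A caller that already knows its buffers are longword-aligned would call
copy_quick directly and skip the alignment test; everyone else calls
copy_general and pays for one check per call, not per byte.
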
The OS can't inline the code, but if you're transferring enough data for
movem-loops to make a difference, the cycles for a single subroutine call
to start it are WAY down in the noise (plus you win in that on a chip-only
machine, ROM access can be far faster than ram access, depending on video
mode). And if you happen to run your code on 2.0 with an '020 or better,
suddenly your copies get even quicker, since we have separate copy loops
for different processors.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup
The compiler runs
Like a swift-flowing river
I wait in silence.  (From "The Zen of Programming")  ;-)