Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!utcsri!greg
From: greg@utcsri.UUCP
Newsgroups: comp.arch,comp.lang.c
Subject: Re: String Handling ( really fixed-length copy ).
Message-ID: <4558@utcsri.UUCP>
Date: Sun, 12-Apr-87 12:44:11 EST
Article-I.D.: utcsri.4558
Posted: Sun Apr 12 12:44:11 1987
Date-Received: Sun, 12-Apr-87 17:35:27 EST
References: <15292@amdcad.UUCP> <7897@utzoo.UUCP>
Reply-To: greg@utcsri.UUCP (Gregory Smith)
Organization: CSRI, University of Toronto
Lines: 27
Xref: utgpu comp.arch:837 comp.lang.c:1563
Summary: fixed-size block copy hack.

This string op-stuff gave me an idea. A run-time library could contain
a function called 'mov200words' looking like this :

mov200words:
	mov	(a0)+,(a1)+
	mov	(a0)+,(a1)+
	.....			200 mov's in all
	mov	(a0)+,(a1)+
	rts

Then, if, say, a 64-word struct needed to be copied, the compiler would
get the pointers and then call mov200words+(200-64)*2 [ or whatever ]
to do the copy. This would provide unrolled-loop speed with only one
loop unrolled in the whole executable. [ Call it more than once for
>200 words ].

Presumably this would be faster than a loop on a PDP-11 or a 68000, but
might lose on a machine with an instruction cache, that could run a
copy loop on-chip. A wizzo block copy instruction may or may not run
faster than the unrolled loop.

The only advantage I am claiming over other unrolled-loop techniques is
the almost complete lack of anything but payoff move operations in the
above, whilst avoiding large amounts of code whenever a copy is done.

Of course, this must have been done before :-)

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...
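
[ Illustration, not from the original article. The entry-point trick
above can be approximated in C. This is only a rough sketch: portable C
cannot call into the middle of another function, so a switch with
fall-through stands in for the computed address mov200words+(200-64)*2.
The names mov_words, copy_words and entry_offset are hypothetical, as is
the run length of 8 moves (the post uses 200). 16-bit words and a 2-byte
move instruction are assumptions taken from the post's "*2 [ or
whatever ]". ]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { UNROLL = 8 };   /* hypothetical run length; the post uses 200 */

/*
 * Copy n words, 0 <= n <= UNROLL, by entering a straight-line run of
 * moves part-way through. The switch plays the role of the computed
 * call address: a smaller n enters later and executes fewer moves.
 */
static void mov_words(uint16_t *dst, const uint16_t *src, size_t n)
{
    assert(n <= UNROLL);   /* larger counts are copy_words' job */
    switch (n) {
    case 8: *dst++ = *src++; /* fall through */
    case 7: *dst++ = *src++; /* fall through */
    case 6: *dst++ = *src++; /* fall through */
    case 5: *dst++ = *src++; /* fall through */
    case 4: *dst++ = *src++; /* fall through */
    case 3: *dst++ = *src++; /* fall through */
    case 2: *dst++ = *src++; /* fall through */
    case 1: *dst++ = *src++; /* fall through */
    case 0: break;
    }
}

/* "Call it more than once" for copies longer than the unrolled run. */
void copy_words(uint16_t *dst, const uint16_t *src, size_t n)
{
    while (n > UNROLL) {
        mov_words(dst, src, UNROLL);
        dst += UNROLL;
        src += UNROLL;
        n -= UNROLL;
    }
    mov_words(dst, src, n);
}

/*
 * Byte offset of the entry point from the start of the routine,
 * assuming every move instruction occupies insn_bytes bytes and
 * words <= total_moves.
 */
size_t entry_offset(size_t total_moves, size_t words, size_t insn_bytes)
{
    return (total_moves - words) * insn_bytes;
}
```

With the post's numbers, entry_offset(200, 64, 2) gives 272: skipping
the first 136 moves leaves exactly 64 before the rts. Whether a compiler
turns the switch into a jump table or something slower is up to the
compiler, so this shows the shape of the idea rather than its speed.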