Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!lll-crg!caip!meccts!dicome!mmm!umn-cs!ncs-med!starfire!ddb
From: ddb@starfire.UUCP (David Dyer-Bennet)
Newsgroups: net.arch
Subject: Re: What's RISC all about ... REALLY?
Message-ID: <258@starfire.UUCP>
Date: Thu, 24-Jul-86 01:02:35 EDT
Article-I.D.: starfire.258
Posted: Thu Jul 24 01:02:35 1986
Date-Received: Fri, 25-Jul-86 00:10:50 EDT
References: <475@elmgate.UUCP> <564@bdmrrr.UUCP> <526@mips.UUCP> <447@mcgill-vision.UUCP>
Organization: Terrabit Software
Lines: 33

> In article <526@mips.UUCP>, mash@mips.UUCP (John Mashey) writes:
> ....................................................  Now the shock; the
> unrolled version was *faster*.  I forget how much, but it was enough  to
> offset the extra code space required by far.
> 
>      Explanation anyone?  Especially someone who knows VAX/750 microcode
> details?
> -- 
> 					der Mouse
> 
Can't claim knowledge of 750 microcode details, but when I looked into some
cases I had conveniently to hand, I was rather interested to discover that in
MOST of them a complex instruction runs slower than doing the same work on the
same processor with simple instructions.  It's also true of the INDEX
instruction on the VAX, and of the LDB and string copy and edit instructions
on the PDP-10.  (Instructions that simply move adjacent storage usually run
fast, as in the VAX MOVC3 or the Intel rep/movsw.)
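
To make "the same work with simple instructions" concrete, here is a rough
sketch in C of what the VAX INDEX instruction computes, broken into the
separate steps a compiler could emit instead: two compares, an add and a
multiply.  The function name index_simple and its return convention are
made up for illustration.  The real instruction takes a subscript-range
trap rather than returning a flag, and this sketch simply leaves the result
untouched when the subscript is out of range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a real API: the work of VAX INDEX spelled out
 * as simple operations.  Returns false where the hardware would trap.
 * Assumes the operands are small enough that the arithmetic cannot
 * overflow a 32-bit signed integer. */
static bool index_simple(int32_t subscript, int32_t low, int32_t high,
                         int32_t size, int32_t index_in, int32_t *index_out)
{
    /* Bounds check: two compares and branches. */
    if (subscript < low || subscript > high)
        return false;

    /* Accumulate the index: one add, one multiply. */
    *index_out = (index_in + subscript) * size;
    return true;
}
```

Each of these steps maps onto an ordinary compare, add or multiply
instruction, which is the kind of sequence being compared against the
single complex instruction above.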

In the one case I could analyze at the microcode level, which may not be at
all typical, I finally decided that the problem was that all the fancy
provisions to make loops run fast, overlap things, etc., only worked above
the microcode level, so when you did complex stuff in microcode it didn't
get the assists.

Don't know if this is really a general rule; I haven't looked at this in
all that many architectures.

		-- David Dyer-Bennet
		Usenet:  ...ihnp4!umn-cs!starfire!ddb
		Fido: sysop of fido 14/341, (612) 721-8967
		Telephone: (612) 721-8800
		USmail: 4242 Minnehaha Ave S
			Mpls, MN 55406