Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site wicat.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!gamma!epsilon!zeta!sabre!petrus!bellcore!decvax!mcnc!philabs!cmcl2!seismo!utah-cs!utah-gr!uplhercules!wicat!mike
From: mike@wicat.UUCP (Mike Hibler)
Newsgroups: net.arch
Subject: WICAT DRYSTONE results and an observation
Message-ID: <144@wicat.UUCP>
Date: Wed, 18-Dec-85 12:04:05 EST
Article-I.D.: wicat.144
Posted: Wed Dec 18 12:04:05 1985
Date-Received: Sat, 21-Dec-85 06:29:53 EST
Reply-To: mike@wicat.UUCP (Mike Hibler)
Organization: WICAT Systems, Orem, Utah
Lines: 59


RESULTS (formatted to plug into the comments at the beginning of dry.c):
------------------------------------------------------------------------------
As is:

 * WICAT MB	68000-8Mhz	System V	WICAT C 4.1	 585	 731 ~
 * WICAT MB	68000-12.5Mhz	System V	WICAT C 4.1	1246	1537 ~
 * WICAT PB	68000-8Mhz	System V	WICAT C 4.1	 998	1226 ~
 * WICAT PB	68000-12.5Mhz	System V	WICAT C 4.1	1530	1898 ~

Using shorts in place of ints:

 * WICAT MB	68000-8Mhz	System V	WICAT C 4.1	 675	 853 S~
 * WICAT MB	68000-12.5Mhz	System V	WICAT C 4.1	1450	1814 S~
 * WICAT PB	68000-8Mhz	System V	WICAT C 4.1	1169	1464 S~
 * WICAT PB	68000-12.5Mhz	System V	WICAT C 4.1	1780	2233 S~

Notes:

 *   ~   For WICAT Systems: MB=MultiBus, PB=Proprietary Bus
------------------------------------------------------------------------------

The 8 MHz Multibus and all proprietary bus systems access memory across
the system bus; the 12.5 MHz Multibus CPU has on-board memory.

WICAT's System V is derived from UniSoft's Uniplus+ and AT&T's version
5.2 release 1.

The WICAT C compiler is derived from the original MIT 68000 compiler and
AT&T's PCC.


OBSERVATION (stop me if you have heard this before...):
------------------------------------------------------------------------------
Some "optimizers" may ruin the instruction mix that the program
attempts to achieve.  For example, in the following piece of code
from the main loop of Proc0:

	for (CharIndex = 'A'; CharIndex <= Char2Glob; ++CharIndex)
		if (EnumLoc == Func1(CharIndex, 'C'))
			Proc6(Ident1, &EnumLoc);
	IntLoc3 = IntLoc2 * IntLoc1;
+	IntLoc2 = IntLoc3 / IntLoc1;
+	IntLoc2 = 7 * (IntLoc3 - IntLoc2) - IntLoc1;
	Proc2(&IntLoc1);

our optimizer effectively removes the marked statements since IntLoc2
is "dead" (i.e. it is reset at the beginning of the loop and not used
after the loop).  This eliminates a 32-bit by 32-bit division (a runtime
function call on a 68000-based machine) and possibly a 32-bit by 32-bit
multiplication (though more likely a short sequence of adds, since the
multiplication is by a constant).  When we defeated the optimization
(via a test of IntLoc2 after the loop), the net loss in "speed" was
less than 2%, but who knows what other optimizations lurk out there!
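
For anyone who wants to try this on their own compiler, here is a rough
sketch of the effect.  It is a cut-down stand-in for the loop, not the
dry.c source: the function names (loop_dead, loop_live, times7) and the
parameters a and b are made up for illustration, and only the three
IntLoc statements come from the excerpt above.  The second variant
shows one way to keep IntLoc2 live, which is presumably similar in
spirit to the test we added after the loop.

```c
/* Illustrative only: a cut-down stand-in for the Proc0 loop, not dry.c.
   The function names and the parameters a and b are hypothetical. */

/* IntLoc2 is reset on every pass and never read after the loop, so an
   optimizer is free to drop the two marked statements. */
int loop_dead(int passes, int a, int b)        /* a must be nonzero */
{
    int IntLoc1, IntLoc2, IntLoc3 = 0;
    int i;

    for (i = 0; i < passes; ++i) {
        IntLoc1 = a;
        IntLoc2 = b;
        IntLoc3 = IntLoc2 * IntLoc1;
        IntLoc2 = IntLoc3 / IntLoc1;                   /* marked */
        IntLoc2 = 7 * (IntLoc3 - IntLoc2) - IntLoc1;   /* marked */
    }
    return IntLoc3;
}

/* Same loop, but IntLoc2 is read after the loop (here by returning it),
   so the final pass's marked statements are no longer dead. */
int loop_live(int passes, int a, int b)        /* a must be nonzero */
{
    int IntLoc1, IntLoc2 = 0, IntLoc3;
    int i;

    for (i = 0; i < passes; ++i) {
        IntLoc1 = a;
        IntLoc2 = b;
        IntLoc3 = IntLoc2 * IntLoc1;
        IntLoc2 = IntLoc3 / IntLoc1;
        IntLoc2 = 7 * (IntLoc3 - IntLoc2) - IntLoc1;
    }
    return IntLoc2;
}

/* One way a compiler might replace "7 * x" with adds and a subtract
   instead of a multiply: 8x - x. */
int times7(int x)
{
    int x2 = x + x;
    int x4 = x2 + x2;
    int x8 = x4 + x4;

    return x8 - x;
}
```

Comparing the generated assembly for loop_dead and loop_live (with and
without the optimizer) should show whether the division survives.  Note
that a clever enough optimizer may still hoist the loop-invariant work
in loop_live out of the loop, so a single use after the loop is no
guarantee that the work is done on every pass.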
------------------------------------------------------------------------------

				Mike Hibler
				WICAT Systems
				utah-cs!uplhercules!wicat!mike