Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!tut.cis.ohio-state.edu!ucbvax!agate!e260-1g!c60a-3hu
From: c60a-3hu@e260-1g.berkeley.edu (Calvin Cheng)
Newsgroups: comp.sys.apple
Subject: Re: ROM 04 GS and resolution
Message-ID: <1990Feb17.224548.3960@agate.berkeley.edu>
Date: 17 Feb 90 22:45:48 GMT
References: <16747.apple.net@pro-sol> <2228@ultb.isc.rit.edu>
Sender: usenet@agate.berkeley.edu (USENET Administrator;;;;ZU44)
Reply-To: c60a-3hu@e260-1g (Calvin Cheng)
Organization: University of California, Berkeley
Lines: 65

In article <2228@ultb.isc.rit.edu> lmb7421@ultb.isc.rit.edu (Les Barstow: Phoenix) writes:
>I don't think we're being fair here - C compilers for the 68000 series
>are very well developed and ironed out for optimization.  However, the
>GS C compilers have not yet been written for speed - I seem to remember
>a comparison done back when the GS came out, all code written in
>Assembly for both machines, and the GS did rather well...

Compilers are what you need to use for most software development.
It doesn't make sense to compare assembled programs, because
assemblers are just not practical except in demanding situations.
One main reason why the Mac is slower than MS-DOS machines in
benchmarks is that Mac compilers are less developed than MS-DOS
compilers, but that's a fact everybody has to live with. The most
important point is that we are not comparing the raw processors here.

>
>Comparing a 2.5 MHz GS to an 8MHz Mac, the GS actually won the Sieve
>race, came close in others, and managed to hold some ground on the
>rest... Remember, the average 65816 cycles/instruction is ~5, while the
>minimum cycles/instruction on a 68000 series is ~4 (and goes up steeply
>from there in increments of 2 or 4 cycles (can't remember which, just
>remember seing instructions which took over 20 cycles to execute)...)
The 65816 takes 3 cycles to read a 16-bit word, the 68000/010 takes
4 cycles for the same, and the 68020/030 takes 3 cycles for a 32-bit
long word. The 030 has a special burst fill mode that takes in 4
32-bit long words in 5 cycles! This is not implemented on the IIcx
and SE/30, but it is on the IIci. On top of that, the 030 comes with
data and code caches of 32 32-bit long words each. Instruction
execution overlaps to the point that some instructions can take
*0 cycles* to execute. The 68040 is typically 3 to 4 times the speed
of the 030 at the same clock rate. While people are still trying to
confirm the existence of a 20MHz 816 (and the IIGS has the 2.8MHz
one), the maximum clock rate for the 030 is now 50MHz.
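
To put those cycle counts in perspective, here is a rough sketch in C of the peak memory bandwidth they imply. It takes the figures above at face value and ignores wait states, caches and everything else that matters on a real machine. The function name is made up for illustration; the 8MHz clock for the 68000 comes from the quoted text, the other clocks from this post.

```c
#include <stdio.h>

/* Peak bytes per second for a bus that moves `bytes_per_access` bytes
 * every `cycles_per_access` clock cycles. Hypothetical helper; it is
 * an upper bound only and ignores wait states and cache hits. */
double peak_bytes_per_sec(double clock_hz, int bytes_per_access,
                          int cycles_per_access)
{
    return clock_hz * bytes_per_access / cycles_per_access;
}

/* Print the figures implied by the cycle counts quoted in this post. */
void print_bandwidths(void)
{
    /* 65816 in the IIGS: 16-bit word in 3 cycles at 2.8MHz */
    printf("65816 @ 2.8MHz:       %.2f MB/s\n",
           peak_bytes_per_sec(2.8e6, 2, 3) / 1e6);
    /* 68000: 16-bit word in 4 cycles at 8MHz */
    printf("68000 @ 8MHz:         %.2f MB/s\n",
           peak_bytes_per_sec(8e6, 2, 4) / 1e6);
    /* 68030: 32-bit long word in 3 cycles at 50MHz */
    printf("68030 @ 50MHz:        %.2f MB/s\n",
           peak_bytes_per_sec(50e6, 4, 3) / 1e6);
    /* 68030 burst fill: 4 long words (16 bytes) in 5 cycles at 50MHz */
    printf("68030 burst @ 50MHz:  %.2f MB/s\n",
           peak_bytes_per_sec(50e6, 16, 5) / 1e6);
}
```

Calling print_bandwidths() from main() prints the four figures. Going by these numbers alone, the gap comes from the clock rate and the bus width far more than from the cycle count per access.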

>
>Don't under-estimate the power of the 65816 - even though it lacks some
>instructions (notably MUL and DIV) and some other features
>(multitudinous registers), but it is also more efficient and has some
>unique features of its own (direct page, fast increments, etc...)
>
Neither the 680x0 nor the 658xx families are true RISC chips. It'll be
interesting to watch for the next generation (68040 and 65832). A 68040
Mac (and NeXT) will probably make their debut by the end of the year. But
for now, the fast increments are equivalent to ADDQ and SUBQ on the 680x0
(with a choice of more data registers), and the direct page is a bigger but
more limited version of the 680x0's 8 data and 8 address registers.

My Dhrystone timings on APW C and THINK C:

Apple IIGS w/o Transwarp  162/sec
Apple IIGS w Transwarp    274/sec
Mac Plus                  854/sec
Mac SE                   1035/sec
Mac SE/30                4300/sec
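
For anyone who would rather see ratios than raw numbers, a small C sketch that normalizes the five timings above to the stock IIGS. The struct and function names are made up for illustration; the numbers are the ones in the table.

```c
#include <stdio.h>

/* One row of the Dhrystone table above. */
struct timing {
    const char *machine;
    double dhrystones_per_sec;
};

static const struct timing timings[] = {
    { "Apple IIGS w/o Transwarp",  162.0 },
    { "Apple IIGS w Transwarp",    274.0 },
    { "Mac Plus",                  854.0 },
    { "Mac SE",                   1035.0 },
    { "Mac SE/30",                4300.0 },
};

/* How many times faster `rate` is than `baseline`. */
double speed_ratio(double rate, double baseline)
{
    return rate / baseline;
}

/* Print every machine relative to the first row (the stock IIGS). */
void print_ratios(void)
{
    size_t i;
    size_t n = sizeof timings / sizeof timings[0];

    for (i = 0; i < n; i++)
        printf("%-26s %5.1fx\n", timings[i].machine,
               speed_ratio(timings[i].dhrystones_per_sec,
                           timings[0].dhrystones_per_sec));
}
```

On these figures the SE/30 comes out at roughly 26 times the stock IIGS, and the Transwarp buys a bit under a factor of 2.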

Timings I didn't do:

Apple IIe                  60/sec
Acorn Archimedes         5100/sec
NeXT                     5800/sec
DEC Vax 11/780 mini      2100/sec
Typical 25MHz 386-clone  7000+/sec

By the way, Sieve under THINK C on my SE/30 takes about 3.9s for 100
iterations. On the IIGS it takes about 56s under Orca/Pascal and roughly
70 to 90s under APW C.
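
For reference, a sketch of the kind of Sieve loop being timed. This is not the exact source behind the numbers above; it assumes the classic BYTE magazine version, which sieves odd numbers only with an array size of 8190. The function names are made up for illustration.

```c
#include <stdio.h>

#define SIEVE_SIZE 8190   /* assumed: array size of the classic BYTE sieve */

/* One pass of the sieve over the odd numbers 3, 5, 7, ...
 * flags[i] stands for the number 2*i + 3.
 * Returns the number of primes found in this pass. */
int sieve_pass(void)
{
    static char flags[SIEVE_SIZE + 1];
    int i, k, prime, count = 0;

    for (i = 0; i <= SIEVE_SIZE; i++)
        flags[i] = 1;

    for (i = 0; i <= SIEVE_SIZE; i++) {
        if (flags[i]) {
            prime = i + i + 3;
            /* strike out the odd multiples of prime, starting at 3*prime */
            for (k = i + prime; k <= SIEVE_SIZE; k += prime)
                flags[k] = 0;
            count++;
        }
    }
    return count;
}

/* Repeat the pass `iterations` times, as the benchmark does,
 * and return the prime count from the last pass. */
int run_sieve(int iterations)
{
    int n, count = 0;

    for (n = 0; n < iterations; n++)
        count = sieve_pass();
    return count;
}
```

Calling run_sieve(100) from main() and timing it gives the "100 iterations" figure; each pass should report 1899 primes, which is a handy check that the compiler hasn't broken anything.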