Path: utzoo!utgpu!watserv1!watmath!att!occrsh!uokmax!munnari.oz.au!samsung!zaphod.mps.ohio-state.edu!swrinde!mips!daver!bungi.com!news
Newsgroups: comp.sys.nsc.32k
Subject: Dhrystone 2.1
Message-ID: <9009112335.AA02996@manatee.UUCP>
Date: 12 Sep 90 03:35:51 GMT
Sender: news@daver.bungi.com
Lines: 108
Approved: news@daver.bungi.com

Let me begin by saying that I am no dhrystone wizard.

Out of the box I am consistently getting 7692.3 dhrystones / second for
both 'dry2' and 'dry2reg' (i.e. dhrystone 2.1, 500000 iterations).

As far as I can determine, the only meaningful libc functions that are
linked in are strcmp.o and strcpy.o. So next I wrote assembly versions
of these routines using the string instructions. With this change I am
consistently getting 8771.9 dhrystones / second.

Included with the dhry2.1 source code was a compilation of various
dhrystone results. Summarizing the results for NS32000 series
processors:

                          dry2    dry2reg
                         -----    -------
    Encore 32032 10MHz    1323       1323
    Encore 32332 15MHz    3059       3071
    Aeon   32332 15MHz    3413       3413
    Aeon   32532 25MHz    9998       9998
    Encore 32532 25MHz   11117      11223

The Encore 32532 results are in line with those reported by Dave Rand
to this newsgroup.

Now, for my question. Is the difference between the 8771 that I am
getting and the reported 11000 figure due only to compiler efficiency,
or are there other factors entering the picture? My naive assumption is
that there must be other factors, since the code generated by GCC looks
damn good to me.

BTW: As soon as I get motivated to clean up / double check the code, I
will post assembly versions of the following string functions:

    memchr.s    memcmp.s    memcpy.s    memmove.s   memset.s
    strchr.s    strcmp.s    strcpy.s    strlen.s    strncat.s
    strncmp.s   strncpy.s   strrchr.s

As you can see, the performance improvement is generally 2X - 3X.

Function                 New time as percentage of old time
---------------------    -----------------------------------------
memcpy(s1, s2, n):       [n=4]: 50    [n=25]: 38    [n=1024]: 32
memmove(s1, s2, n):      [n=4]: 31    [n=25]: 31    [n=1024]: 32
strcpy(s1, s2):          [s2=ATOE]: 71    [s2=ATOZ]: 56
strncpy(s1, s2, n):      [s2=ATOZ,n=10]: 58
memcmp(buf, buf2, n):    [n=4]: 38    [n=25]: 10    [n=1024]: 2
strcmp(s1, s2):          [2*ATOE]: 67    [2*ATOZ]: 66
strncmp(s1, s2, n):      [n=4]: 67    [n=25]: 56
memchr(ATOZ, c, 25):     [c='E']: 63    [c='Z']: 42
strchr(ATOZ, c):         [c='E']: 75    [c='Z']: 50
strrchr(ATOZ, c):        [c='A']: 83    [c='E']: 82    [c='Z']: 53
memset(buf, 0, n):       [n=4]: 180    [n=1024]: 29
strlen(s):               [s=ATOE]: 71    [s=ATOZ]: 59

Best regards,
johnc

------------------------------------------------------

DHRYSTONE 2.n RESULTS SORTED BY MANUFACTURER
Sun Apr 29 12:37:27 EDT 1990

|--------------------------------------------------------------------------------------------------------------------------------------------------------------|
|manuf     |model           |proc  |clock|os      |osver   |compiler    |cver    |options     |  noreg|    reg|notes             |date    |submit              |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------|
|AEON      |332/AT          |NS3233|15.00|GENIX   |V.3     |NS CTP      |2.4     |-O -KC332   |   3413|   3413|                  |03/12/88|John Behrs          |
|Technologi|                |2     |     |        |        |            |        |            |       |       |                  |        |(boulder!fesk!      |
|es        |                |      |     |        |        |            |        |            |       |       |                  |        |ativax!john)        |
|__________|________________|______|_____|________|________|____________|________|____________|_______|_______|__________________|________|____________________|
|AEON      |532/AT          |NS3253|25.00|GENIX   |V.3     |NS CTP      |2.4     |-O -KC532   |   9998|   9998|pipelining        |03/12/88|John Behrs          |
|Technologi|                |2-A1  |     |        |        |            |        |            |       |       |disabled, chip    |        |(boulder!fesk!      |
|es        |                |      |     |        |        |            |        |            |       |       |restrictions in   |        |ativax!john)        |
|          |                |      |     |        |        |            |        |            |       |       |effect            |        |                    |
|__________|________________|______|_____|________|________|____________|________|____________|_______|_______|__________________|________|____________________|
|Encore    |MULTIMAX        |32032 |10.00|Mach    |        |            |        |-O -q       |   1323|   1323|1 of 16 processors|03/14/88|Lawrence Butcher    |
|Computer  |                |      |     |        |        |            |        |novolatile  |       |       |                  |        |                    |
|__________|________________|______|_____|________|________|____________|________|____________|_______|_______|__________________|________|____________________|
|Encore    |MULTIMAX        |32332 |15.00|Mach    |        |            |        |-O -q       |   3059|   3071|1 of 16 processors|04/16/88|Lawrence Butcher    |
|Computer  |                |      |     |        |        |            |        |novolatile  |       |       |                  |        |                    |
|__________|________________|______|_____|________|________|____________|________|____________|_______|_______|__________________|________|____________________|
|Encore    |Multimax 320    |NS3253|25.00|Umax4.2 |A3.3    |C-32000     |1.8.4   |-O -q o=t   |  11117|  11223|Alpha HW.         |12/03/88|James R. Grier      |
|Computer  |                |2     |     |        |        |Green Hills |        |            |       |       |Production will be|        |                    |
|          |                |      |     |        |        |Software    |        |            |       |       |30Mhz.            |        |                    |
|__________|________________|______|_____|________|________|____________|________|____________|_______|_______|__________________|________|____________________|
|National  |NS32GX32        |32GX32|30.00|Compiled|GNX     |GNX -       |3.4     |-O -KC532   |  16087|  16087| 0 wait state     |03/09/89|Jonathan Levy       |
|Semiconduc|Evaluation Board|      |     |and down|Version |Version 3 C |        |            |       |       |2-way interleave  |        |(nsc!levy)          |
|tor       |                |      |     |loaded  |3       |Optimizing  |        |            |       |       |static ram.       |        |                    |
|          |                |      |     |under   |        |Compiler    |        |            |       |       |                  |        |                    |
|          |                |      |     |GNX     |        |(CTP)       |        |            |       |       |                  |        |                    |
|__________|________________|______|_____|________|________|____________|________|____________|_______|_______|__________________|________|____________________|
|Sequent   |BALANCE 8000    |32032 |     |Mach    |        |            |        |-O          |   1058|   1110|1 of 32 processors|03/14/88|Lawrence Butcher    |
|__________|________________|______|_____|________|________|____________|________|____________|_______|_______|__________________|________|____________________|
--