Path: utzoo!attcan!uunet!mcvax!hp4nl!eurtrx!euraiv1!evas
From: evas@euraiv1.UUCP (Eelco van Asperen)
Newsgroups: comp.sys.ibm.pc
Subject: Re: Microsoft Vs. Borland; benchmarks!
Summary: MSC wins on points in the benchmark-battle...
Message-ID: <788@euraiv1.UUCP>
Date: 6 Oct 88 14:41:56 GMT
References: <876@galaxy> <8254@haddock.ima.isc.com>
Organization: Erasmus University EF/AIV,Rotterdam,Netherlands
Lines: 291

[Here's a comparison of MSC and TurboC as my contribution to the
"Microsoft vs. Borland" discussion. I wrote this a couple of months ago
and posted it but that failed due to administrative reasons. Note that
I don't have Turbo C v2.0 yet; if anybody wants the source of the
benchmarks to run them for Turbo C v2.0, I'll be happy to send them.
To make a fair comparison, you should run them on the same type of
machine; I also have access to an Olivetti M24 (aka. AT&T 6300) and an
Olivetti M280. -EvAs.]

  Benchmarking Borland's Turbo C v1.5 and Microsoft's C v5.1 Compilers

INTRO

To get some clarity in the continuing debate concerning the Microsoft
and Borland C compilers, I've benchmarked them with some of the
programs used in the article "Benchmarking C Compilers" which appeared
in Dr.Dobb's Journal (DDJ), August 1986. (Philip Freidin, one of the
authors, was kind enough to send them to me. Thanks, Phil!)

The compilers compared are Microsoft C v5.1 (MS-C) and Borland Turbo C
v1.5 (TC). All tests were done on an AT-clone, running at 12 MHz with
0 wait-states under MS-DOS v3.3; to eliminate the speed of the hard
disk from the results, I ran the programs on a ramdisk.

The programs were compiled with all optimizations enabled; for MS-C,
the flags are '-Ox -Gs' and for TC they are '-G -O -Z -r'.

The tests were compiled and run for each memory model available on both
compilers; TC's Tiny-model has not been included because MS-C hasn't
got a comparable model (at least the compiler does not generate special
code for it; I haven't checked if programs compiled with the
small-model can be converted to com-files after linking).

Each test was repeated a number of times to increase accuracy; the
loop-count for each test is in the table.

THE BENCHMARKS

A brief description of the benchmarks used:

ARRAY tests the compiler's ability to efficiently access arrays using
conventional array operations. A 10x10x10 int-array is copied using
three nested for-loops.

ATOX tests the atoi, atol and atof functions; it has 21 atoi-calls,
16 atol-calls, and 8 atof-calls. Each call passes a string constant,
some of which have many leading blanks or zeros.

CPYBLK copies a file of 10,000 bytes using fread and fwrite in
1024-byte blocks.

CPYCHR copies the same file but this time using fgetc and fputc; a
comparison of the times for CPYBLK and CPYCHR should tell you more
about the difference between block and character I/O.

DISKIO does random seeks in a file of 240 Kb and thus measures the
speed of fseek.

FIBTEST is the standard recursive Fibonacci number generator. We call
it for 24. This mainly tests function entry and exit code.

FILLSCR writes 1,248 characters to the screen, consisting of sequences
of 78 a's followed by a carriage return. This measures the speed of
screen output in the absence of scrolling. (The test is done just
after a CLS.)

The FUNCOVR programs test function call overhead; they consist of
procedures with zero, one, two and three arguments respectively and no
body.

DFUNCRET tests the ability to return function-values efficiently; the
function returns a double.

LOOPTST does a simple for-loop test.

MEMORY was created to test the speed of malloc/free; per loop, 500
blocks of 50 bytes are malloc'ed.
Then every fifth one is free'ed and 100 blocks of 35 bytes are
malloc'ed, followed by a free of all allocated blocks.

The MIN programs are used to determine the minimum size for a program:

    MINMAIN   no code; this measures startup + exit code
    MINPRTF   printf's in main
    MINPUTS   uses puts rather than printf
    MINFIO    calls to fopen, fgetc, fputc, fread, fwrite and fclose

OPTIMIZE should test the compiler's ability to optimize code; as the
authors of the DDJ-article note, this is one of the weakest benchmarks
because even a relatively simple optimizer could reduce it to nothing.
With the arrival of more and more optimizing compilers, this will
become one of the hardest things to test.

POINTER is a pointer-version of the ARRAY-test; it uses 6 pointers and
three levels of indirection to copy the 10x10x10 array.

PRTF is meant to determine the speed of printf; the results should be
compared to the result of SCROLL. (They print the same line.)

RSIEVE and SIEVE are versions of the infamous sieve benchmark program;
RSIEVE uses register-variables whereas SIEVE does not.

SCROLL is similar to FILLSCR but instead of the carriage-return, a
newline is printed.

STORAGE is used to determine the difference between the various
storage classes in C; four variables are declared automatic, register
and static. To see if the compiler will allocate more than two
registers for variables, the register-test is also done with just two
of the four variables declared as register.

STRINGS assesses the quality of the library-routines strcat, strcpy,
strncpy, strlen, strcmp, and strncmp.

TDOUBLE and TFLOAT test floating-point performance; in each loop, 40
adds, subtractions and multiplies and 20 divides are done. A compiler
that conforms to the ANSI C standard (yeah, I know; this should read
'conforms to a draft version of' etc), should be faster than a
compiler that conforms to K&R in the TFLOAT test because it doesn't
have to convert floats to doubles before each operation.
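
To make the descriptions of ARRAY, FIBTEST and MEMORY given so far a
bit more concrete, here is a minimal sketch of what kernels of that
kind might look like. These are not the DDJ sources; the function
names (array_copy, fib, memory_pass) and the constants are made up for
illustration, and the timing loop around each kernel is left out.

```c
#include <stdlib.h>

#define DIM     10    /* ARRAY: a 10x10x10 int-array            */
#define NBIG   500    /* MEMORY: 500 blocks of 50 bytes         */
#define NSMALL 100    /* MEMORY: 100 blocks of 35 bytes         */

/* ARRAY-style kernel: copy src to dst with three nested for-loops. */
void array_copy(int dst[DIM][DIM][DIM], int src[DIM][DIM][DIM])
{
    int i, j, k;

    for (i = 0; i < DIM; i++)
        for (j = 0; j < DIM; j++)
            for (k = 0; k < DIM; k++)
                dst[i][j][k] = src[i][j][k];
}

/* FIBTEST-style kernel: naive recursion, dominated by call overhead.
 * unsigned long is used because fib(24) does not fit a 16-bit int. */
unsigned long fib(unsigned int n)
{
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

/* MEMORY-style kernel: one pass of the malloc/free pattern.
 * Returns the number of blocks live just before the final clean-up,
 * or -1 if any malloc failed. */
int memory_pass(void)
{
    static void *big[NBIG];
    static void *refill[NSMALL];
    int i, live = 0, ok = 1;

    for (i = 0; i < NBIG; i++) {
        big[i] = malloc(50);
        if (big[i]) live++; else ok = 0;
    }
    for (i = 0; i < NBIG; i += 5) {        /* free every fifth block */
        if (big[i]) { free(big[i]); big[i] = NULL; live--; }
    }
    for (i = 0; i < NSMALL; i++) {
        refill[i] = malloc(35);
        if (refill[i]) live++; else ok = 0;
    }
    /* free everything that is still allocated */
    for (i = 0; i < NBIG; i++)
        if (big[i]) { free(big[i]); big[i] = NULL; }
    for (i = 0; i < NSMALL; i++)
        if (refill[i]) { free(refill[i]); refill[i] = NULL; }

    return ok ? live : -1;
}
```

A driver would call each kernel in a loop (1500 times for ARRAY, for
instance) and time the whole loop; how the original programs did their
timing is not stated here, so that part is deliberately omitted.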

TINT and TLONG attempt to measure the performance of integer and long
operations, respectively. For each loop, 1,500 adds, 1,600 subtracts,
200 multiplies and 200 divides are done.

TRIG times the speed of the trigonometric functions sin, cos and tan.
For each loop, these functions are called 12 times.

And now for the real stuff; here are the...

EXECUTION TIMES

Model:        |    Small    |   Compact   |   Medium    |    Large    |    Huge
Test     Loops|    TC   MSC |    TC   MSC |    TC   MSC |    TC   MSC |    TC   MSC
--------------+-------------+-------------+-------------+-------------+------------
array     1500|  24.9   2.4 |  25.5   2.4 |  25.0   2.4 |  25.5   2.4 |  25.5   2.4
atox       100|   1.1   1.7 |   1.2   1.7 |   1.2   1.7 |   1.2   1.7 |   1.2   1.7
cpyblk      15|   7.8   2.3 |   8.8   3.0 |   7.9   2.3 |   9.1   3.1 |   9.2   3.2
cpychr      15|   9.5   6.4 |  10.3   6.9 |   9.8   6.6 |  11.0   7.3 |  11.4   7.3
diskio     350|  15.7  15.6 |  15.7  15.6 |  15.7  15.6 |  15.7  15.7 |  15.8  15.6
fibtest     18|  14.1  13.4 |  14.4  13.4 |  15.3  14.5 |  15.4  14.5 |  17.9  14.5
fillscr     12|   9.0   3.2 |   9.0   8.9 |   9.0   3.3 |   9.0   8.9 |   9.0   8.8
funcov0  10000|  16.3  15.1 |  17.3  15.1 |  22.2  16.8 |  22.9  16.8 |  34.3  16.8
funcov1  10000|  22.7  22.6 |  22.9  22.6 |  28.0  26.0 |  28.3  26.0 |  35.6  26.0
funcov2  10000|  24.9  24.3 |  23.8  24.3 |  29.6  27.2 |  28.7  27.2 |  37.1  27.2
funcov3  10000|  29.7  28.3 |  30.6  28.2 |  34.8  31.8 |  35.6  31.7 |  43.0  31.8
ifuncret  2500|  12.0  11.7 |  11.8  11.7 |  13.1  13.4 |  13.7  13.4 |  17.4  13.4
lfuncret  2500|  16.7  15.1 |  16.2  15.1 |  18.9  17.6 |  19.1  17.6 |  22.3  17.6
dfuncret   250|  37.8  28.2 |  37.6  29.7 |  37.8  28.4 |  38.1  29.9 |  38.2  29.9
looptst    500|   7.6   0.0 |   6.9   0.0 |   7.6   0.0 |   6.9   0.0 |   6.9   0.0
memory     500|  30.8  11.7 | 196.5  14.5 |  31.4  12.3 | 198.8  15.3 | 206.3  17.5
optimize   100|   4.0   0.5 |   4.1   0.6 |   4.1   0.5 |   4.0   0.6 |   4.0   0.6
pointer   1500|   6.8   5.2 |  12.5   2.5 |   6.7   5.2 |  12.4   2.5 |  12.5  20.8
prtf        12|  12.6   7.0 |  12.6   7.1 |  12.6   7.0 |  12.6   7.1 |  12.6   7.1
rsieve     140|  13.9  11.8 |  13.7  11.8 |  14.0  11.8 |  13.6  11.8 |  13.6  11.8
scroll      12|  12.4   6.5 |  12.4  12.3 |  12.4   6.5 |  12.4  12.3 |  12.3  12.3
sieve      140|  14.0  12.7 |  13.6  12.7 |  14.0  12.7 |  13.7  12.7 |  13.7  12.7
storage:      |             |             |             |             |
autotst    150|  12.8   0.0 |  12.8   0.0 |  12.8   0.0 |  12.8   0.0 |  12.8   0.0
stattst    150|  15.2   0.0 |  16.1   0.0 |  15.2   0.0 |  16.1   0.0 |  15.2   0.0
regtest    150|  12.8   0.0 |  12.8   0.0 |  12.8   0.0 |  12.8   0.0 |  12.8   0.0
reg2test   150|  12.8   0.0 |  12.8   0.0 |  12.8   0.0 |  12.8   0.0 |  12.8   0.0
strings   1000|   2.0   1.7 |   2.0   1.7 |   2.0   1.7 |   2.0   1.7 |   2.0   1.7
switch1   1000|   0.6   1.8 |   0.6   1.8 |   0.6   1.8 |   0.6   1.8 |   0.6   1.8
switch2   1000|   0.6   0.7 |   0.6   0.7 |   0.6   0.7 |   0.6   0.7 |   0.7   0.7
switch3   1000|   0.6   0.7 |   0.7   0.7 |   0.6   0.7 |   0.7   0.7 |   0.7   0.7
tdouble    500|  21.0  10.3 |  21.0  10.3 |  21.0  10.3 |  21.0  10.3 |  21.0  10.3
tfloat     500|  22.6  10.1 |  22.6  10.1 |  22.5  10.1 |  22.6  10.1 |  22.6  10.1
tint      1500|   5.7   2.0 |   5.7   2.0 |   5.7   2.0 |   5.7   2.0 |   5.7   2.0
tlong     1000|  34.0   2.7 |  34.3   2.7 |  34.1   2.7 |  34.3   2.7 |  34.3   2.7
trig       100|   6.4   0.0 |   6.4   0.0 |   6.4   0.0 |   6.4   0.0 |   6.4   0.0
--------------+-------------+-------------+-------------+-------------+------------
All times are in seconds.

CODE SIZE

             Small        Compact        Medium         Large          Huge
             TC  MS-C      TC  MS-C      TC  MS-C      TC  MS-C      TC  MS-C
--------  ----- -----   ----- -----   ----- -----   ----- -----   ----- -----
minfio     7560  9319    9700 12049    7806  9523   10474 12253   11978 12301
minmain    2402  4399    2942  4567    2472  4469    3012  4637    3382  4637
minprtf    6214  9081    7762 11315    6356  9263    7904 11497    9025 11497
minputs    4572  7233    6072  9691    4706  7373    6206  9847    7296  9847

(Programs were compiled with all optimization-flags on.)

In addition to these tests, I ran the dhrystone-program (compiled with
the Small memory model):

    TC      2590 dhrystones/second
    MS-C    3401 dhrystones/second

The results clearly show that the Microsoft compiler produces superior
code when compared to Borland's. In a number of cases the MS-C code
outperformed TC's by a factor of ten, for example with the TLONG and
ARRAY tests.

The good optimization in MS-C also causes some problems; since
benchmarks are artificial and try to measure the efficiency of a
certain type of operation, they are extremely prone to being optimized
away, i.e.
reduced to no code at all. This is shown best by the LOOPTST, STORAGE
and TRIG programs, where MS-C's times are 0.0. We definitely need a
new class of benchmarks for future tests.

The only areas in which TC has the lead are switch-statements (and only
marginally so) and the ATOX benchmark.

The results of the MEMORY test are kind of dramatic for TC; malloc and
free get very slow when using a large data-model, while MS-C performs
more or less the same for all models.

COMPILATION SPEED

The price one usually pays for better optimization is longer
compile-times; to check this, I timed the compilation and linking of
the test suite for the Small memory-model. For TC, the Turbo Linker
TLINK was used; as this is a limited yet fast linker, I reran the test
for TC with the standard linker, the one that was also used for MS-C,
MS LINK v5.01.04. Before running each test, I ran a disk-compression
utility to make sure that file fragmentation would not distort the
timings.

In the DDJ-review, they used a different method to measure compilation
speed. Since I don't have the files they used for this test, this will
have to do.

Compile and Link Times

                   Optimization
                   Enabled   Disabled
                   -------   --------
TC with TLINK  :     284.9      284.3
TC with LINK   :     331.5      331.0
MS-C with LINK :     681.7      642.8

The compile time for the following program

    int alfa;

should give us some idea of the amount of overhead associated with
calling the compiler.

           Compile   Load
           -------   ----
TC:            2.8    2.0
MS-C:          9.7    1.4

'Compile' is the total time required to compile this mini-program and
'Load' is the time needed to load the compiler. (All times are given
in seconds.)

CONCLUSIONS

Based on the data presented here and my experiences with both products,
Microsoft C wins the battle; it generates by far the best code. Turbo
C's one-pass compiler has shorter compile times and creates smaller
executables but the code produced is inferior to MS-C's.
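
To put rough numbers on that trade-off, the TC/MS-C ratios can be
computed straight from the tables. The following is only a small
sketch: the struct, the array name and the function names are made up
for illustration, and just a handful of Small-model rows plus the
optimized compile-and-link times have been copied in.

```c
#include <stdio.h>

/* One pair of measurements, in seconds, copied from the tables. */
struct timing {
    const char *name;
    double tc;     /* Turbo C v1.5     */
    double msc;    /* Microsoft C v5.1 */
};

static const struct timing samples[] = {
    { "array",                    24.9,   2.4 },
    { "tlong",                    34.0,   2.7 },
    { "memory",                   30.8,  11.7 },
    { "atox",                      1.1,   1.7 },
    { "compile+link (optimized)", 284.9, 681.7 }
};

#define NSAMPLES ((int)(sizeof samples / sizeof samples[0]))

/* How many times longer TC takes than MS-C;
 * values below 1 mean TC is the faster one. */
double tc_over_msc(const struct timing *t)
{
    return t->tc / t->msc;
}

/* Print one line per sample: name and TC/MS-C ratio. */
void print_ratios(FILE *out)
{
    int i;

    for (i = 0; i < NSAMPLES; i++)
        fprintf(out, "%-26s %6.2f\n",
                samples[i].name, tc_over_msc(&samples[i]));
}
```

Calling print_ratios(stdout) gives ratios of roughly 10.4 for ARRAY
and 12.6 for TLONG, which is where the "factor of ten" comes from,
against about 0.42 for compiling and linking the suite.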

Furthermore, when it comes to writing a reference manual for a language
the boys (and girls) at Borland could learn something from the
Unix-community; start each reference on a separate page! In its
current form, the TC reference manual is a real pain to use. As they
use the same style in the Turbo Pascal 3.0 and 4.0 manuals, I guess
this is a Borland "feature" used to save paper and thus money on the
cost of the manual.

One of the things missing from both compilers (and from most PC
C-compilers for that matter) is profiling, i.e. the ability to get an
overview of where your program spends most of its time when executing.
As they can already do stack-overflow checking upon function entry,
this should not be hard to add.

Naturally, this test has not been as extensive as the one performed by
the DDJ editors; their annual C issue will certainly contain an updated
overview of the C compiler battlefield.

[Well, DDJ ain't what it used to be; their last C compiler test was
rather bleak when compared to the August 86 one. They left out the
extensive tables that made the '86 review stand out. Refer to
comp.misc for the discussion on the death of DDJ....]

--
Eelco van Asperen.
uucp:        evas@eurtrx / mcvax!eurtrx!evas    #include
earn/bitnet: asperen@hroeur5                    #include
"We'ld like to know a little bit about you for our files"
                        - Mrs.Robinson, Simon & Garfunkel