Path: utzoo!telly!attcan!uunet!bu.edu!bu-cs!snorkelwacker!apple!usc!brutus.cs.uiuc.edu!zaphod.mps.ohio-state.edu!tut.cis.ohio-state.edu!cs.unc.edu!alexande
From: alexande@cs.unc.edu (Geoffrey D. Alexander)
Newsgroups: gnu.gcc.bug
Subject: Performance anomoly using constructor expressions
Message-ID: <9001121726.AA22371@dopey.cs.unc.edu>
Date: 12 Jan 90 17:26:09 GMT
Sender: news@tut.cis.ohio-state.edu
Distribution: gnu
Organization: GNUs Not Usenet
Lines: 72

I have encountered a strange performance anomaly using constructor
expressions.  Consider the following program.

====test4a.c===================================================================
typedef struct {
  double real;
  double imaginary;
} complex;

#define ADD_COMPLEX(x, y) \
  (complex){(x).real+y.real, (x).imaginary+(y).imaginary}

#define MULT_COMPLEX(x, y) \
  (complex){((x).real*(y).real)-((x).imaginary*(y).imaginary), \
            ((x).real*(y).imaginary)+((y).real*(x).imaginary)}

main()
{
  complex c;
  int i;
  complex x;

  c=(complex){1.0,1.0};
  x=(complex){0.0,0.0};
  for (i=1;i<=10000;i++) {
    x=ADD_COMPLEX(MULT_COMPLEX(x,x),c);
  }
  exit(0);
}
===============================================================================

Now, modify this program so that the value of MULT_COMPLEX(x,x) is saved in
a temporary variable.

====test4b.c===================================================================
typedef struct {
  double real;
  double imaginary;
} complex;

#define ADD_COMPLEX(x, y) \
  (complex){(x).real+y.real, (x).imaginary+(y).imaginary}

#define MULT_COMPLEX(x, y) \
  (complex){((x).real*(y).real)-((x).imaginary*(y).imaginary), \
            ((x).real*(y).imaginary)+((y).real*(x).imaginary)}

main()
{
  complex c;
  int i;
  complex x;
  complex y;

  c=(complex){1.0,1.0};
  x=(complex){0.0,0.0};
  for (i=1;i<=10000;i++) {
    y=MULT_COMPLEX(x,x);
    x=ADD_COMPLEX(y,c);
  }
  exit(0);
}
===============================================================================

Now, compile the programs as follows:

  gcc test4a.c -O -o test4a
  gcc test4b.c -O -o test4b

Running on a Sun3-60M (w/o floating point chip) under SunOS Release 4.0.3,
test4a takes 4.1 seconds, while test4b takes only 2.3 seconds.  Anyone care to
explain why?  Note that I am using gcc version 1.36.

Geoff Alexander