Path: utzoo!censor!geac!torsqnt!lethe!yunexus!ists!helios.physics.utoronto.ca!news-server.csri.toronto.edu!cs.utexas.edu!uunet!brunix!doorknob!jak
From: jak@cs.brown.edu (Jak Kirman)
Newsgroups: comp.lang.c++
Subject: Re: why is this program slow?
Message-ID:
Date: 9 Jan 91 14:28:45 GMT
References: <1991Jan9.002244.23398@news.cs.indiana.edu>
Sender: news@brunix.UUCP
Reply-To: jak@cs.brown.edu
Organization: Department of Computer Science, Brown University
Lines: 103
In-reply-to: shirley@iuvax.cs.indiana.edu's message of 9 Jan 91 05:22:13 GMT

In article <1991Jan9.002244.23398@news.cs.indiana.edu> shirley@iuvax.cs.indiana.edu (peter shirley) writes:

   I have a C++ program that takes about 170% as long as a similar C
   program.  I've run it on a vax and a sun with and without big-O.
   I've inlined everything I can, and I don't understand the slowdown.
   Could someone enlighten me?  (yes, I looked at the C code generated
   by CC, and am too dumb to understand it).

   [code using a vector class, adding vectors with operator+]

Part of the problem, at least, is that you are using a function which
returns an object.  Typically what happens in this case (at least under
cfront) is that the function is called, the return value is copied into
a temporary, and then the assignment function is called to copy the
temporary into the destination.

So with vectors a and b, and a + operator which returns a vector,

   a = a + b;

will result in a call to the + operator, which returns a vector.  This
vector is copied into the temporary by C, and then the assignment
operator is called.  So you are doing nearly twice the work.

When I added a += operator and changed that line to

   a += b;

the time was about the same as the C version without optimization, on
a Sparc with AT&T 2.0.  With optimization, the C version was still
substantially faster; I am not sure what optimizations are being
performed in C but not C++.
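To make the difference between the two forms concrete, here is a minimal sketch, not the original poster's code.  It assumes a C++11 or later compiler; the Vec class, the two counters and the two helper functions are hypothetical names invented for this illustration.  It only counts the result objects built inside operator+ and the calls to operator=, since how many further copies a given compiler makes or elides will vary.

```cpp
#include <cassert>

// Hypothetical counters, for illustration only.
static int temporaries_built = 0;  // result objects built inside operator+
static int assignments_made = 0;   // calls to Vec::operator=

struct Vec {
    double x, y, z;

    Vec(double x_, double y_, double z_) : x(x_), y(y_), z(z_) {}
    Vec(const Vec&) = default;

    // Counted assignment: copies the right-hand side into *this.
    Vec& operator=(const Vec& o) {
        ++assignments_made;
        x = o.x; y = o.y; z = o.z;
        return *this;
    }

    // In-place add: writes straight into *this, no new object is built.
    Vec& operator+=(const Vec& o) {
        x += o.x; y += o.y; z += o.z;
        return *this;
    }
};

// Value-returning add: builds a fresh Vec to hold the result.
inline Vec operator+(const Vec& l, const Vec& r) {
    ++temporaries_built;
    return Vec(l.x + r.x, l.y + r.y, l.z + r.z);
}

// Performs n additions written as "a = a + b" and returns the number of
// result objects built plus the number of assignments made.
inline int extra_work_with_plus(int n) {
    temporaries_built = assignments_made = 0;
    Vec a(0, 0, 0), b(1, 2, 3);
    for (int i = 0; i < n; ++i) a = a + b;
    assert(a.x == n);  // the sum itself is the same either way
    return temporaries_built + assignments_made;
}

// Performs the same n additions written as "a += b".
inline int extra_work_with_plus_equals(int n) {
    temporaries_built = assignments_made = 0;
    Vec a(0, 0, 0), b(1, 2, 3);
    for (int i = 0; i < n; ++i) a += b;
    assert(a.x == n);
    return temporaries_built + assignments_made;
}
```

Called with n = 10, extra_work_with_plus returns 20 (one result object and one assignment per addition), while extra_work_with_plus_equals returns 0.  The counts say nothing about timing; they only show where the extra work comes from.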
Ellis & Stroustrup ARM 12.1.1c provides an explanation of why the
optimizations necessary to avoid the extra copying would be very
difficult for most compilers, and points out that in the case of
initialization the copy can often be optimized away.

Below is a simple example of C++ code and a cleaned-up version of what
is produced by cfront.  I have changed the names of the temporaries,
unmangled the names, and performed a few trivial "simplifications" of
the code, such as changing (a,b) to a;b where the result of (a,b) is
not used.

In the example, when the result of foo is assigned to an existing X
object, a temporary is created.  When the result of foo is used to
initialize an X object, no temporary is created.

Note that things would be different were there a copy constructor for
X; the function foo would be changed to take a pointer to the location
where the result should be created.

   struct X {
     int a;
     int b;
     void operator= (const X& o) { a = o.a; b = o.b; }
   };

   X foo ()
   {
     X myx;
     return myx;
   }

   int main ()
   {
     X x1;
     x1 = foo ();
     X x2 = foo ();
   }


   struct X {
     int a ;
     int b ;
   };

   struct X foo ()
   {
     struct X myx ;
     return myx ;    /* note that C returns a structure */
   }

   int main ()
   {
     _main();
     {
       struct X x1 ;
       struct X x2 ;
       struct X *p ;
       {
         struct X tmp ;
         tmp = foo ();              /* structure copied into tmp */
         p = &tmp;
         (& x1 )-> a = (*p).a ;     /* operator= used to copy tmp into x1 */
         (& x1 )-> b = (*p).b ;
         x2 = foo ( ) ;             /* no copy necessary */
       }
     }
   }
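The assignment-versus-initialization distinction can also be checked without reading generated C.  The following is a rough sketch assuming a C++11 or later compiler; the assign_calls counter and the two helper functions are hypothetical additions, and foo is changed to fill in its members so the result can be checked.  It counts only calls to operator=, because initialization never goes through operator= no matter what a particular compiler does with temporaries.

```cpp
#include <cassert>

// Hypothetical counter, for illustration only.
static int assign_calls = 0;  // calls to X::operator=

struct X {
    int a;
    int b;

    X() : a(0), b(0) {}
    X(const X&) = default;

    // Counted assignment operator.
    X& operator=(const X& o) {
        ++assign_calls;
        a = o.a; b = o.b;
        return *this;
    }
};

// Returns an X by value, as in the example above.
inline X foo() {
    X myx;
    myx.a = 1;
    myx.b = 2;
    return myx;
}

// Assigns the result of foo to an existing object; returns the number
// of operator= calls this took.
inline int calls_when_assigning() {
    assign_calls = 0;
    X x1;
    x1 = foo();  // result must be copied into x1 through operator=
    assert(x1.a == 1 && x1.b == 2);
    return assign_calls;
}

// Initializes a new object from the result of foo; returns the number
// of operator= calls this took.
inline int calls_when_initializing() {
    assign_calls = 0;
    X x2 = foo();  // initialization, not assignment
    assert(x2.a == 1 && x2.b == 2);
    return assign_calls;
}
```

Here calls_when_assigning returns 1 and calls_when_initializing returns 0, which matches the cfront output above: only the assignment to x1 goes through operator=.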