Xref: utzoo comp.lang.c:26930 comp.lang.misc:4461 Newsgroups: comp.lang.c,comp.lang.misc Path: utzoo!henry From: henry@utzoo.uucp (Henry Spencer) Subject: Re: function calls Message-ID: <1990Mar15.173408.29622@utzoo.uucp> Organization: U of Toronto Zoology References: <1990Mar14.172152.25436@utzoo.uucp> <14268@lambda.UUCP> Date: Thu, 15 Mar 90 17:34:08 GMT In article <14268@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >> ... On the Mips machines, for example, >> a C function call to a "leaf" function (one which calls no others) is >> one instruction for the call and one for the return, total elapsed time >> a small fraction of a microsecond. Non-leaf functions have to do a little >> bit more work at times. > >Most machines implement call/return with single instructions. However, this >tends to be the tip of the iceberg for procedure call overhead. The interface >_MUST_ do the following: > >1) Save/restore all register values that are still 'live' values for the > caller. ... The problem is: _MOST_ of the procedure call >overhead is concentrated in number (1)! And, this overhead applies to all >procedures - including 'leaf' routines. Basically, 'leafness' has very >little to do with procedure call overhead... Um, do you want to go back and re-think that statement, Jim? Leafness has a lot to do with it, because a leaf procedure doesn't have to keep things neat in case it calls somebody. In particular, if you assign some of your registers to be scratch registers, normally unused at call time, then a leaf procedure just uses them for all its internal work and has to save and restore *nothing* except in very unusual cases. This is exactly what the Mips compilers do. When I said that a call to a leaf function on a Mips machine was one instruction each way, that's *one* instruction, total, end to end, no further overhead involved. And those are RISC instructions, normally one cycle apiece. The correct statement is that the callee must save/restore all register values that are still live for the caller *and that the callee is going to overwrite*. Careful register-allocation conventions (with an adequate supply of registers) can minimize that number, usually to zero in the case of leaf functions. Most of the other overheads you mention can likewise be minimized or eliminated with some thought. Stack-overflow probing is needed only on systems without MMUs, or with defective MMUs; even the pdp11, the original C machine, did without it. Traceback overhead for debuggers can be reduced considerably with better cooperation between compiler and debugger, and a modern compiler at a high level of optimization generally produces stuff that a debugger can't handle anyway, so it can jettison that entirely. Interprocedural optimization in the presence of separate compilation is not merely possible, it's commercially available: -O3 on a Mips compiler. Much of the "common wisdom" on the subject of call overhead simply isn't true if you are willing to work hard at falsifying it. (Actually, the same is true of many other performance "constraints"...) -- MSDOS, abbrev: Maybe SomeDay | Henry Spencer at U of Toronto Zoology an Operating System. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu