Path: utzoo!dptcdc!jarvis.csri.toronto.edu!mailrus!husc6!rutgers!bellcore!texbell!vaxnix!sneaky!gordon
From: gordon@sneaky.UUCP (Gordon Burditt)
Newsgroups: comp.unix.wizards
Subject: Re: libpw.a
Message-ID: <9731@sneaky.UUCP>
Date: 17 Apr 89 03:01:19 GMT
References: <157@dftsrv.gsfc.nasa.gov> <10013@smoke.BRL.MIL> <8184@chinet.chi.il.us> <1250@frog.UUCP>
Reply-To: gordon@sneaky.UUCP (Gordon Burditt)
Organization: Gordon Burditt
Lines: 105

>Programmers shouldn't use alloca().  The standard implementation of allocating
>a chunk of stack is not applicable to all computer architectures or compilers,
>and the non-stack-based implementations require too much conspiracy and
>machination to be efficient and worthwhile.

There is an even better reason not to use alloca().  I claim that there
exists NO correct implementation of alloca() (in any released or unreleased
version of anything) under these conditions:

(1) Alloca is implemented using the stack (not malloc() and some mechanism
    to keep track of what to free when), AND
(2) The calling sequence is caller-pops-args, not callee-pops-args, AND
(3) The compiler doesn't implement alloca() as a built-in function (as
    opposed to only a function stuck in a library without special compiler
    knowledge of alloca(), which is hopeless).

If you don't do (1), you are defeating the supposed advantages of alloca -
speed, and either you have to do explicit freeing of memory somehow, or
you might run out of memory sometime when you shouldn't.

Most systems that don't do (2) have problems with old code because you have
to actually follow the rules for varargs or stdargs to do variable-argument
functions, or they break.  You also need an instruction for "return and
adjust stack".  GCC has it as an option for the 68010, and the Vax can use it.
Most compilers for various micros seem to use caller-pops-args.

GCC implements alloca() as a built-in (assuming appropriate definitions in
a header file), but still gets it wrong.

The main problem here is that it is necessary to evaluate ALL arguments that
involve alloca to a multi-argument function before pushing ANY of them.

Before anyone starts yelling, "but mine works", try coming up with how the
compiler is going to lay out the stack, considering function args and
alloca-allocated memory, for the following code fragment.  No fair putting
everything in registers - if you have that many registers, then I'm going
to keep doubling the number of args until you don't.  Evaluating one arg
(which calls alloca), pushing the result, then evaluating another arg
(which calls alloca) is going to interleave arguments and allocated areas
in a big mess, so when the function tries to look at its arguments, it gets
uninitialized garbage.

    void *alloca();  /* please no void * vs. char * wars */
    struct foo1 { .... };
    struct foo2 { .... };
    struct foo3 { .... };
    struct foo4 { .... };
    struct foo5 { .... };
    struct bar {
        struct foo1 *a;
        struct foo2 *b;
        struct foo3 *c;
        struct foo4 *d;
        struct foo5 *e;
    };

    struct bar initfunc(a, b, c, d, e)
    struct foo1 *a;
    struct foo2 *b;
    struct foo3 *c;
    struct foo4 *d;
    struct foo5 *e;
    {
        static struct bar ret;

        ....  /* code fills in structures for a, b, c, d, and e */
        ret.a = a;
        ret.b = b;
        ret.c = c;
        ret.d = d;
        ret.e = e;
        return ret;
    }

    main()
    {
        struct bar *control;  /* must be a pointer: it receives alloca()'s result */

        ....
        *(control = alloca(sizeof(struct bar))) = initfunc(
            alloca(sizeof(struct foo1)),
            alloca(sizeof(struct foo2)),
            alloca(sizeof(struct foo3)),
            alloca(sizeof(struct foo4)),
            alloca(sizeof(struct foo5))
            );
        ....
    }

Note:  "...." means code has been left out; it is NOT a misspelling for a
character sequence used in declaring ANSI variadic functions.

Even if you do make the compiler know about alloca(), and rule that you can't
use a pointer to alloca in a pointer-to-function to make function calls, the
code can still get very messy and slow.  If the compiler references auto
variables as an offset from the stack pointer (rather than a frame pointer),
all bets are off.

If the compiler doesn't implement alloca as a built-in, you haven't a prayer
of making even the simple cases work by writing a library function.

Gordon L. Burditt
...!texbell!sneaky!gordon

P.S.  I have an implementation that sort-of works in the above example,
provided you put a limit on the number of args.  It wastes an amount of
memory equal to the worst-case pushed args plus the worst-case saved
registers ON EVERY CALL.