Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!cbosgd!ihnp4!homxb!genesis!hotlg!nz
From: nz@hotlg.UUCP
Newsgroups: comp.lang.c
Subject: C Problem (or, GOTOs considered harmful)
Message-ID: <129@hotlg.ATT>
Date: Tue, 29-Sep-87 13:51:23 EDT
Article-I.D.: hotlg.129
Posted: Tue Sep 29 13:51:23 1987
Date-Received: Fri, 2-Oct-87 02:03:36 EDT
Reply-To: nz@hotlg.UUCP (Neal Ziring)
Distribution: na
Organization: AT&T-BL Dept. 54315
Lines: 94
Keywords: labels scope goto symtab

One of the developers came upon a very interesting problem with our
C cross-compiler the other day.  The compiler runs on a VAX (SysV.2)
and produces code for the M68000/10.

He was writing a few modules, one of which contained a label and a
few goto statements.  When he finished writing the code, he tried to
build a complete product including his new code (a complete load, as
we call it).  Well, it bombed with a message about the symbol
``cleanup'' being undefined.  That symbol was his goto label -- and it
appeared right there in the file.

The structure of the code was something like this:

    #include "hugefile.h"

    foo *
    dosomefoo(x,y,z)
    int x,y,z;
    {
        int a,b,c;

        printf("a message");
        {
            short s1, s2;
            char *cp;

            if (diagnose(x) < 0) {
                printf("x failed");
                goto cleanup;
            }
            if (diagnose(y) < 0) {
                printf("y failed");
                goto cleanup;
            }
            diagnose(z);
        }
    cleanup:
        a = x * y;
        dosomebar(a);
        return(NULL);
    }

What is the problem here, you ask?  Why was it that I, the compiler
support person, could not duplicate the problem when I tried to write
some little test programs?  Why was it that whenever the number of
declarations in the dozens of #included header files was changed, the
problem would go away?  Why was it that other files with similar
structure did not evoke the fatal-error behaviour?

Can anybody guess?  Don't hit the space bar until you've tried...

The problem has to do with scoping.
When the compiler begins to compile the inner block (where s1 and s2
are declared) it places those variables in the symbol table, linked
into the list of symbols for that scope level.  Fine.

When the compiler comes upon the goto statements, it places the
destination symbol, ``cleanup'', into the symbol table as an undefined
extern label.

Now, when the compiler finishes compiling this inner scope block, it
removes all the block-local symbols from the symbol table.  Their
slots in the symbol table are therefore free.

One line later, the compiler comes upon a label, ``cleanup:''.  It
hashes the name and gets a symbol table index.  Using the index, it
starts searching the linear symbol array for

    1) a label named ``cleanup'', or
    2) an empty symbol table slot.

Well, normally it would have to find no. 1 first.  But since the
presence of the small scope block caused the compiler to allocate and
then de-allocate several symbol table slots, the compiler ends up
finding a free slot first.

For instance, the undefined extern label ``cleanup'' might have been
placed at 1875, after ``cp'' at 1874 (assuming they hash near each
other).  Well, ``cp'' was removed from the symbol table, so the search
for ``cleanup'' starting at hash index 1800 will find the empty slot
at 1874 before it finds the correct occupied slot at 1875.  The label
definition is entered there as a new symbol, and the entry made by the
gotos at 1875 is never defined.

Hmmmm!  Ah Hah!  This is an epiphenomenon arising from the linear
search collision scheme used in PCC2-based C compilers.  Naturally,
this is only likely to occur when the symbol table is close to full,
so that collisions and (as I call them) anti-collisions happen often.
Whewee!  No wonder I had trouble duplicating the bug!

Has anybody else ever observed anything like this?  Is the code wrong?
Is this compiler simply a poor one?

I would like to fix it, but I cannot simply give the compiler a more
sophisticated symtab management scheme, because this linear searching
discipline is hardcoded into several parts of the compiler.  Arghh!

Comments?
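
For anyone who wants to see the effect in isolation, here is a minimal
sketch of the failure described above.  It is written in modern C and
is NOT taken from the PCC2 source: the table size, the first-character
hash, and the names lookup, remove_sym and label_found are all made up
for illustration.  The use_tombstones switch shows one conventional
remedy for linear probing (mark a removed slot as deleted rather than
empty, so a search keeps going past it).  Whether that would fit the
real compiler is an assumption I have not checked.

```c
#include <string.h>

#define TABSZ 8   /* tiny hypothetical table so collisions are easy to force */

enum state { EMPTY, USED, DELETED };

struct sym {
    enum state state;
    const char *name;
};

static struct sym table[TABSZ];

/* Hypothetical hash: first character only, so "cp" and "cleanup" collide. */
static int hash(const char *name)
{
    return (unsigned char)name[0] % TABSZ;
}

/* Find name by linear probing, or enter it in the first free slot reached.
 * Returns the slot index, or -1 if the table is full. */
static int lookup(const char *name)
{
    int start = hash(name);
    int free_slot = -1;

    for (int n = 0; n < TABSZ; n++) {
        int i = (start + n) % TABSZ;

        if (table[i].state == USED && strcmp(table[i].name, name) == 0)
            return i;                 /* existing entry found */
        if (table[i].state == DELETED && free_slot < 0)
            free_slot = i;            /* reusable, but keep probing */
        if (table[i].state == EMPTY) {
            if (free_slot < 0)
                free_slot = i;
            break;                    /* an empty slot ends the search */
        }
    }
    if (free_slot < 0)
        return -1;
    table[free_slot].state = USED;
    table[free_slot].name = name;
    return free_slot;
}

/* Remove a block-local symbol.  Without tombstones the slot becomes EMPTY,
 * which is what cuts the probe chain short in the scenario above. */
static void remove_sym(int i, int use_tombstones)
{
    table[i].state = use_tombstones ? DELETED : EMPTY;
    table[i].name = NULL;
}

/* Replays the sequence from the post.  Returns 1 if the label definition
 * lands on the entry created by the earlier goto, 0 if it creates a
 * second, unrelated entry. */
int label_found(int use_tombstones)
{
    memset(table, 0, sizeof table);      /* every slot EMPTY */

    int cp  = lookup("cp");              /* block-local declared */
    int ref = lookup("cleanup");         /* goto cleanup; (forward reference) */
    remove_sym(cp, use_tombstones);      /* inner block closes */
    int def = lookup("cleanup");         /* cleanup: (label definition) */

    return def == ref;
}
```

With use_tombstones set to 0, the second search for ``cleanup'' stops at
the slot vacated by ``cp'' and creates a duplicate entry, so label_found
returns 0.  With it set to 1 the search continues past the deleted slot
and reaches the original entry.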
-- 
...nz  (Neal Ziring  @  ATT-BL Holmdel, x2354, 3H-437)
    "You can fit an infinite number of wires into this junction box,
    but we usually don't go that far in practice."
                    London Electric Co. Worker, 1880s