Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Path: utzoo!mnetor!seismo!mcvax!ukc!warwick!rlvd!asw From: asw@tony.UUCP (Tony Williams) Newsgroups: net.lang.c Subject: Re: Generating code for the switch statement Message-ID: <1611@rlvd.UUCP> Date: Wed, 20-Aug-86 07:26:32 EDT Article-I.D.: rlvd.1611 Posted: Wed Aug 20 07:26:32 1986 Date-Received: Wed, 27-Aug-86 10:17:04 EDT References: <15093@ucbvax.BERKELEY.EDU> <2765@brl-smoke.ARPA> <610@hropus.UUCP> <840@edison.UUCP> Sender: news@rlvd.UUCP Reply-To: asw@tony.UUCP (Tony Williams) Organization: Informatics Division, R.A.L Lines: 81 Summary: All compilers ought to do it - its easy. (Longish, but tells you how) In article <840@edison.UUCP> jso@edison.UUCP (John Owens) writes: > >In article <610@hropus.UUCP>, ka@hropus.UUCP (Kenneth Almquist) writes: >> If the cases are not consecutive, the branch table must be filled with >> entries for the "missing" cases, making the table larger. [....] If >> the cases are mostly consecutive with a few outliers, a branch table >> could be augmented with a few specific tests for the outliers, but I >> don't know of any compiler that does this. > >The Microsoft Pascal/Fortran compilers, at least some of the old >ones (3.04 and 3.11) do this. I haven't tried it to see if their >wonderful new code generators do. > I am not sure that this is original, as I think it has been used in BCPL compilers and the like for ages. I implemented this in a C compiler for an unusual machine (a Three Rivers Perq running the CMU Accent operating system, with our own Unix emulation layered on top). See comments following for a discussion of the advantages. The algorithm is roughly as follows (barring off-by-one errors - my favourite): genSwitch( caseLabels, addresses, labelCount ) int *caseLabels; /* array of label values */ *addresses; /* array of addresses */ int labelCount; /* number of labels */ { int labelRange = caseLabels[labelCount-1] - caseLabels[0]; if( labelRange <= acceptable_sparseness * labelCount ) /* suitably dense */ genJumpTable( caseLabels, addresses, labelCount ) else { /* too sparse - generate some tests * first find some suitable median value of label, * say one that divides the table in half for simplicity */ int median = findMedian( caseLabels, labelCount ); /* eg labelCount / 2; */ /* result is place in table, not value */ genIfStmt( switchExpr, caseLabel[median], less_than ); /* now do the lower half (expr < median) */ genSwitch( caseLabels, addresses, median ); startElse(); /* now the upper half (expr >= median) */ genSwitch( &caseLabels[median], &addresses[median], labelCount-median ); } } This algorithm has several nice features: - the sparsesness threshold can be tuned without hacking code - the use of recursion for each portion of the divided switch means that locally dense clusters in a sparse switch can each have their own jump table, reached by test-and-jumps - the tests for out-of-range expressions can be factored in to the algorithm, - the choice of division point can be made more intelligently if you wish (eg by maximising the density of one or both portions), by only changing findMedian() - non-dense cases are effectively handle by binary search, coded as a jump tree. This approaches optimal, if there is no information on dynamic frequencies. The generated code (subject to findMedian) is usually within one extra test of being optimal. Not bad for a simple-minded algorithm. As a footnote, the virtual machine architecture had no indirect jump instruction, so the same technique (and code) was used here by Trudy Watson in implementing FORTRASH computed goto's and (shudder) assigned goto's. --------------------------------------------------------------------------- Tony Williams |Informatics Division UK JANET: asw@uk.ac.rl.vd |Rutherford Appleton Lab Usenet: {... | mcvax}!ukc!rlvd!asw |Chilton, Didcot ARPAnet: asw%vd.rl.ac.uk@ucl-cs.arpa |Oxon OX11 0QX, UK