Path: utzoo!attcan!sobmips!uunet!mailrus!ncar!asuvax!mcdphx!udc!chant!aglew From: aglew@urbana.mcd.mot.com (Andy-Krazy-Glew) Newsgroups: comp.arch Subject: Re: Self-modifying code Message-ID: Date: 9 Nov 89 00:44:27 GMT References: <1080@mipos3.intel.com> <2630@gandalf.UUCP> <1989Nov6.172043.10373@world.std.com> Sender: aglew@urbana.mcd.mot.com Organization: Work: Motorola MCD, Urbana Design Center; School: University of Illinois at Urbana-Champaign Lines: 209 In-reply-to: bzs@world.std.com's message of 6 Nov 89 17:20:43 GMT Fcc: NEW/arch >Basically, it's a performance hack which can make a system much more >useable on interactive performance. There are systems sensitive to an >extra IF statement in their inner-loops, particularly when the >condition check is expensive (a few instructions can be expensive in >this context) and the inner loop can be executed millions of times per >request. There's always a way to replace it, but it may not result in >acceptable performance. On a system I worked on adding a conditional branch to the critical path for syscall added 5-10% (correct me if I'm wrong, Wayne) for minimal syscall timings. Maybe not much overall, but annoying. (This was a "semi-static" test, which we got around by making up a separate head and changing the syscall vector). We have several times talking about self-modifying code on the functional level. Perhaps it is just a question of notation - perhaps we just need a "statement data type - or a question of optimization - perhaps a good compiler should generate self modifying code in some of the hairy situations we have talked about in this thread, given high-level code that is not self modifying. Conside Barry's example: [B] ...predecessor code... L: {IF request_is_flagged THEN do_flag_action} ...successor code... Note that I have put curly braces {} around the statement to indicate the range to which the label L applies. In a high level notation the self-modification that Barry described, removing the IF at the code that changes the assertion, may be described as: [S]: ...Elsewhere, conditions have changed ...so now the test does not need to be performed, ... ie. !request_is_flagged: L := skip /* Dijkstra's high level nop */ ... ...Elsewhere, conditions have changed ... ie. request_is_flagged: L := do_flag_action; ... This involves assignment to a variable of type "statement". In a functional notation you might do [F]: L: pointer to function L_skip(){} L_do_flag_action() { ... } ...predecessor code: (*L)(); ...successor notation where the code that changes the request_is_flagged status changes the function pointer: L := &L_skip, L := & L_do_flag_action. In a decent language you could declare the function pointer and the values it assumes at the point at which it is used. (Yeah, great liberties with notation...) Obviously [S] is equivalent to [F], although more/less convenient. To begin with, [F] can be implemented in the obvious way, through indirect function call/return. [C] L_skip: return L_do_flag_action: ... return ...predecessor code: call indirect through L ...successor notation The first level of optimization is to notice that L_skip and L_do_flag_action always return to the same place, reducing function calling overhead to indirect branch. [G] L_skip: goto L' L_do_flag_action: ... goto L' ...predecessor code: goto indirect L L': /* return place */ ...successor notation (By the way, this is an optimization that would be nice in any system that makes calls through arrays of pointers to functions: like UNIX syscalls (sysent[]), file system calls (NFS, the System V file system switch), device driver calls ([bc]devsw[]). 99% of these functions are called in one and only one place.) The next level of optimization is to inline the code with an IF - getting us back to the explicitly hand coded version: [IF] ...predecessor code: IF L == L_skip THEN skip ELSE do_flag_action FI ...successor notation (Of course, a good programmer might have done this manually: instead of indicating request_was_flagged in a complicated data structure, she might have encoded it in a single variable, which she effectively did with the function pointer. So, the complexity of the test doesn't concern us - just the overhead of the IF or call) The next step is to actually do self-modifying code on the "micro" scale: [SM] ...predecessor code L_skip: L_do_flag_action: L1: ___ nop do_1 L2: ___ nop do_2 L3: ___ nop do_3 ...successor code ...Elsewhere, conditions have changed ...so now the test does not need to be performed, ... ie. !request_is_flagged: L1 := nop L2 := nop L3 := nop ...flush cache or whatever else is necessary ... ...Elsewhere, conditions have changed ... ie. request_is_flagged: L1 := do_1 L2 := do_2 L3 := do_3 ...flush cache or whatever else is necessary ... Optimizations based on the code body might be desirable: do_action_1: do_action_2: step1 step1 step2.1 step2.2 step3 step3 step4.1 step4.2 step5 step5 Instead of 5 instructions changing, only 2 need be changed. Alternatively, rather than micro self-modifying code, a compiler might take [IF] and attempt to broaden the domain of the code affected. For example, if [IF] were called within the body of another indirectly called function that assumed values A and B, two copies of the bodies of each of A and B might be made... Okay, enough time wasted. I think I have several points winding through all this: (1) Self-modifying code at the "function" level is easily expressible (somebody was trying to make a standard library out of this (I was supposed to help, sorry)) (2) "Micro" self-modifying code is easily expressible (a) with an augmented language (b) with function pointers. (3) It would not be unreasonable for an optimizing compiler to make the sort of optimizations described above automatically, acheiving the performance advantages of "micro" self modifying code from a high level representation. But, is it worthwhile spending the effort on a compiler that can do this sort of thing? Probably not... the situations where "micro" self modifying code are really necessary are few and far between. Force the implementors to do it by hand... Not commercially important. Might be fun for some academic research... -- Andy "Krazy" Glew, Motorola MCD, aglew@urbana.mcd.mot.com 1101 E. University, Urbana, IL 61801, USA. {uunet!,}uiucuxc!udc!aglew My opinions are my own; I indicate my company only so that the reader may account for any possible bias I may have towards our products.