Path: utzoo!attcan!sobmips!uunet!mailrus!ncar!asuvax!mcdphx!udc!chant!aglew
From: aglew@urbana.mcd.mot.com (Andy-Krazy-Glew)
Newsgroups: comp.arch
Subject: Re: Self-modifying code
Message-ID: <AGLEW.89Nov8194427@chant.urbana.mcd.mot.com>
Date: 9 Nov 89 00:44:27 GMT
References: <1080@mipos3.intel.com> <2630@gandalf.UUCP> <1989Nov6.172043.10373@world.std.com>
Sender: aglew@urbana.mcd.mot.com
Organization: Work: Motorola MCD, Urbana Design Center; School: University of Illinois at Urbana-Champaign
Lines: 209
In-reply-to: bzs@world.std.com's message of 6 Nov 89 17:20:43 GMT
Fcc: NEW/arch

>Basically, it's a performance hack which can make a system much more
>useable on interactive performance. There are systems sensitive to an
>extra IF statement in their inner-loops, particularly when the
>condition check is expensive (a few instructions can be expensive in
>this context) and the inner loop can be executed millions of times per
>request. There's always a way to replace it, but it may not result in
>acceptable performance.

On a system I worked on adding a conditional branch to the critical
path for syscall added 5-10% (correct me if I'm wrong, Wayne) for
minimal syscall timings.  Maybe not much overall, but annoying.
    (This was a "semi-static" test, which we got around by making up a
separate head and changing the syscall vector).

We have several times talking about self-modifying code on the
functional level. Perhaps it is just a question of notation - perhaps
we just need a "statement data type - or a question of optimization -
perhaps a good compiler should generate self modifying code in some of
the hairy situations we have talked about in this thread, given
high-level code that is not self modifying.

Conside Barry's example:

[B]
    ...predecessor code...
L:  {IF request_is_flagged THEN do_flag_action}
    ...successor code...

Note that I have put curly braces {} around the statement to indicate the
range to which the label L applies.
    In a high level notation the self-modification that Barry described,
removing the IF at the code that changes the assertion, may be described
as:

[S]:
    ...Elsewhere, conditions have changed
    ...so now the test does not need to be performed,
    ... ie. !request_is_flagged:
    L := skip /* Dijkstra's high level nop */
    ...

    ...Elsewhere, conditions have changed
    ... ie. request_is_flagged:
    L := do_flag_action;
    ...

This involves assignment to a variable of type "statement".

In a functional notation you might do

[F]:
    L: pointer to function

    L_skip(){}

    L_do_flag_action() { ... }

    ...predecessor code:
    (*L)();
    ...successor notation

where the code that changes the request_is_flagged status changes the
function pointer: L := &L_skip, L := & L_do_flag_action.
    In a decent language you could declare the function pointer
and the values it assumes at the point at which it is used.

(Yeah, great liberties with notation...)


Obviously [S] is equivalent to [F], although more/less convenient.


To begin with, [F] can be implemented in the obvious way, through
indirect function call/return.

[C]
    L_skip:
    	return

    L_do_flag_action:
    	...
    	return

    ...predecessor code:
    call indirect through L
    ...successor notation


    The first level of optimization is to notice that L_skip and
L_do_flag_action always return to the same place, reducing function
calling overhead to indirect branch.

[G]
    L_skip:
    	goto L'

    L_do_flag_action:
    	...
    	goto L'

    ...predecessor code:
    goto indirect L
L': /* return place */
    ...successor notation


(By the way, this is an optimization that would be nice in any system that makes
calls through arrays of pointers to functions: like UNIX syscalls (sysent[]),
file system calls (NFS, the System V file system switch), device driver calls
([bc]devsw[]).  99% of these functions are called in one and only one place.)


The next level of optimization is to inline the code with an IF - getting us back
to the explicitly hand coded version:

[IF]
    ...predecessor code:
    IF L == L_skip THEN
    	skip
    ELSE
    	do_flag_action
    FI
    ...successor notation

(Of course, a good programmer might have done this manually: instead
of indicating request_was_flagged in a complicated data structure, she
might have encoded it in a single variable, which she effectively did
with the function pointer.  So, the complexity of the test doesn't
concern us - just the overhead of the IF or call)


The next step is to actually do self-modifying code on the "micro" scale:

[SM]
    ...predecessor code	    L_skip: 	L_do_flag_action:
L1:    ___	 	      	nop 	    do_1
L2:    ___  	    	    	nop 	    do_2
L3:    ___  	    	    	nop 	    do_3
    ...successor code


    ...Elsewhere, conditions have changed
    ...so now the test does not need to be performed,
    ... ie. !request_is_flagged:
    L1 := nop
    L2 := nop
    L3 := nop
    ...flush cache or whatever else is necessary
    ...

    ...Elsewhere, conditions have changed
    ... ie. request_is_flagged:
    L1 := do_1
    L2 := do_2
    L3 := do_3
    ...flush cache or whatever else is necessary
    ...

Optimizations based on the code body might be desirable:

    do_action_1:    	    	do_action_2:
    	step1	    	    	    step1
    	step2.1	    	    	    step2.2
    	step3	    	    	    step3
    	step4.1	    	    	    step4.2
    	step5	    	    	    step5

Instead of 5 instructions changing, only 2 need be changed.


Alternatively, rather than micro self-modifying code, a compiler might
take [IF] and attempt to broaden the domain of the code affected. For
example, if [IF] were called within the body of another indirectly
called function that assumed values A and B, two copies of the bodies
of each of A and B might be made...


Okay, enough time wasted.  I think I have several points winding through
all this:

    (1) Self-modifying code at the "function" level is easily expressible
    	(somebody was trying to make a standard library out of this
    	(I was supposed to help, sorry))

    (2) "Micro" self-modifying code is easily expressible
    	(a) with an augmented language
    	(b) with function pointers.

    (3) It would not be unreasonable for an optimizing compiler to
    	make the sort of optimizations described above automatically,
    	acheiving the performance advantages of "micro" self modifying
    	code from a high level representation.

But, is it worthwhile spending the effort on a compiler that can do this
sort of thing?  Probably not... the situations where "micro" self modifying
code are really necessary are few and far between.  Force the implementors
to do it by hand...   Not commercially important.  Might be fun for
some academic research...

    
--
Andy "Krazy" Glew,  Motorola MCD,    	    	    aglew@urbana.mcd.mot.com
1101 E. University, Urbana, IL 61801, USA.          {uunet!,}uiucuxc!udc!aglew
   
My opinions are my own; I indicate my company only so that the reader
may account for any possible bias I may have towards our products.