Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!pt.cs.cmu.edu!sei!firth
From: firth@sei.cmu.edu (Robert Firth)
Newsgroups: comp.arch
Subject: Re: Self-modifying code
Message-ID: <4415@bd.sei.cmu.edu>
Date: 10 Oct 89 19:17:14 GMT
References: <1080@mipos3.intel.com>
Reply-To: firth@sei.cmu.edu (Robert Firth)
Organization: Software Engineering Institute, Pittsburgh, PA
Lines: 63

In article <1080@mipos3.intel.com> jpoon@mipos2.intel.com (Jack Poon~) writes:

>Could any experts out there educate me WHY and HOW does self-modifying code use?
>What the advantage of using self-modifying code that non-self-modifying code
>cannot achieve?

In the spirit of this question, I won't discuss why the current consensus
is that self-modifying code is bad, but rather will give examples.

Directly self-modifying code is pretty rare nowadays, but used to be
an essential tool.  Consider how you access an array element, A[i],
on a modern machine:

	Load index-register, i
	Load accumulator, address-of-A[0](index-register)

Some older machines didn't have index registers, so your compiler
generated this:

	Load accumulator, i
	Add accumulator, address-of-A[0]
	Store accumulator, right-half-of-instruction-L
L:	Load accumulator, contents-of-address-0

So, by the time you executed L, the right operand had been modified to
read 'contents-of-address-A[i]'.  And, basically, you HAD to do it this
way.
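
For anyone who wants to watch the patching happen, here is a minimal
sketch of that sequence on a toy one-accumulator machine.  The opcode
names (LOAD, ADDI, STADDR, HALT) and the memory layout are invented for
this illustration and do not correspond to any real instruction set.

```python
# Toy single-accumulator machine with no index register.
# Instructions are (opcode, operand) tuples; data words are plain ints.
# Code and data share one memory, which is what makes the trick possible.

def run(memory, pc=0):
    """Execute from pc until HALT and return the accumulator."""
    acc = 0
    while True:
        op, operand = memory[pc]
        pc += 1
        if op == "LOAD":          # acc <- contents of address `operand`
            acc = memory[operand]
        elif op == "ADDI":        # acc <- acc + constant `operand`
            acc += operand
        elif op == "STADDR":      # store acc into the address field of
            target_op, _ = memory[operand]   # the instruction at `operand`
            memory[operand] = (target_op, acc)
        elif op == "HALT":
            return acc

def build_program(i, array):
    """Lay out code, then the variable i, then the array A."""
    addr_i = 5
    addr_a0 = 6
    code = [
        ("LOAD", addr_i),     # 0: Load accumulator, i
        ("ADDI", addr_a0),    # 1: Add accumulator, address-of-A[0]
        ("STADDR", 3),        # 2: Store accumulator, right-half-of-L
        ("LOAD", 0),          # 3: L: Load accumulator, contents-of-address-0
        ("HALT", 0),          # 4
    ]
    return code + [i] + list(array)

memory = build_program(2, [10, 20, 30, 40])
result = run(memory)          # memory[3] is rewritten before it executes
```

After the run, the instruction at L no longer names address 0; it names
the address of A[2], and the accumulator holds that element.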

Some debuggers still play this trick.  If you set a breakpoint at L,
the debugger actually overwrites instruction L with 'trap-to-debugger',
of course saving the old instruction for subsequent execution.
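
A rough sketch of the breakpoint trick, again on a made-up machine: the
ToyDebugger class, its method names and the ADD/MUL/TRAP opcodes are all
hypothetical, and a real debugger would patch machine instructions in
the target's address space rather than tuples in a list.

```python
class ToyDebugger:
    """Plants breakpoints by overwriting instructions in place."""

    def __init__(self, code):
        self.code = code      # mutable list of (opcode, operand) tuples
        self.saved = {}       # address -> original instruction
        self.hits = []        # (address, accumulator) at each trap

    def set_breakpoint(self, addr):
        self.saved[addr] = self.code[addr]     # save the old instruction
        self.code[addr] = ("TRAP", None)       # overwrite with the trap

    def run(self):
        acc = 0
        for pc in range(len(self.code)):
            op, arg = self.code[pc]
            if op == "TRAP":
                # Control reaches the debugger; record the state, then
                # execute the saved original on the program's behalf.
                self.hits.append((pc, acc))
                op, arg = self.saved[pc]
            if op == "ADD":
                acc += arg
            elif op == "MUL":
                acc *= arg
        return acc

program = [("ADD", 2), ("MUL", 5), ("ADD", 1)]
dbg = ToyDebugger(program)
dbg.set_breakpoint(1)
final = dbg.run()
```

The program computes the same answer with or without the breakpoint,
because the displaced instruction is still obeyed after the trap.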

----

Rather more common are systems where code is generated and executed
dynamically.  Of course, any code overlay system is like that: a data
area is allocated, filled with bits read from a file, and then jumped
into and obeyed as code.  A more sophisticated system allows the
dynamic replacement of code bodies by alternative versions; thus, if
during debugging you suspect the new version of "munge()" is bad, you
replace it with the old version, read in from some file in the program
development environment, and carry on running.

This doesn't need true self-modifying code; rather, it needs dynamic
variability of procedure bodies (e.g. called via pointers), and the
ability to remap pieces of memory from data space into code space.
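
As an illustration only, the two ingredients can be sketched like this.
The table `procedures`, the helpers `load_body` and `call`, and both
versions of munge are assumptions made up for the example; the source
strings stand in for bodies read from a file, and Python's exec stands
in for remapping data space as code space.

```python
procedures = {}   # name -> current body; every call goes through here

def load_body(name, source_text):
    """Turn a piece of data (source text) into a callable body."""
    namespace = {}
    exec(source_text, namespace)        # data becomes code at this point
    procedures[name] = namespace[name]  # swap the pointer, not the caller

def call(name, *args):
    """Indirect call: look the body up at the moment of the call."""
    return procedures[name](*args)

# Stand-ins for two versions kept in the development environment.
NEW_MUNGE = "def munge(x):\n    return x - 1\n"   # suspected bad
OLD_MUNGE = "def munge(x):\n    return x + 1\n"   # known good

def driver(x):
    return call("munge", x) * 2

load_body("munge", NEW_MUNGE)
before = driver(3)                # runs the new version
load_body("munge", OLD_MUNGE)     # replace the body and carry on
after = driver(3)                 # same caller, old version
```

Note that driver is never touched; only the entry in the table changes,
which is why no instruction has to modify itself.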

----

Finally, one very powerful technique is used in some implementations
of prototyping languages.  Typically, such languages are interpreted,
so each statement in 'Object Logo Plus', or whatever, is turned into
a data structure that is crawled over by an interpreter.

The trick is this: keep track of how often a code fragment (procedure,
method, or whatever) is modified, and how often it is executed.  When
a certain stability threshold is reached, compile the sucker into true
machine code, and have all subsequent executions be direct rather than
interpreted.  Over a prototyping session of an hour or two, the user
will observe a speedup of 2x to 20x, as the more stable and heavily used
parts of the program get converted gradually into machine code.
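
A minimal sketch of the bookkeeping, under several assumptions: the
threshold value, the Fragment class and the tiny expression-tree
language are invented here, the stability test is simplified to "runs
since the last edit", and Python's eval of a generated lambda stands in
for emitting true machine code.

```python
STABILITY_THRESHOLD = 3   # assumed value; a real system would tune this

def interpret(tree, x):
    """Crawl over the data structure each time it is executed."""
    if isinstance(tree, tuple):
        op, left, right = tree
        a, b = interpret(left, x), interpret(right, x)
        return a + b if op == "+" else a * b
    return x if tree == "x" else tree

def to_source(tree):
    """Translate the tree into an expression the host can compile."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return f"({to_source(left)} {op} {to_source(right)})"
    return str(tree)

class Fragment:
    def __init__(self, tree):
        self.tree = tree
        self.runs_since_edit = 0
        self.compiled = None

    def modify(self, tree):
        # An edit makes the fragment unstable again: drop compiled code.
        self.tree = tree
        self.runs_since_edit = 0
        self.compiled = None

    def execute(self, x):
        if self.compiled is not None:
            return self.compiled(x)          # direct execution
        self.runs_since_edit += 1
        if self.runs_since_edit >= STABILITY_THRESHOLD:
            self.compiled = eval("lambda x: " + to_source(self.tree))
        return interpret(self.tree, x)       # still interpreted this time

frag = Fragment(("+", ("*", "x", 2), 1))     # x * 2 + 1
results = [frag.execute(5) for _ in range(4)]
```

The first three calls are interpreted; the third crosses the threshold,
so the fourth and later calls go straight to the compiled form until
the fragment is next modified.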

Hope that helps.