Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!van-bc!rsoft!mindlink!a547
From: Chris_Johnsen@mindlink.UUCP (Chris Johnsen)
Newsgroups: comp.sys.amiga.emulations
Subject: Emulator Mechanics (sorry long post)
Message-ID: <4992@mindlink.UUCP>
Date: 4 Mar 91 15:20:48 GMT
Organization: MIND LINK! - British Columbia, Canada
Lines: 282


The following messages were exchanged by email between Charlie
Gibbs and me over the last couple of days.  We decided that it would be a
good idea to share these thoughts with the readership of
comp.sys.amiga.emulations.  The messages speak for themselves for the most
part.

         ------------ Included text of messages ------------

Sun Mar 3 6:40:53 1991  From: Chris Johnsen [547]  Subject : Emulators

Hello Charlie!  Long time no talk to you eh?  I haven't seen you for quite some
time.  I was particularly fond of those after-PaNorAmA meetings at Wendy's a
few years ago.  I trust you recall me from back then.

I have been following the comp.sys.amiga.emulations newsgroup on Usenet for
quite a while now and was quite interested in the IBeM program that has been
posted about recently.  In one of those near Alpha/Theta states that a
programmer such as yourself recognizes as some kind of "source", I had an
idea.  Since there are not many "new" ideas spawned, I wanted to get some
feedback from a person with the expertise you possess.  The reasons I thought
of you for this private consultation are:  1) you've written an emulator -
SimCPM  2) you've written an assembler - A68K  3) you have a broad knowledge of
the computing field beyond lowly PC's and  4) you're such a nice guy!

I was musing about how an emulator would work.  I must confess I have no
concrete knowledge of this.  It has been said that "Not knowing how to do a job
'right' frees you to discover new and better ways of accomplishing the desired
result".  Freeman Patterson (a Canadian photographer), in his book Photography
and the Art of Seeing, called this "thinking sideways".  I believe I've
thought of a way of writing an emulator that would run at least as fast on the
Amiga as it would on the source machine.

My questions for you are:

Would you be interested in being a mentor to me in developing further this idea
or in fact ascertaining whether my concept has any validity?

I'm reluctant to spill this on the net until I'm somewhat confident that it is
indeed practicable.  I have thought about the possible legal considerations of
producing an emulator.  There are potential commercial possibilities that one
could consider also.

Would you please give me just a quick, superficial rundown of the basic
algorithms used in developing an emulator?

I have assumed that you would read in the object module of, say, an IBM
executable, read it by opcode or routine, decipher the intent and then call a
library of glue routines to do the job that the program would have done on an
IBM or clone.  I have no idea how the interrupt structure would be handled but
know that you have done it with SimCPM.

I don't want to waste your time, but I would appreciate this information. I
would like to get your feedback on these questions to see if you are interested
in further discussion with me on this matter.  If you are not interested or
don't have the time I'll certainly understand.  I realize I'm asking you for
more information than I'm sharing but an idea is a tiny property and I'd like
to at least savor this for a while before I decide what to do with it.  That is
the sole reason I have not gone public with it yet.

If you are indeed interested, after you tell me on the highest level how an
emulator functions, I'll be able to describe this idea on some kind of
comprehensible basis. I'm looking forward to your response!  Thanks Charlie.

csj

Sun Mar 3 23:12:32 1991  From: Charlie Gibbs [218] Subject : Emulators

     Indeed I am somewhat short of time these days, but I wouldn't mind kicking
the odd idea around without getting too wrapped up in it.  I do understand your
idea of "thinking sideways" and enjoy being able to do it myself from time to
time.  A similar way I've heard it described is that even though it's been
proven that bumblebees can't fly, they don't realize this and so do it anyway.
Or to put it another way, I like to write programs that are too stupid to know
what they're doing, so they can do anything.

     Your ideas on emulation are basically in line with what's considered the
standard way of doing things.  A machine instruction is analyzed as to just
what it's supposed to do, and appropriate code then carries out the operation.
The guts of SimCPM appeared in Dr. Dobb's Journal as an emulator that was meant
to run under CP/M-68K.  This made the job a bit easier for the original author,
since the CP/M-68K system calls were quite similar to the CP/M calls that he
was emulating.  I had to replace this portion of code with appropriate AmigaDOS
routines.  In addition, I extended the code to handle the full Z-80 instruction
set, since the original code could only handle the 8080 subset.

     Since emulating another processor in software is quite a CPU-intensive
process (several native instructions have to be executed to emulate a single
instruction of the target machine) I tried to optimize SimCPM for speed
at the expense of memory and redundant code.  The overhead of a single
subroutine call, plus any extraction and interpretation of arguments, would
require several times as much time as a hand-picked set of instructions
dedicated to a single opcode.

     For system calls there's an easy out - as soon as I recognize what the
emulated program is trying to do (e.g. read a block from a disk file), I call
the corresponding AmigaDOS routine, so I/O can proceed pretty well at native
speeds.  Therefore, even though CPU-bound programs might run 10% as fast as
they would on the target machine, I/O bound programs might get up to 50% of
speed.
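
The system-call shortcut described above can be sketched roughly as follows.
This is only an illustration, not SimCPM's actual code: it assumes the usual
CP/M convention that a program requests a BDOS service with the function
number in register C and its argument in DE, and the names bdos_call, regs
and console are hypothetical.  Only two functions are handled (2, console
output, and 9, print a '$'-terminated string); a real emulator would map
many more onto host routines.

```python
import io

def bdos_call(regs, memory, out):
    """Service one CP/M BDOS request with host I/O instead of emulated code."""
    func = regs["C"]                       # BDOS function number
    if func == 2:                          # console output: character in E
        out.write(chr(regs["E"]))
    elif func == 9:                        # print string: DE -> '$'-terminated text
        addr = (regs["D"] << 8) | regs["E"]
        while memory[addr] != ord("$"):
            out.write(chr(memory[addr]))
            addr = (addr + 1) & 0xFFFF     # stay inside the 64K address space
    else:
        raise NotImplementedError(f"BDOS function {func} not in this sketch")

# Usage: the emulated program has placed a string at 0200h and asked for
# function 9.  The host prints it directly; no emulated instructions run.
memory = bytearray(65536)
memory[0x0200:0x0206] = b"Hello$"
regs = {"C": 9, "D": 0x02, "E": 0x00}
console = io.StringIO()                    # stand-in for the host's console
bdos_call(regs, memory, console)
```

Because the whole request is carried out by one host routine, the cost of
emulating individual instructions disappears for the duration of the call,
which is why I/O-bound programs fare better than CPU-bound ones.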

     Interrupts were easy - since most CP/M systems don't use hardware
interrupts, except possibly for a very few hardware-dependent programs, I
simply didn't worry about them.  Software interrupts (the RST instruction) were
a snap on the other hand, since they're basically a special-purpose subroutine
call.
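
To show why RST is "basically a special-purpose subroutine call", here is a
minimal sketch of how an emulator might handle it.  It relies on the 8080
definition of RST n (opcode 11nnn111: push the return address, then jump to
address 8*n).  The function name rst and the state dictionary are
hypothetical, and this is not taken from SimCPM.

```python
def rst(state, memory, opcode):
    """Emulate 8080 RST n: push the return address and jump to 8*n."""
    n = (opcode >> 3) & 0x07               # restart number lives in bits 5-3
    ret = state["pc"]                      # pc already points past the 1-byte RST
    sp = state["sp"]
    memory[(sp - 1) & 0xFFFF] = ret >> 8   # high byte goes at the higher address
    memory[(sp - 2) & 0xFFFF] = ret & 0xFF # low byte below it
    state["sp"] = (sp - 2) & 0xFFFF
    state["pc"] = n * 8                    # fixed vector: 0, 8, 16, ... 56

# Usage: RST 5 (opcode EFh) executed with the return address 1235h.
memory = bytearray(65536)
state = {"pc": 0x1235, "sp": 0xF000}
rst(state, memory, 0xEF)
```

Apart from the target address being implied by the opcode rather than
supplied as an operand, this is the same work a CALL emulation would do.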

     The Intel 8080 is fairly easy to emulate because the opcode is uniquely
determined by the first byte of the instruction.  Some instructions might have
register numbers encoded in a few bits of that first byte, but I just treat
them as special cases.  To decode the byte, I just multiply its value by 4
(shift left 2 bits) and use the result as an index into a table of 256 pointers
to the actual emulation routines.  Since there are 64 possible MOV instructions
(well, 63 because one of the bit combinations is actually HLT) I actually have
63 MOV emulations, one for each combination of registers.  This means that I
don't have to do any register decoding, since each routine consists of a
dedicated 68000 MOVE instruction, followed by a jump back to the main emulation
loop.  Lots of almost-redundant code, but it's about as fast as it can get.
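
The table-driven decoding described above might look something like this
in outline.  It is a sketch in Python rather than 68000 assembler, so it
shows only the structure (one table slot per opcode byte, one dedicated
routine per MOV register pair) and none of the speed benefit.  The class
CPU and the helpers make_mov, hlt and run are hypothetical names, and only
the MOV block and HLT are filled in.

```python
REG_NAMES = ["B", "C", "D", "E", "H", "L", "M", "A"]  # 8080 register field order

class Halt(Exception):
    """Raised by the HLT handler to leave the main loop."""

class CPU:
    def __init__(self, program):
        self.mem = bytearray(65536)
        self.mem[:len(program)] = program
        self.regs = dict.fromkeys("BCDEHLA", 0)
        self.pc = 0

    def read(self, name):
        if name == "M":                    # "M" means the byte addressed by HL
            return self.mem[(self.regs["H"] << 8) | self.regs["L"]]
        return self.regs[name]

    def write(self, name, value):
        if name == "M":
            self.mem[(self.regs["H"] << 8) | self.regs["L"]] = value
        else:
            self.regs[name] = value

def make_mov(dst, src):
    """Build one dedicated routine for a single dst/src register pair."""
    def mov(cpu):
        cpu.write(dst, cpu.read(src))
    return mov

def hlt(cpu):
    raise Halt

def unimplemented(cpu):
    raise NotImplementedError("opcode not covered by this sketch")

# One entry per possible first byte.  MOV occupies 40h-7Fh as 01dddsss.
TABLE = [unimplemented] * 256
for opcode in range(0x40, 0x80):
    TABLE[opcode] = make_mov(REG_NAMES[(opcode >> 3) & 7], REG_NAMES[opcode & 7])
TABLE[0x76] = hlt                          # the "MOV M,M" slot is really HLT

def run(cpu):
    """Main emulation loop: fetch a byte, index the table, call the routine."""
    try:
        while True:
            opcode = cpu.mem[cpu.pc]
            cpu.pc = (cpu.pc + 1) & 0xFFFF
            TABLE[opcode](cpu)
    except Halt:
        pass

cpu = CPU(bytes([0x47, 0x48, 0x76]))       # MOV B,A ; MOV C,B ; HLT
cpu.regs["A"] = 0x2A
run(cpu)
```

No register decoding happens at run time: the register pair is fixed when
the table is built, which is the point of having 63 separate MOV routines.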

     This is getting kind of long-winded.  I'd be interested in hearing any
ideas you may have; although I wouldn't have time to get into the programming
of such stuff, I'm willing to act as a sounding board.  Talk to you soon... CJG

Sun Mar 3 23:18:48 1991  From: Charlie Gibbs [218]  Subject : Emulators

     Another approach I've heard of is to "compile" the code to be emulated
into native machine code.  This would involve a front-end program which would
read the target machine's program and analyze the instructions.  For instance,
if the "compiler" detects an instruction that does a move between two of the
emulated machine's registers, it would simply generate a move instruction in
the emulating machine's code.  It could generate either a translated assembly
language source file or a machine-language file ready to load into the
emulating machine.  This would require the "compilation" process to be run once
on the program to be emulated, and you'd then run the output of this
"compiler."  There are special tricks to consider here, such as resolving
addresses - you couldn't just copy the memory addresses across because the
emulated routines would likely be a different size.  It might be easier to
generate a label (e.g. Axxxx where xxxx is the hex address in question) in an
assembly source file and let the emulating machine's assembler sort it all out.
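
As a rough illustration of the label idea, the sketch below translates a
tiny 8080 subset (NOP, HLT, JMP and register-to-register MOV) into
68000-style assembly text, emitting an Axxxx label at every jump target so
that the assembler, not the translator, works out the new addresses.
Everything specific here is an assumption made for the example: the
register mapping in REG_MAP, the choice of RTS for HLT, and the function
names decode and translate.  It also ignores the fact that a 68000 MOVE
changes the condition codes while an 8080 MOV does not.

```python
# Hypothetical mapping of 8080 register codes (B,C,D,E,H,L,-,A) to 68000 registers.
REG_MAP = {0: "D1", 1: "D2", 2: "D3", 3: "D4", 4: "D5", 5: "D6", 7: "D0"}

def decode(code, origin):
    """Pass 1: yield (address, kind, operand) for each instruction."""
    i = 0
    while i < len(code):
        op, addr = code[i], origin + i
        if op == 0x00:
            yield addr, "NOP", None
            i += 1
        elif op == 0x76:
            yield addr, "HLT", None
            i += 1
        elif op == 0xC3:                   # JMP: 16-bit target, low byte first
            yield addr, "JMP", code[i + 1] | (code[i + 2] << 8)
            i += 3
        elif 0x40 <= op <= 0x7F and (op & 7) != 6 and ((op >> 3) & 7) != 6:
            yield addr, "MOV", ((op >> 3) & 7, op & 7)
            i += 1
        else:
            raise ValueError(f"opcode {op:02X} not covered by this sketch")

def translate(code, origin=0x0100):
    """Pass 2: emit one line of assembly text per source instruction."""
    insns = list(decode(code, origin))
    targets = {operand for _, kind, operand in insns if kind == "JMP"}
    lines = []
    for addr, kind, operand in insns:
        label = f"A{addr:04X}:" if addr in targets else ""
        if kind == "MOV":
            dst, src = operand
            text = f"MOVE.B {REG_MAP[src]},{REG_MAP[dst]}"
        elif kind == "JMP":
            text = f"JMP A{operand:04X}"   # symbolic, not a copied address
        elif kind == "NOP":
            text = "NOP"
        else:
            text = "RTS"                   # assumption: program entered as a subroutine
        lines.append(f"{label:<8}{text}")
    return lines

# MOV B,A ; JMP 0105h ; NOP ; HLT (which sits at 0105h)
listing = translate(bytes([0x47, 0xC3, 0x05, 0x01, 0x00, 0x76]))
```

A translator like this breaks down as soon as the program computes a jump
address at run time or modifies its own code, since neither shows up as a
label the first pass can find.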

     I've never actually seen this process in action, but it's another
possibility.  --CJG

Mon Mar 4 12:38:31 1991  From: Chris Johnsen [547]  Subject : Emulators

Thanks a lot for your effort in explaining SimCPM to me, man.  As you describe
it, it would seem that I had intuitively understood the basic concepts.  I
would think that interrupts would be the hardest part to get down to reliable
operation.  What I had in mind, while thinking about this in general terms,
was an emulator that was non-specific as to the machine.  I was therefore
attempting to contemplate it handling, say, IBM, Mac (hey, it may even be
possible to deal with Amiga emulation!) and Atari ST on the Amiga, and
imagining what the various architectures would require.  All this on a very
abstract level.

Your second message hit the nail on the head!  I got bogged down at about the
level you describe in your first message.  Lots of details to be sought and
worked out.  Gee, I'm really not even available to code another program just
yet anyway.  I was giving the whole concept a rest when I thought of something,
kind of sideways (lazy minds tend to look for an easier way around an
obstacle, sometimes unconsciously, even though this can lead to harder, though
more elegant, solutions to problems): read the opcodes from the "source
executable" of the emulated machine, producing an assembly listing of the
program.  This, I imagine, would be a two-pass process, sort of like a C
compiler, followed by an assembler's two passes, and finished off with a
linker.

I thought that, if the compiler was "intelligent" enough, the output, though
likely larger, would be much faster than the common "interpreter type
emulator".  I had never heard of such an idea and since there are none out
there, wanted to discuss this with you.  I have developed the idea no further
than this in essence.

I did think of a few other considerations however.  If one could, indeed,
compile an executable image of say Lotus 123 from the IBM into a program which,
on a base Amiga, could run at half speed, or on a A2500 or A3000 at twice the
speed, it would be a viable alternative, besides being a neat toy.  However,
the standalone program generated would likely infringe on the copyright of
Lotus because the Amiga executable would actually be Lotus 123.  Take
WordPerfect, for instance.  The latest version available for the Amiga is 4.1
or just a micro-point higher.  No problem: get hold of the IBM version (5.1, I
believe), compile it, and you have something some other people are wailing for.
Of course the rebuttal (I can hear you thinking?) is that, if a person owns
WordPerfect he has an inalienable right to run it. Run it on an IBM.  Run it on
a clone.  Run it through an Amiga compiler.  You know, if it's for personal
use, etc.

As to the increase in size of the "compiled emulation" program, I have a couple
of ideas.  First, the executable, though larger, would be standalone, except
for any support libraries.  This doesn't mean that this "form" of emulator,
more like a "translator of executables", would be any less efficient than the
"interpreter type".  Perhaps more memory efficient in a couple of possible
ways.  Since the interpretation section of the program is in the compiler, and
the source executable is not required at runtime, memory usage may well be less
with a "compiled emulation".  The second concept is to use link libraries which
would bind only the emulation routines required to the final program.  Possibly
a combination of bound-at-link-time modules of less frequently used routines
and a shared library of essential routines all programs would need.  A solely
link library approach would leave this concept open to claims that pirates
could produce "warez" that need no extra code or setup to work.  Of course,
pirates appear to be capable of ripping anything off anyway!

This "compiled emulation" would, given sufficient memory and CPU
speed/efficiency, allow the running of multiple programs.  Both emulated
programs and standard Amiga programs.  Through the use of a shared library more
than one emulating program could be run without the overhead of multiple
"emulation interpreters" resident in memory.

The emulation compiler could generate C statements so that you could take
advantage of advances in the C compiler, assembler and linker without having
to deal directly with those parts of the system.  I know this would make the
whole operation more unwieldy: more steps, so it would take longer.  But
theoretically the source program is bugless, so you would expect the output of
this "emulation compiler" to either succeed or fail.  You'd run the emulation
compiler on the program only once.  The beauty of producing assembler (C would
be better here) is that if it didn't work first time, a programmer type could
patch it up in source and get it running.  I'm really intrigued by this idea.
Where did you hear about it, do you remember?

My knee-jerk reaction initially was to file the idea, but then I got to
wondering why no one had done it.  There were many emulators out there for
various source machines.  Why were none of them compilers?  Another idea I had
was to contact the dude in Oz (or is he a Kiwi?) that wrote IBeM.  He already
has the emulation working except for the parallel and serial ports. It would
appear that he reads the IBM object code, deciphers it and runs a routine, or
simply does a MOVE using an opcode lookup table, as you suggest; an
interpreter.  If he instead simply wrote out an instruction in ASCII to do the
call or move, using a shared library of his emulation routines, he'd
basically have it.  The end user would also have to have an assembler or C
compiler, however.  This type of approach has got to produce faster emulation,
if it is possible.  I believe it to be.

Anyway, that's what I had in mind Charlie.  I really do appreciate your
feedback on this.  Care to comment on any directions you think could be
followed?  Do you know anyone with enough venture capital to fund the further
development of this concept? ;-)  Do you think I should approach the author of
IBeM (cute name) directly?  Or, should this private discussion we've been
having be moved to Usenet?  Thanks again Charlie!   I appreciate you man.

csj

Mon Mar 4 15:36:53 1991  From: Charlie Gibbs [218] Subject : Emulators

     I can't remember where I first heard of the idea.  The converted code
won't necessarily be smaller than the original, depending on the relative sizes
of corresponding machine instructions on both machines.  However, if you could
make the compiler really smart it might be able to recognize certain sequences
of instructions and replace them by sequences designed to accomplish the same
thing more efficiently.  For instance, since the 8080 doesn't have a multiply
instruction it needs to fake it with a bunch of adds and shifts.  A smart
compiler, if it could recognize such a routine, could replace it with a single
68000 multiply instruction and see huge savings.
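
Recognizing a full shift-and-add multiply routine is well beyond a short
example, so the sketch below substitutes a much simpler case of the same
kind of sequence replacement: a run of 8080 "DAD H" instructions (each one
doubles HL) is folded into a single 68000 shift.  The function name
fold_doublings is hypothetical, and so is the assumption that HL is kept as
one 16-bit word in D7.  It also assumes no jump lands in the middle of a
run, which a real translator would have to check.

```python
def fold_doublings(mnemonics, hl_reg="D7"):
    """Replace each run of 'DAD H' with one 68000 shift of the HL register."""
    out = []
    i = 0
    while i < len(mnemonics):
        if mnemonics[i] == "DAD H":
            run = 0
            # An immediate shift count on the 68000 is limited to 1..8,
            # so longer runs are split into several shifts.
            while i < len(mnemonics) and mnemonics[i] == "DAD H" and run < 8:
                run += 1
                i += 1
            out.append(f"LSL.W #{run},{hl_reg}")
        else:
            # Anything else is passed through as a comment in this sketch.
            out.append(f"; untranslated: {mnemonics[i]}")
            i += 1
    return out

# Usage: HL * 8 written as three doublings becomes one shift by 3.
folded = fold_doublings(["DAD H", "DAD H", "DAD H", "RET"])
```

The matching shape is the same for the multiply case: spot a known
sequence, confirm nothing else depends on its intermediate results, and
emit the shorter native equivalent.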

     I'd stay away from calling subroutines; the overhead could kill you.

     The copyright issue could be a sticky one, although I can't see any
problems if you run the converter on your own copy of the emulated software and
don't try to sell the result.  It would no doubt be classified as a "derivative
work".

     Perhaps it might be interesting to throw this discussion out to Usenet. It
won't be a trivial job, which is probably why we haven't seen it done
elsewhere.  Remember that a straight machine-code emulation duplicates all the
register fiddling that is required by the target machine's architecture (and
the 80x86 family needs a LOT of register fiddling).  This code is replaced by
the 680x0's own internal fiddling if you're re-compiling source code.  One way
of looking at it is to decompile the original machine code, then recompile it
for the new machine.

     Interesting stuff...  CJG

     ------------ End of included text of messages. -----------

Charlie Gibbs and I both frequent this newsgroup and look forward to any
additions to this discussion that others may offer.  Sorry that the posting
is so long, but I felt there was little enough chaff in the messages to
warrant including all of them.

csj

The hard way is usually the disguised easy way, you take your choice. Usenet:
a542@mindlink.UUCP Phone: (604)853-5426 FAX: (604)854-8104