Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!van-bc!rsoft!mindlink!a547
From: Chris_Johnsen@mindlink.UUCP (Chris Johnsen)
Newsgroups: comp.sys.amiga.emulations
Subject: Emulator Mechanics (sorry long post)
Message-ID: <4992@mindlink.UUCP>
Date: 4 Mar 91 15:20:48 GMT
Organization: MIND LINK! - British Columbia, Canada
Lines: 282

The following messages were shared in an email environment between Charlie Gibbs and me over the last couple of days. We decided that it would be a good idea to share these thoughts with the readership of comp.sys.amiga.emulations. The messages speak for themselves for the most part.

------------ Included text of messages ------------

Sun Mar 3 6:40:53 1991
From: Chris Johnsen [547]
Subject : Emulators

Hello Charlie! Long time no talk to you, eh? I haven't seen you for quite some time. I was particularly fond of those after-PaNorAmA meetings at Wendy's a few years ago. I trust you recall me from back then.

I have been following the comp.sys.amiga.emulations newsgroup on Usenet for quite a while now and was quite interested in the IBeM program that is being posted about recently. In one of those near-Alpha/Theta states that a programmer such as yourself recognizes as some kind of "source", I had an idea. Since there are not many "new" ideas spawned, I wanted to get some feedback from a person with the expertise you possess. The reasons I thought of you for this private consultation are:

1) you've written an emulator - SimCPM
2) you've written an assembler - A68K
3) you have a broad knowledge of the computing field beyond lowly PCs and
4) you're such a nice guy!

I was musing about how an emulator would work. I must confess I have no concrete knowledge of this. It has been said that "Not knowing how to do a job 'right' frees you to discover new and better ways of accomplishing the desired result".
Freeman Patterson (a Canadian photographer), in his book Photography and the Art of Seeing, called this "thinking sideways". I believe I've thought of a way of writing an emulator that would run at least as fast on the Amiga as it would on the source machine.

My questions for you are:

Would you be interested in being a mentor to me in developing this idea further, or in fact ascertaining whether my concept has any validity? I'm reluctant to spill this on the net until I'm somewhat confident that it is indeed practicable. I have thought about the possible legal considerations of producing an emulator. There are potential commercial possibilities that one could consider also.

Would you please give me just a quick, superficial rundown of the basic algorithms used in developing an emulator? I have assumed that you would read in the object module of, say, an IBM executable, read it by opcode or routine, decipher the intent and then call a library of glue routines to do the job that the program would have done on an IBM or clone. I have no idea how the interrupt structure would be handled but know that you have done it with SimCPM. I don't want to waste your time, but I would appreciate this information.

I would like to get your feedback on these questions to see if you are interested in further discussion with me on this matter. If you are not interested or don't have the time I'll certainly understand. I realize I'm asking you for more information than I'm sharing, but an idea is a tiny property and I'd like to at least savor this for a while before I decide what to do with it. That is the sole reason I have not gone public with it yet. If you are indeed interested, after you tell me on the highest level how an emulator functions, I'll be able to describe this idea on some kind of comprehensible basis.

I'm looking forward to your response! Thanks Charlie.
csj

Sun Mar 3 23:12:32 1991
From: Charlie Gibbs [218]
Subject : Emulators

Indeed I am somewhat short of time these days, but I wouldn't mind kicking the odd idea around without getting too wrapped up in it. I do understand your idea of "thinking sideways" and enjoy being able to do it myself from time to time. A similar way I've heard it described is that even though it's been proven that bumblebees can't fly, they don't realize this and so do it anyway. Or to put it another way, I like to write programs that are too stupid to know what they're doing, so they can do anything.

Your ideas on emulation are basically in line with what's considered the standard way of doing things. A machine instruction is analyzed as to just what it's supposed to do, and appropriate code then carries out the operation. The guts of SimCPM appeared in Dr. Dobb's Journal as an emulator that was meant to run under CP/M-68K. This made the job a bit easier for the original author, since the CP/M-68K system calls were quite similar to the CP/M calls that he was emulating. I had to replace this portion of code with appropriate AmigaDOS routines. In addition, I extended the code to handle the full Z-80 instruction set, since the original code could only handle the 8080 subset.

Since emulating another processor in software is quite a CPU-intensive process (several machine instructions have to be executed to emulate a single machine instruction on the target machine) I tried to optimize SimCPM for speed at the expense of memory and redundant code. The overhead of a single subroutine call, plus any extraction and interpretation of arguments, would require several times as much time as a hand-picked set of instructions dedicated to a single opcode. For system calls there's an easy out - as soon as I recognize what the emulated program is trying to do (e.g. read a block from a disk file), I call the corresponding AmigaDOS routine, so I/O can proceed pretty well at native speeds.
Therefore, even though CPU-bound programs might run 10% as fast as they would on the target machine, I/O bound programs might get up to 50% of speed.

Interrupts were easy - since most CP/M systems don't use hardware interrupts, except possibly for a very few hardware-dependent programs, I simply didn't worry about them. Software interrupts (the RST instruction) were a snap on the other hand, since they're basically a special-purpose subroutine call.

The Intel 8080 is fairly easy to emulate because the opcode is uniquely determined by the first byte of the instruction. Some instructions might have register numbers encoded in a few bits of that first byte, but I just treat them as special cases. To decode the byte, I just multiply its value by 4 (shift left 2 bits) and use the result as an index into a table of 256 pointers to the actual emulation routines. Since there are 64 possible MOV instructions (well, 63 because one of the bit combinations is actually HLT) I actually have 63 MOV emulations, one for each combination of registers. This means that I don't have to do any register decoding, since each routine consists of a dedicated 68000 MOVE instruction, followed by a jump back to the main emulation loop. Lots of almost-redundant code, but it's about as fast as it can get.

This is getting kind of long-winded. I'd be interested in hearing any ideas you may have; although I wouldn't have time to get into the programming of such stuff, I'm willing to act as a sounding board.

Talk to you soon... CJG

Sun Mar 3 23:18:48 1991
From: Charlie Gibbs [218]
Subject : Emulators

Another approach I've heard of is to "compile" the code to be emulated into native machine code. This would involve a front-end program which would read the target machine's program and analyze the instructions.
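A side note on the table-driven decode described for SimCPM above. What follows is only a rough Python sketch of that idea, not SimCPM's code (which is 68000 assembly). The names CPU, TABLE, make_mov, make_mvi and run are made up for illustration, only NOP, MVI, MOV and HLT are handled, and flags are ignored. Python list indexing stands in for the "multiply by 4 and index the pointer table" step, and a generated closure stands in for each dedicated MOVE routine.

```python
# Hypothetical sketch of one-routine-per-opcode dispatch for a tiny 8080 subset.
# Register field encoding on the 8080: 0=B 1=C 2=D 3=E 4=H 5=L 6=M (memory at HL) 7=A.
REGS = ["B", "C", "D", "E", "H", "L", "M", "A"]


class CPU:
    def __init__(self, program):
        self.mem = bytearray(65536)
        self.mem[:len(program)] = program
        self.reg = dict.fromkeys("BCDEHLA", 0)
        self.pc = 0
        self.halted = False

    def hl(self):
        return (self.reg["H"] << 8) | self.reg["L"]

    def read(self, name):
        return self.mem[self.hl()] if name == "M" else self.reg[name]

    def write(self, name, value):
        if name == "M":
            self.mem[self.hl()] = value
        else:
            self.reg[name] = value

    def fetch(self):
        byte = self.mem[self.pc]
        self.pc = (self.pc + 1) & 0xFFFF
        return byte


def make_mov(dst, src):
    # One routine per register pair, so nothing is decoded at run time.
    def mov(cpu):
        cpu.write(dst, cpu.read(src))
    return mov


def make_mvi(dst):
    def mvi(cpu):
        cpu.write(dst, cpu.fetch())  # immediate operand follows the opcode
    return mvi


def nop(cpu):
    pass


def hlt(cpu):
    cpu.halted = True


def unimplemented(cpu):
    opcode = cpu.mem[(cpu.pc - 1) & 0xFFFF]
    raise NotImplementedError(f"opcode {opcode:#04x} not in this sketch")


# The 256-entry table: the opcode byte alone selects the routine.
TABLE = [unimplemented] * 256
TABLE[0x00] = nop
for d in range(8):
    TABLE[0x06 | (d << 3)] = make_mvi(REGS[d])                # MVI r,d8
    for s in range(8):
        TABLE[0x40 | (d << 3) | s] = make_mov(REGS[d], REGS[s])  # MOV r,r'
TABLE[0x76] = hlt  # the "MOV M,M" bit pattern is really HLT


def run(cpu):
    # Main emulation loop: fetch a byte, jump through the table, repeat.
    while not cpu.halted:
        TABLE[cpu.fetch()](cpu)
    return cpu
```

For example, run(CPU(bytes([0x3E, 0x2A, 0x47, 0x76]))) executes MVI A,2Ah / MOV B,A / HLT and leaves 2Ah in register B. Overwriting entry 76h after the loop is what brings the 64 MOV bit patterns down to the 63 routines mentioned above.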
For instance, if the "compiler" detects an instruction that does a move between two of the emulated machine's registers, it would simply generate a move instruction in the emulating machine's code. It could generate either a translated assembly language source file or a machine-language file ready to load into the emulating machine. This would require the "compilation" process to be run once on the program to be emulated, and you'd then run the output of this "compiler."

There are special tricks to consider here, such as resolving addresses - you couldn't just copy the memory addresses across because the emulated routines would likely be a different size. It might be easier to generate a label (e.g. Axxxx where xxxx is the hex address in question) in an assembly source file and let the emulating machine's assembler sort it all out.

I've never actually seen this process in action, but it's another possibility.

--CJG

Mon Mar 4 12:38:31 1991
From: Chris Johnsen [547]
Subject : Emulators

Thanks a lot for your effort in explaining SimCPM to me, man. As you describe it, it would seem that I had intuitively understood the basic concepts. I would think that interrupts would be the hardest part to get down to reliable operation. What I had in mind, while thinking about this in general terms, was an emulator that was non-specific as to the machine. I was therefore attempting to contemplate it handling, say, IBM, Mac (hey, it may even be possible to deal with Amiga emulation!) and Atari ST on the Amiga, and imagining what the various architectures would require. All this on a very abstract level.

Your second message hit the nail on the head! I got bogged down at about the level you describe in your first message. Lots of details to be sought and worked out. Gee, I'm really not even available to code another program just yet anyway.
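The "compile" approach with Axxxx labels from Charlie's second message can be sketched in Python. This is a hedged illustration only: translate, DREG and the register mapping (8080 A to D0, B to D1, and so on) are assumptions, it handles just NOP, register-to-register MOV, MVI and JMP, and it labels every instruction rather than only jump targets. It also ignores a real difficulty: 8080 MOV leaves the flags alone while 68000 MOVE sets condition codes, so a faithful translator would have to account for that.

```python
# Hypothetical sketch: translate a few 8080 opcodes into 68000-style assembly
# text, labelling each instruction Axxxx (xxxx = original hex address) so the
# assembler, not the translator, resolves the new addresses.

# Assumed mapping from 8080 register field to a 68000 data register.
# Field 6 (M, memory at HL) is deliberately left out of this sketch.
DREG = {7: "D0", 0: "D1", 1: "D2", 2: "D3", 3: "D4", 4: "D5", 5: "D6"}


def translate(code, origin=0x0100):
    """Return a list of assembly source lines for the 8080 bytes in `code`."""
    lines = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        dst = (op >> 3) & 7
        src = op & 7
        if op == 0x00:                                  # NOP
            text, size = "NOP", 1
        elif op == 0xC3:                                # JMP addr (little-endian)
            target = code[pc + 1] | (code[pc + 2] << 8)
            text, size = f"JMP A{target:04X}", 3
        elif op & 0xC7 == 0x06 and dst != 6:            # MVI r,d8
            text, size = f"MOVE.B #${code[pc + 1]:02X},{DREG[dst]}", 2
        elif op & 0xC0 == 0x40 and dst != 6 and src != 6:  # MOV r,r'
            text, size = f"MOVE.B {DREG[src]},{DREG[dst]}", 1
        else:
            raise NotImplementedError(
                f"opcode {op:#04x} at {origin + pc:04X} not in this sketch")
        lines.append(f"A{origin + pc:04X}: {text}")
        pc += size
    return lines
```

Given the bytes 3E 2A 47 C3 00 01 loaded at 0100h (MVI A,2Ah / MOV B,A / JMP 0100h), translate returns the three lines "A0100: MOVE.B #$2A,D0", "A0102: MOVE.B D0,D1" and "A0103: JMP A0100". The jump target survives as a label even though the translated instructions no longer occupy the original addresses. Computed jumps and self-modifying code would defeat this simple scheme.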
I was giving the whole concept a rest when what I thought of, kind of sideways (lazy minds tend to look for an easier way around an obstacle, sometimes unconsciously, even though this can lead to harder, though more elegant, solutions to problems), was to read the opcodes from the "source executable" of the emulated machine, producing an assembly listing of the program. This I imagine would be a two pass process, sort of like a C compiler, followed by an assembler's two passes, and finished off with a linker. I thought that, if the compiler was "intelligent" enough, the output, though likely larger, would be much faster than the common "interpreter type emulator". I had never heard of such an idea and, since there are none out there, wanted to discuss this with you. I have developed the idea no further than this in essence.

I did think of a few other considerations however. If one could, indeed, compile an executable image of, say, Lotus 123 from the IBM into a program which, on a base Amiga, could run at half speed, or on an A2500 or A3000 at twice the speed, it would be a viable alternative, besides being a neat toy. However, the standalone program generated would likely infringe on the copyright of Lotus because the Amiga executable would actually be Lotus 123. Take WordPerfect for instance. The latest version available is 4.1 or just a micro-point higher. No problem: get hold of the IBM version (5.1, I believe it is), compile it, and you have something some other people are wailing for. Of course the rebuttal (I can hear you thinking?) is that, if a person owns WordPerfect he has an inalienable right to run it. Run it on an IBM. Run it on a clone. Run it through an Amiga compiler. You know, if it's for personal use, etc.

As to the increase in size of the "compiled emulation" program, I have a couple of ideas. First, the executable, though larger, would be standalone, except for any support libraries.
This doesn't mean that this "form" of emulator, more like a "translator of executables", would be any less efficient than the "interpreter type". Perhaps more memory efficient in a couple of possible ways. Since the interpretation section of the program is in the compiler, and the source executable is not required at runtime, memory usage may well be less with a "compiled emulation".

The second concept is to use link libraries which would bind only the emulation routines required to the final program. Possibly a combination of bound-at-link-time modules of less frequently used routines and a shared library of essential routines all programs would need. A solely link library approach would leave this concept open to claims that pirates could produce "warez" that need no extra code or setup to work. Of course, pirates appear to be capable of ripping anything off anyway!

This "compiled emulation" would, given sufficient memory and CPU speed/efficiency, allow the running of multiple programs. Both emulated programs and standard Amiga programs. Through the use of a shared library more than one emulating program could be run without the overhead of multiple "emulation interpreters" resident in memory.

The compiler could generate C statements so that you could take advantage of the advancements in technology in the compiler, assembler and linker, without having to deal, directly, with those parts of the system. I know this would make the compiler operation more unwieldy. More operations, therefore it would take longer, but theoretically the source is bugless, so you would expect the output of this "emulation compiler" to either succeed or fail. You'd run the emulator on the program only once. The beauty of producing assembler (C would be better here), is that if it didn't work first time, a programmer type could patch it up in source and get it running.

I'm really intrigued by this idea. Where did you hear about it, do you remember?
My knee-jerk reaction initially was to file the idea, but then I got to wondering why no one had done it. There were many emulators out there for various source machines. Why were none of them compilers?

Another idea I had was to contact the dude in Oz (or is he a Kiwi?) that wrote IBeM. He already has the emulation working except for the parallel and serial ports. It would appear that he reads the IBM object code, deciphers it and runs a routine, or simply does a MOVE using an opcode lookup table, as you suggest: an interpreter. If he instead simply wrote out an instruction in ASCII to do the call or move, using a shared library of his emulation routines, he'd basically have it. The end user would also have to have an assembler or C compiler, however. This type of approach has got to produce faster emulation, if it is possible. I believe it to be.

Anyway, that's what I had in mind Charlie. I really do appreciate your feedback on this. Care to comment on any directions you think could be followed? Do you know anyone with enough venture capital to fund the further development of this concept? ;-) Do you think I should approach the author of IBeM (cute name) directly? Or, should this private discussion we've been having be moved to Usenet?

Thanks again Charlie! I appreciate you man.

csj

Mon Mar 4 15:36:53 1991
From: Charlie Gibbs [218]
Subject : Emulators

I can't remember where I first heard of the idea. The converted code won't necessarily be smaller than the original, depending on the relative sizes of corresponding machine instructions on both machines. However, if you could make the compiler really smart it might be able to recognize certain sequences of instructions and replace them by sequences designed to accomplish the same thing more efficiently. For instance, since the 8080 doesn't have a multiply instruction it needs to fake it with a bunch of adds and shifts.
A smart compiler, if it could recognize such a routine, could replace it with a single 68000 multiply instruction and see huge savings. I'd stay away from calling subroutines; the overhead could kill you.

The copyright issue could be a sticky one, although I can't see any problems if you run the converter on your own copy of the emulated software and don't try to sell the result. It would no doubt be classified as a "derivative work".

Perhaps it might be interesting to throw this discussion out to Usenet. It won't be a trivial job, which is probably why we haven't seen it done elsewhere. Remember that a straight machine-code emulation duplicates all the register fiddling that is required by the target machine's architecture (and the 80x86 family needs a LOT of register fiddling). This code is replaced by the 680x0's own internal fiddling if you're re-compiling source code. One way of looking at it is to decompile the original machine code, then recompile it for the new machine.

Interesting stuff... CJG

------------ End of included text of messages. -----------

Charlie Gibbs and I both frequent this newsgroup and look forward to any additions to this discussion that others may offer. Sorry that the posting is so long, but I felt there was little enough chaff contained in the messages to warrant including all of them.

csj

The hard way is usually the disguised easy way, you take your choice.
Usenet: a542@mindlink.UUCP
Phone: (604)853-5426
FAX: (604)854-8104