Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!tut.cis.ohio-state.edu!ucbvax!hplabs!hp-pcd!hpcvra!randys
From: randys@hpcvra.CV.HP.COM (Randy Stockberger)
Newsgroups: comp.sys.handhelds
Subject: Re: Re: HP-48 Emulator
Message-ID: <21580014@hpcvra.CV.HP.COM>
Date: 10 Mar 90 18:34:25 GMT
References: <6724@hydra.gatech.EDU>
Organization: Hewlett-Packard Co., Corvallis, OR, USA
Lines: 137

With the current discussion about emulating the Saturn CPU on a PC I
was curious about what some of the problems might be in doing that.
Since the PC CPU and the Saturn CPU have slightly different
architectures I feel it is safe to assume that there will be some
subtle problems in the emulation.  I have not given the task enough
consideration to understand what some of these problems might be.
However, I did code up a couple of instructions which I arbitrarily
decided are representative of the task of emulating Saturn.

First of all I made some assumptions about how to structure the
program.  I assumed that the 64 bit CPU registers are encoded as one
nibble per byte, low nibble in low memory.  This allows the field
select instructions to operate without having to pack and unpack
nibbles.  The downside of this is that byte aligned BCD instructions
have to be carried out a nibble at a time.  I think that if a packed
nibble format were used then byte aligned BCD operations could work on
two nibbles per loop, but the odd nibbles would cause considerable
trouble.  Also, if the nibbles were packed it would be easier to do
non-BCD operations.

For organizing memory I made the opposite decision: nibbles are packed,
two per byte.  Using this organization it is possible to squeeze the
entire MegaNibble of Saturn address space (512K bytes when packed) into
a DOS program, but that probably does not leave enough extra space to
accommodate the Saturn emulator code.  I have some real doubts about
being able to emulate a complete Saturn and memory space on an
8086/80286 class CPU.
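
The two storage layouts described above can be sketched in C rather
than 8086 assembly.  This is only an illustration under stated
assumptions: the names SatReg, sat_mem, mem_read_nibble,
mem_write_nibble and load_field are hypothetical, and placing the
even-addressed nibble in the high half of each byte is an assumption,
since the text above does not fix which half is used.  A flat 512K
array also assumes a host without the DOS segment limits discussed
here.

```c
#include <assert.h>
#include <stdint.h>

#define REG_NIBBLES 16           /* 64-bit register = 16 nibbles        */
#define MEM_NIBBLES (1UL << 20)  /* 2^20 nibbles of Saturn address space */

/* Register: one nibble per byte, least significant nibble at index 0. */
typedef struct { uint8_t nib[REG_NIBBLES]; } SatReg;

/* Memory: two nibbles per byte, so 2^20 nibbles occupy 512K bytes. */
static uint8_t sat_mem[MEM_NIBBLES / 2];

/* Assumption: the even-addressed nibble is the high half of the byte. */
static uint8_t mem_read_nibble(uint32_t addr)
{
    uint8_t b = sat_mem[(addr & (MEM_NIBBLES - 1)) >> 1];
    return (addr & 1) ? (uint8_t)(b & 0x0F) : (uint8_t)(b >> 4);
}

static void mem_write_nibble(uint32_t addr, uint8_t value)
{
    uint8_t *b = &sat_mem[(addr & (MEM_NIBBLES - 1)) >> 1];
    if (addr & 1)
        *b = (uint8_t)((*b & 0xF0) | (value & 0x0F));
    else
        *b = (uint8_t)((*b & 0x0F) | ((value & 0x0F) << 4));
}

/* Copy 'count' nibbles from memory into the register field that starts
   at nibble 'first'.  The register side needs no packing or unpacking;
   the memory side pays for a shift or mask on every access. */
static void load_field(SatReg *r, int first, int count, uint32_t addr)
{
    assert(first >= 0 && count >= 0 && first + count <= REG_NIBBLES);
    for (int i = 0; i < count; i++)
        r->nib[first + i] = mem_read_nibble(addr + (uint32_t)i);
}
```

With this layout a field select operation is a plain byte loop over the
chosen nibbles of the register, which is the advantage claimed for the
one-nibble-per-byte encoding.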

I also assume that the program counter is just a word.  This is just
for simplicity and is really not an acceptable design decision.  In a
production quality emulator you would have to design a program counter
data structure that would be efficient to manipulate, and would work
with the 8086 segmented architecture.

If the emulator were going to run on a 386 in protected mode, i.e. with
a large directly addressable memory space, it would probably be best to
organize memory with one nibble per byte.  Also, on a machine with 32
bit words, programming the PC, D0 and D1 registers would be much
easier.

Now, for a couple of code samples.  The first instruction was selected
to represent the arithmetic portion of the instruction set.

; Decimal mode instruction 'A=A+C W'
        mov     DI,offset (RegA-1)      ;  2     Set up pointers to source and
        mov     SI,offset (RegC-1)      ;  2     destination registers.
        mov     CX,16                   ;  2     Count of the number of nibbles to add.
        clc                             ;  2     Make sure carry is OK at the start.
AddAC:  inc     si                      ;  2     (inc leaves the carry flag alone)
        inc     di                      ;  2
        mov     al,[si]                 ;  4     Fetch the first operand
        adc     al,[di]                 ;  6     Add in the second operand
        aaa                             ;  4     Make it a BCD addition
        mov     [di],al                 ;  2     Store the result
        loop    AddAC                   ; 11+m   Loop back for the next nibble
        mov     byte ptr RegCarry,0     ;  2     Assume we had no carry
        jnc     DoneCarry               ; 11+m/3 Assumption was right
        dec     byte ptr RegCarry       ;  6     Assumption was wrong, set carry
DoneCarry:
        jmp     NextInstruction         ;  7+m/3

; Execution time is ( 31 * 16 ) + 28 = 496 + 28 = 524 cycles.
; Assuming a 20 MHZ machine this is 26.2 micro seconds.  On a 48SX this
; instruction takes 17 cycles (maybe 18, I don't have the docs here) and
; 17 cycles at 2 MHZ is 8.5 us.  A ratio of .324, so the emulator is
; about 1/3 as fast.

; I selected the GOC instruction since I figured it would be one of the
; easiest and most efficient to emulate.  Again, if I were writing a
; production quality emulator the program counter would have to reach the
; entire Saturn address space and this code would be more complicated.
;
; GOC instruction.  Assumes BX == PC??  ( 16 bits == 20 bits ??? )
        cmp     RegCarry,0              ;  5     See if there is anything to do
        jz      NextExit                ;  7+m/3 No, easy out.
        mov     si,bx                   ;  2
        shr     si,1                    ;  3     Byte address; carry set if PC is odd
        mov     al,[si]                 ;  4     Fetch the byte holding the first nibble
        jc      OddPC                   ;  7+m/3
EvenPC: mov     ah,al                   ;  2     Save second nibble in ah.
        mov     cl,4                    ;  4
        shr     al,cl                   ;  7     Shift high nibble to low.
        mov     cl,4                    ;  4
        shl     ah,cl                   ;  7
        or      al,ah                   ;  2     al == offset for the GOC.
        jmp     short AddOffset         ;  7+m
OddPC:  and     al,0Fh                  ;  2     Keep only the low nibble.
        mov     ah,[si+1]               ;  4     Fetch next nibble.
        and     ah,0F0h                 ;  2
        or      al,ah                   ;  2
AddOffset:
        mov     ah,0                    ;  4
        add     bx,ax                   ;  2
NextExit:
        jmp     NextInstruction         ;  7+m

; If I remember correctly a Saturn CPU will execute the GOC instruction
; in either 3 (if the PC is not changed) or 5 cycles.  This is an
; execution time of 1.5/2.5 us.
;     No Carry time == 12 cycles == 0.6 us   Ratio: 2.5
;     OddPC time    == 40 cycles == 2.0 us   Ratio: 1.25
;     EvenPC time   == 66 cycles == 3.3 us   Ratio: 0.757

Now, what does all this mean?  It means that an average 20MHZ 386,
under the limitations imposed by DOS, will execute some of the emulated
instructions slightly faster, and most of them a little slower, than
the 48SX.  I estimate the average emulation speed would be less than
1/2 as fast.  The exact speed would, of course, depend on what
instructions were in the program being executed.

Is it fast enough?  That is up to the person who uses the program.

Could it be faster?  Probably not under DOS.  There are problems that I
glossed over, ignored or haven't even found yet that would almost
certainly make it even slower.  Given a 32 bit CPU like a 68000 or an
80386 and an operating system which allows perhaps a megabyte and a
half or more for the necessary data structures and code space without
having to worry about segment registers it would probably be faster.
However, what percentage of us are running UNIX on a 68000 with all the
freedom and access that we have with our PCs?  Or would the 386 and
Xenix be a better choice for the host environment?
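
The speed ratios quoted in this post come from converting each cycle
count to time at its clock rate and dividing the Saturn time by the
emulated time.  A small C sketch of that arithmetic follows; the
function names cycles_to_us and speed_ratio are illustrative and not
part of any emulator.

```c
#include <assert.h>

/* Time in microseconds for 'cycles' clock cycles at 'mhz' megahertz. */
static double cycles_to_us(double cycles, double mhz)
{
    assert(mhz > 0.0);
    return cycles / mhz;
}

/* Emulator speed relative to the real chip: a value above 1 means the
   emulated instruction finishes sooner than on the Saturn itself. */
static double speed_ratio(double saturn_us, double emulated_us)
{
    assert(emulated_us > 0.0);
    return saturn_us / emulated_us;
}
```

Using the figures above for 'A=A+C W', cycles_to_us(524, 20) gives
26.2 us for the emulator and cycles_to_us(17, 2) gives 8.5 us for the
48SX, so speed_ratio(8.5, 26.2) comes out at about 0.324.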

A generic 4.77 MHZ PC with an 8088 CPU is probably about 1/10th the
speed of the 20MHZ 386 and would emulate at 1/20 to 1/30 the Saturn
speed.  An 8 or 12 MHZ 80286 would, of course, be somewhere in between.
I suspect that these would not be acceptable for most of us.

--
Randy Stockberger
randys@hp-pcd.hp.com
Ma Bell: 750-3589
--