Path: utzoo!attcan!uunet!microsoft!alonzo
From: alonzo@microsoft.UUCP (Alonzo Gariepy)
Newsgroups: comp.sys.handhelds
Subject: HP28S PROCESSOR NOTES (4 of 4)
Keywords: hp28s, cpu, processor, assembler, architecture, memory
Message-ID: <9085@microsoft.UUCP>
Date: 19 Nov 89 18:59:57 GMT
Organization: Microsoft Corp., Redmond WA
Lines: 442


                 HP28S MACHINE CODE USERS GUIDE

                           Version 1

               Copyright (C) 1989, Alonzo Gariepy

================================================================

HOW TO ENTER AND RUN MACHINE CODE PROGRAMS


Inline Machine Code and SYSEVAL
===============================

How do you execute machine code programs?  There are two ways.
The first is to place the program at a known location in memory
and jump to it using SYSEVAL.  The second is to wrap the program
up as an inline machine code object (IMC).  The second approach
is best because you can store an IMC in any variable or function
and you don't have to be concerned about its location as it gets
moved around in memory.  It also executes considerably faster.

When you first open your calculator, there is no way to create
IMCs, so you have to resort to SYSEVAL at least once to start.
Since we're going to be writing lots of programs, we want an easy
way to enter them.  This section is an explanation of the
techniques you can use.

The easiest way to create an IMC is to take an object with the
same structure, such as a string, and change its type.  As an
example, let us examine how the string "hello" and the following
machine code program are stored in memory.

Drop:
        174     add     5,d1    ; drop top of stack
        E7      inc.a   d       ; increment free stack

        142     move.a  @d0,a   ;\
        164     add     5,d0    ; return to system
        808C    jmp     @a      ;/

The string "hello" is stored in memory as E4A20F00008656C6C6F6,
while the IMC, Drop, is stored as 69C2041000174E7142164808C.
The objects are structured in the following way:

        type    length  data
        02A4E   0000F   8656C6C6F6              "hello"
        02C96   00014   174E7142164808C         174 E7 142 164 808C

The type is an address that gets stored backwards.
The length of the object includes everything but the type and is
also stored backwards.  Since the ASCII for "hello" is
68 65 6C 6C 6F, it is apparent that the HP stores characters with
their nibbles swapped.  The type and length are each 5 nibbles
long, and the data part is (length - 5) nibbles long.

The next step is to create a string object that has the right
data for our machine program:

________________________________________________________________

First let's do it manually so you understand the process.  We
take the nibbles of the program two at a time, swap their order,
convert them to characters (using CH), and concatenate them into
a string:

CH
  << B->R CHR >>

#71 CH  #E4 CH +  #17 CH +  #24 CH +
#61 CH +  #84 CH +  #80 CH +  #0C CH +

________________________________________________________________

This can be done automatically by the function HEXIFY, which
takes the machine code in a string and does the conversion.

HEXIFY [6AC9]
  << "" SWAP 1 OVER SIZE
     FOR j
       "#" OVER j 1 + DUP SUB + OVER j j SUB + "h" +
       STR-> B->R CHR ROT SWAP + SWAP
     2 STEP
     DROP
  >>

"174E7142164808C" HEXIFY

________________________________________________________________

Both methods yield the rather unusual string, "q..$a. .", which
is represented internally as E4A2051000174E7142164808C0.  It is
almost exactly what we want for our IMC object.  Since strings
must have an even number of nibbles, an extra zero gets added at
the end of the program.  All that's left to do is make the string
into an IMC by changing E4A20 to 69C20.
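
If you want to check the encoding away from the calculator, here
is a minimal Python sketch of what HEXIFY does and of how the
resulting string sits in memory.  It is an illustration only and
not part of the original tool set: the names hexify and
string_image are made up for this sketch, and the string type
address 02A4E is simply the one quoted above.

```python
STRING_TYPE = "02A4E"  # string type address as quoted in the text


def hexify(code: str) -> bytes:
    """Mimic HEXIFY: pack hex nibbles two per character, swapped."""
    if len(code) % 2:          # odd nibble count: pad with a zero nibble
        code += "0"
    # Nibbles sit in memory low-first, so the pair "17" is character 0x71.
    return bytes(int(code[i + 1] + code[i], 16)
                 for i in range(0, len(code), 2))


def string_image(data: bytes) -> str:
    """Nibble image of a string object: type, length, then data."""
    nibbles = "".join(f"{b:02X}"[::-1] for b in data)  # swap each character
    length = f"{5 + len(nibbles):05X}"                 # length counts itself
    return STRING_TYPE[::-1] + length[::-1] + nibbles


if __name__ == "__main__":
    print(string_image(b"hello"))                   # E4A20F00008656C6C6F6
    print(string_image(hexify("174E7142164808C")))  # E4A2051000174E7142164808C0
```

Both printed images agree with the ones given in the text, which
is a useful sanity check before typing a long program into the
calculator.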

Changing Object Type
====================

So now let's write a little program that will take an object on
the stack and change its type:

        143     MOVE.A  @D1,A   ; new type integer
        132     SWAP    A,D0
        169     ADD     10,D0
        146     MOVE.A  @D0,C   ; contents of integer
        132     SWAP    A,D0

        174     ADD     5,D1    ; pop stack
        E7      INC.A   D       ; free space

        143     MOVE.A  @D1,A   ; object
        132     SWAP    A,D0
        144     MOVE.A  C,@D0   ; set new type
        132     SWAP    A,D0

        142     MOVE.A  @D0,A   ;\
        164     ADD     5,D0    ; return to system
        808C    JUMP    @A      ;/

First we use HEXIFY to create a string with the code in it.

"143132169146132174E7143132144132142164808C" HEXIFY

resulting in:

"A.#a.d1.G~A.#A.#A.F.."

This string must now be turned into an IMC.  Since this very
string contains the code to do that, we want to run it on itself.
At this point, the only way to do so is with SYSEVAL.

________________________________________________________________

First let's do it manually so you understand the process.  To
use SYSEVAL we follow these four steps.

        1. Put the code in a known location.
        2. Put the address of this location in another location.
        3. Type in any arguments the program needs.
        4. Type in the address of the second location and SYSEVAL.

1. The first variable in the HOME directory always goes at the
top of memory, so we'll store the code string there.  Since it is
42 nibbles long, the code will begin at #CFFD6 (#D0000 - 42).

2. Now we need to put a copy of the address #CFFD6 somewhere.  It
turns out that the most convenient place is right inside the
program just after the return.  This will make the program 6
nibbles longer (HEXIFY pads it to an even 48) so our address
becomes #CFFD0.

"143132169146132174E7143132144132142164808C0DFFC" HEXIFY

results in:

"A.#a.d1.G~A.#A.#A.F....."

So to complete steps 1 and 2 type:

HOME 'CHT' DUP PURGE STO

3. We are changing the program itself to an IMC, so we put these
arguments on the stack:

'CHT' RCL #2C96

4. The address #CFFD0 is stored just before the padding 0 at the
top of memory, that is, at #D0000 - 6 (five nibbles of address
plus the padding nibble), so we type:

#CFFFA SYSEVAL

CHT is magically changed to a System Object.

Here are all the steps summarized as a program.

MCHT [8B0C]
  << "143132169146132174E7143132144132142164808C0DFFC" HEXIFY
     HOME 'CHT' DUP PURGE STO
     'CHT' RCL #2C96h #CFFFAh SYSEVAL DROP
  >>

________________________________________________________________

Caveat
======

The CHT function is dangerous because it doesn't do any error
checking.  If you call it with the wrong arguments or too few
arguments, you may lose memory.  This problem is easily solved.
There are only 12 type conversions that will work and not all of
them make sense to do.  They are:

        string <-> #integer        list    <-> program
        string <-> IMC             list    <-> algebraic
        IMC    <-> #integer        program <-> algebraic

Here is a program that creates three useful type conversion
functions:

MTCF [3321]
  << << IF DUP TYPE 8 != OVER ->STR 1 1 SUB "<<" != OR
        THEN ABORT END #2A96h CHT >> 'PGM->' STO
     << IF DUP TYPE 5 != OVER 1 GET ->STR "<<" != OR
        THEN ABORT END #2C67h CHT >> '->PGM' STO
     << IF DUP TYPE 2 != THEN ABORT END #2C96h CHT >> '->IMC' STO
     'PGM->' '->PGM' '->IMC'
     1 3 START
       DUP RCL 1 ->LIST LIST-> DROP PGM->
       'CHT' DUP2 POS SWAP RCL PUT ->PGM SWAP STO
     NEXT
     'CHT' PURGE
  >>

What we end up with are the following functions:

        ->IMC   converts a string to inline machine code
        PGM->   converts a program to a list
        ->PGM   converts a list to a program

We need all three of these to write a function that makes machine
code programming practically effortless.
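
Seen from outside the calculator, each of these conversions
amounts to overwriting the first five nibbles of an object's
memory image.  The Python sketch below models just that step.  It
is an illustration under that assumption, not the author's code:
change_type, type_prefix and the TYPES table are hypothetical
names, the four addresses are the ones that appear in the text,
and none of the argument checking that MTCF adds is reproduced.

```python
# Type addresses as they appear in the text.  Only these four are
# given there; the table is not claimed to be complete.
TYPES = {
    "string": 0x02A4E,
    "imc": 0x02C96,
    "list": 0x02A96,
    "program": 0x02C67,
}


def type_prefix(address: int) -> str:
    """Five-nibble type field, low nibble first, as stored in memory."""
    return f"{address:05X}"[::-1]


def change_type(image: str, new_type: str) -> str:
    """Return the nibble image with only its type field replaced."""
    if len(image) < 10:
        raise ValueError("image too short to hold type and length fields")
    return type_prefix(TYPES[new_type]) + image[5:]


if __name__ == "__main__":
    hexified = "E4A2051000174E7142164808C0"  # string produced by HEXIFY
    print(change_type(hexified, "imc"))      # 69C2051000174E7142164808C0
```

The length and data nibbles pass through untouched, which is why
only objects with the same internal structure can be converted
into one another.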

Simple Machine Code Programming with PGM
========================================

PGM [71C6]
  << 1 + OVER PGM-> SWAP DUP2 GET HEXIFY ->IMC PUT ->PGM
     SWAP ->PGM DROP
  >>

Here is an example of using PGM to create a peek program:

PIGT [4919]
  << RCWS SWAP 64 STWS #0h OR SWAP STWS
     "13210314313016914613615671301691547113132142164808C"
  >>

'PIGT' RCL 9 PGM 'PIG' STO

The arguments to PGM are a program and the position in the
program of the machine code string, in this case the 9th
position.  This technique doesn't work if the string is inside an
IF, a loop, or another set of << >> brackets.  If you need to put
other stuff before an IMC in a program, I recommend doing only
the minimum necessary to check and prepare arguments for the IMC.

================================================================

STYLE CONVENTIONS FOR HP28S MACHINE CODE

As in writing, use blank lines to divide your code into
paragraphs that express a single thought.  As in writing, if a
paragraph gets too long, break it up to increase the amount of
white space.  Set labels apart and make them descriptive.  Just
as footnotes* are more effective than parentheses, documentation
is more effective than inline comments.

I recommend that you type programs entirely in lower case, except
for labels, hex constants, and comments.  Hex constants should be
typed in upper case, and labels in mixed or all upper case.  If a
comment is close to being a sentence, capitalize the first letter
and end the comment with a period.

Certain forms of some instructions (marked with ***) have default
field specifiers.  I suggest you omit the field suffix in these
cases.  Instructions without a default should always be written
with a field suffix.  In cases where there could be confusion,
you might as well put the suffix on all instructions.

At the moment, we have no assembler program to ensure that your
source corresponds to its machine code or to enforce the syntax
rules of instructions.
That is why I have made these rules as simple as possible.  When
I get time, I will write an assembler.

When you are publishing programs, I recommend that you align the
parts of each instruction on successive eight-character tabs:

Hex     Name.f  Args    Comments
33A0D0  move.p4 #0D0A,c ; CR/LF characters

________________________________________________________________

* Too many parenthesized comments in text can destroy both
readability and understanding.  Footnotes are less obtrusive and
provide more leeway for comprehensive explanation.  In similar
fashion, inline comments distract from the code and are so small
they don't explain much.  Do not, as I have seen, comment every
line with a restatement of what the assembler instruction is
doing.  Do comment what you put into registers.  The actions of a
program are better understood at the level of functions and
high-level control constructs, such as loops and ifs.  I have
commented every line of my sample programs only because their
purpose is to teach the instruction set.

________________________________________________________________

Another sample program:

I wanted to write a really fast program to find bit patterns in
memory.  Most processors perform register operations much more
quickly than memory operations, because the memory bus is slower
than the internal bus.  With even a minimal amount of caching or
pipelining (e.g., prefetch) this difference is quite sizable.
With that in mind, I designed the following program to minimize
memory operations.  It turns out to be only about 10% faster than
the brute force approach.  One reason is that the speed it gains
in minimizing memory reads is partially lost in the more
complicated loop structure.
This small improvement is probably not worth the added
complexity, but it is an interesting program to read:

Find:
        132     swap    a,d0    ;\
        120     swap    a,r0    ; save d0 and b
        AFC     swap.w  a,b     ;
        121     swap    a,r1    ;/

        174     add     5,d1    ;\
        147     move.a  @d1,c   ;
        134     move    c,d0    ; put value in b
        16A     add     11,d0   ;
        1567    move.w  @d0,c   ;
        AF5     move.w  c,b     ;/

        180     sub     1,d0    ;\
        1564    move.s  @d0,c   ; put number of nibbles (-1) in c(s)
        1C4     sub     5,d1    ;/

        143     move.a  @d1,a   ;\
        130     move    a,d0    ;
        169     add     10,d0   ; put starting address in d0
        142     move.a  @d0,a   ;
        130     move    a,d0    ;/

Fetchloop:
        1527    move.w  @d0,a   ; get next word
        80DF    move    c,15,p  ; put length in p

Panloop:
        91052   breq.wp a,b,Found       ; full length match!
        160     add     1,d0    ;\
        BF4     srn.w   a       ; shift alignment one nibble
        B46     inc.s   c       ;/
        51F     brcc    Panloop ;
        80CF    move    p,c,15  ; put length in c

Zoomloop:
        0D      dec     p       ; one less nibble to match against
        40E     brcs    Fetchloop       ; no more nibbles left
        910BD   breq.wp a,b,Fetchloop   ; partial match
        BF4     srn.w   a       ;
        160     add     1,d0    ;
        5FE     brcc    Zoomloop        ;-exit if we have wrapped memory

Found:
        143     move.a  @d1,a   ;\
        132     swap    a,d0    ;
        169     add     10,d0   ; replace starting address with d0
        140     move.a  a,@d0   ;/

        20      move    0,p     ; restore p

        121     swap    a,r1    ;\
        AFC     swap.w  a,b     ; restore b and d0
        120     swap    a,r0    ;
        132     swap    a,d0    ;/

        142     move.a  @d0,a   ;\
        164     add     5,d0    ; return to system
        808C    jump    @a      ;/

FINDT [8AD6]
  << RCWS 20 STWS SWAP #0h OR SWAP STWS
     "132120AFC12117414713416A1567AF518015641C414313
      0169142130152780DF91052160BF4B4651F80CF0D40E910
      BDBF41605FE14313216914020121AFC120132142164808C"
     1 +
  >>

(The string is split across three lines only to fit the page;
enter it as one unbroken string of hex digits.)

then type

'FINDT' RCL 9 PGM 'FIND' STO

yielding:

<< RCWS 20 STWS SWAP #0h OR SWAP STWS System Object 1 +>>

The way you use this program is to specify a memory pattern and a
place to start looking.  A memory pattern is a sequence of up to
15 nibbles and a one-nibble length (the number of nibbles in the
pattern minus 1).
For instance, if you want to find instances of the instruction
808C in memory, you can type:

#C8083 #0 FIND

Here the low nibble, 3, is the length (4 nibbles minus 1), and
C808 is 808C written backwards, because the integer is stored in
memory low nibble first.

The effect of the 1 + at the end of FIND is to return the address
1 higher than where the pattern was found.  That way you can just
keep hitting FIND to get subsequent instances.  This version
doesn't stop until it finds an instance or scans the entire
address space.

________________________________________________________________

Good luck!

Alonzo Gariepy
alonzo@microsoft