Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!ucsd!ucbvax!hcrlgw.crl.hitachi.co.JP!joe
From: joe@hcrlgw.crl.hitachi.co.JP (Dwight Joe)
Newsgroups: comp.arch
Subject: RISCizing a CISC processor
Message-ID: <9012070105.AA02416@hcrlgw.crl.hitachi.co.jp>
Date: 7 Dec 90 01:05:45 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Lines: 77

I would like some input on the following idea to extend the life of
CISC processors.

Consider a hypothetical machine: IM 68386C (CISCized).

First, determine the dynamic instruction profile of the target mix.
For example, if the target is engineering programs, then determine the
dynamic frequency of all instructions in those programs. (A LOAD with
indirect addressing and a LOAD with direct addressing are considered
different instructions in the context of this posting.)

Then, rank the instructions from highest frequency to lowest. Exclude
I/O instructions. Suppose that there are a total of "n" non-I/O
instructions. Suppose that I[1] is the instruction with the highest
frequency and that I[n] is the instruction with the lowest frequency.
The ranking might look something like the following:

    instruction     dynamic frequency
    I[1]            22%
    I[2]             8%
      .               .
      .               .
      .               .
    I[n - 1]         0.002%
    I[n]             0.001%

In a CISC chip, there is a certain redundancy. In other words, some of
the complex instructions can be written in terms of the simpler
instructions. For example, an instruction to move a block of data from
one place in memory to another can be replaced by a loop of simpler
LOAD and STORE instructions.

Now, from the ranking of the instructions, determine the smallest "i"
such that every I[j] with "j > i" can be written in terms of the I[k]
with "k <= i".

Designate the set of the first i instructions from the above ranking
to be the "RISC Set". Designate the set of I/O instructions to be the
"I/O Set". Relabel "i" to be "M", the minimal number. Designate the
remaining instructions from the above ranking to be the "CISC Set".

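The partitioning step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the original proposal: the instruction names, the frequencies, and the "expansions" table (which instructions suffice to rewrite a given complex instruction) are all invented, and an instruction missing from that table is assumed to be primitive. The sketch also requires each CISC Set instruction to be rewritten directly in RISC Set instructions, with no chains of expansions.

```python
def split_risc_cisc(freq, expansions):
    """Return (risc_set, cisc_set) as lists ordered by descending frequency.

    freq:       dict, instruction name -> dynamic frequency (non-I/O only)
    expansions: dict, instruction name -> set of instruction names that are
                enough to rewrite it. Names absent from this dict are
                treated as primitive (cannot be rewritten). Hypothetical.
    """
    # I[1] is the most frequent instruction, I[n] the least frequent.
    ranked = sorted(freq, key=freq.get, reverse=True)

    # Try i = 0, 1, ..., n and stop at the first (smallest) i for which
    # every instruction ranked after position i can be rewritten using
    # only the first i instructions.
    for i in range(len(ranked) + 1):
        candidate = set(ranked[:i])
        rest = ranked[i:]
        if all(name in expansions and expansions[name] <= candidate
               for name in rest):
            return ranked[:i], rest
    # Not reached: i == n leaves nothing to rewrite, so the loop returns.


# Invented example data, for illustration only.
freq = {
    "LOAD_DIRECT": 22.0,
    "STORE_DIRECT": 8.0,
    "ADD": 7.0,
    "BRANCH_NZ": 6.0,
    "LOAD_INDIRECT": 3.0,
    "BLOCK_MOVE": 0.002,
    "CLEAR_MEM": 0.001,
}
expansions = {
    "BLOCK_MOVE": {"LOAD_DIRECT", "STORE_DIRECT", "ADD", "BRANCH_NZ"},
    "CLEAR_MEM": {"STORE_DIRECT", "ADD", "BRANCH_NZ"},
}

risc_set, cisc_set = split_risc_cisc(freq, expansions)
M = len(risc_set)
print("M =", M)
print("RISC Set:", risc_set)
print("CISC Set:", cisc_set)
```

With this made-up data the search stops at M = 5, because LOAD_INDIRECT has no expansion and therefore forces everything ranked at or above it into the RISC Set.
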
Now, using timing analysis, estimate the performance of implementing
the RISC Set and the I/O Set in hardware and implementing the CISC Set
as subroutines in a microcode store. These subroutines are written
with instructions from the RISC Set. Whenever an instruction from the
CISC Set is encountered in the instruction stream, it causes a trap to
the appropriate subroutine in the microcode store.

Essentially, what we have is a RISC machine with some subroutines
coded into ROM.

Additional registers, over and above those in the programmer's model
of the IM 68386C, might be needed to maintain information like the
following:

(1) whether the processor is executing instructions in a microcode
    subroutine rather than instructions in the normal instruction
    stream from main memory
(2) the address of the current byte of memory, and the destination to
    which that byte is transferred, during a CISC Set block-move
    instruction
(3) etc.

Designate these additional registers "Extra Registers". Naturally,
they would be saved just prior to the servicing of an interrupt.

The great thing about the IM 68386R (RISCized) processor is that
super-scalarizing it will be no harder than for a RISC processor,
because we now essentially have a RISC processor (one with subroutines
microcoded to handle CISC Set instructions). We will only be
super-scalarizing the RISC Set, _not_ the full set of the IM 68386C.

The other great thing is that the IM 68386R is upward compatible with
the IM 68386C and can use its large installed base of programs.

By the way, IM 68386C is a labeling derived from 68xxx (Motorola = M)
and xx386 (Intel = I).
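
The trap-to-microcode mechanism and the Extra Registers described above can be mimicked with a toy simulator. This is a rough sketch under assumptions of my own choosing, not a model of any real 68xxx or x86 part: the register names (r0, x_tmp, x_src, x_dst, x_count, x_in_ucode), the three-instruction set, and the byte-at-a-time copy are all hypothetical.

```python
class Machine:
    """Toy model: LOAD/STORE run directly, BLOCK_MOVE traps to microcode."""

    def __init__(self, memory):
        self.mem = list(memory)
        self.regs = {
            "r0": 0,            # programmer-visible register (hypothetical)
            # "Extra Registers": not part of the programmer's model
            "x_tmp": 0,         # scratch byte used by the microcode
            "x_src": 0,         # address of the current source byte
            "x_dst": 0,         # address of the current destination byte
            "x_count": 0,       # bytes still to be moved
            "x_in_ucode": 0,    # 1 while a microcode subroutine is running
        }

    # RISC Set: what the hardware would execute directly.
    def load(self, reg, addr):
        self.regs[reg] = self.mem[addr]

    def store(self, reg, addr):
        self.mem[addr] = self.regs[reg]

    # Microcode subroutine for the CISC Set instruction BLOCK_MOVE.
    # The data movement uses only load/store; the counter updates stand
    # in for RISC Set arithmetic and branch instructions.
    def ucode_block_move(self):
        r = self.regs
        while r["x_count"] > 0:
            self.load("x_tmp", r["x_src"])
            self.store("x_tmp", r["x_dst"])
            r["x_src"] += 1
            r["x_dst"] += 1
            r["x_count"] -= 1

    def execute(self, op, *args):
        if op == "LOAD":
            self.load(*args)
        elif op == "STORE":
            self.store(*args)
        elif op == "BLOCK_MOVE":
            # Trap: record state in the Extra Registers, run the
            # subroutine, then return to the normal instruction stream.
            src, dst, count = args
            self.regs.update(x_src=src, x_dst=dst, x_count=count, x_in_ucode=1)
            self.ucode_block_move()
            self.regs["x_in_ucode"] = 0
        else:
            raise ValueError("unknown instruction: " + op)


m = Machine([1, 2, 3, 4, 0, 0, 0, 0])
m.execute("BLOCK_MOVE", 0, 4, 4)   # copy 4 bytes from address 0 to address 4
print(m.mem)
```

Because the copy state lives entirely in the Extra Registers, the programmer-visible r0 is left untouched. An interrupt arriving in the middle of the loop could save those registers and resume the move later, which is the reason the posting gives for saving them.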