Path: utzoo!attcan!uunet!zaphod.mps.ohio-state.edu!think.com!paperboy!meissner
From: meissner@osf.org (Michael Meissner)
Newsgroups: comp.arch
Subject: Re: RISCizing a CISC processor
Message-ID:
Date: 11 Dec 90 17:34:40 GMT
References: <9012070105.AA02416@hcrlgw.crl.hitachi.co.jp> <1200@dg.dg.com> <1311@inews.intel.com>
Sender: news@OSF.ORG
Organization: Open Software Foundation
Lines: 89
In-reply-to: dlau@mipos2.intel.com's message of 10 Dec 90 20:22:12 GMT

In article <1311@inews.intel.com> dlau@mipos2.intel.com (Dan Lau) writes:

| In article <1200@dg.dg.com> uunet!dg!lewine writes:
| >In article <9012070105.AA02416@hcrlgw.crl.hitachi.co.jp>, joe@hcrlgw.crl.hitachi.co.JP (Dwight Joe) writes:
| >   ***HOWEVER***, the advantage of RISC is moving work from
| >   runtime to compile time.  The big speedup comes from compiler
| >   work not hardware.  At Data General we have modified some of
| >   the compilers for our CISC MV-series to compile simple code
| >   instead of using instructions like WEDIT.  This has produced
| >   major performance enhancements because a compiler can generate
| >   special case code.
|
| I don't understand the comment above about the MV-series compilers.
| Are you saying that after DG changed the MV-series compilers to generate
| simple code, there was a major performance improvement (over the complex
| code)?  Or are you saying that "because a compiler can generate special
| case code" (i.e., very complex instructions like WEDIT), there was a
| major performance enhancement over the simple code?
|
| I am confused, can you please clarify the above.  Thanks.
| Dan Lau

Let me try to clarify some things.  Only certain compilers actually
generated WEDIT (notably Cobol and PL/1, possibly Basic).  The {,W}EDIT
instruction was actually a secondary instruction set that read a
bytestream to figure out how to convert a number to a stream of bytes
(I'm slightly fuzzy here, because in my ten years at Data General, I
never once used a WEDIT instruction).
Most programs do not need the complex interpretation, since the format
is known at compile time.  On these programs, the code generator would
issue multiple simple instructions instead of WEDIT.  I believe that on
some machines at least, WEDIT was removed, and the kernel would then
simulate it if a WEDIT was actually executed (old program, etc.).

While I'm talking about the MV, let me expound on a successful way the
MV was extended, and an unsuccessful way.

For those of you who have never looked at the DG Nova/Eclipse/MV
instruction set, there are 4 integer registers (on all versions), and 4
floating point registers (on the Eclipse and MV/Eclipse).  Only two of
the integer registers can be used as index registers.  On the
MV/Eclipse, the 4 stack values (stack pointer, frame pointer, stack
base, and stack limit) are also held in registers, but there is no
direct addressing mode to use these registers.  The standard save
instruction puts the frame pointer in one of the index registers.
Needless to say, this put a crimp in code generation, particularly in
doing things like the following, where p1, p2, and the frame pointer
(needed to reach auto_var) all compete for the two index registers:

	p1->field1 = p2->field1;
	p1->field2 = auto_var;
	p1->field3 = p2->field3;

So we in Languages requested an addition to the instruction set that
would give frame pointer relative addressing (and possibly stack
pointer relative addressing as well).  For existing machines in the
field, there was a slight penalty to the upgrade, but one of the
machines (the MV/7800 if I remember correctly) that was under
development but not yet shipped could only do this instruction in 27
clocks (i.e., it would be faster on that machine to do a push, load
register, whatever, pop).  So, this feature had to be scrapped, because
the hardware people didn't/couldn't respin the silicon.  Sigh....

The more successful upgrade was how the sine, cosine, etc. instructions
were added.
For the high end machines (MV/10000 with FPU, MV/20000, and presumably
MV/40000), the machine would have a hardware accelerator which would do
the operation, but it was important to have the same binaries run on
the low end machines as well, with as little slowdown as possible
relative to the old method of calling library functions.  The architect
noticed that the standard long call instruction had a left over bit
that was easy for the microcode to access, so the new instructions had
the format:

	<16 bit opcode> <32 bit address of emulator> <16 bit subopcode>

(on the long call instruction, the <16 bit subopcode> field was the
argument count that was pushed on top of the stack, so the return
instruction could know how many words to pop off).  This way, you did
not have to trap to the kernel to implement the instructions, which can
be much too slow, but instead just called the emulator directly.
-- 
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Considering the flames and intolerance, shouldn't USENET be spelled ABUSENET?