Path: utzoo!attcan!uunet!zephyr.ens.tek.com!uw-beaver!mit-eddie!mintaka!yale!think!zaphod.mps.ohio-state.edu!usc!elroy.jpl.nasa.gov!peregrine!ccicpg!cci632!bsw
From: bsw@cci632.UUCP (Brad Werner)
Newsgroups: comp.sys.m88k
Subject: Re: What the heck is "instruction folding"?
Summary: speculation
Message-ID: <35108@cci632.UUCP>
Date: 15 Mar 90 18:15:23 GMT
References: <25FAED94.24113@paris.ics.uci.edu> <132876@sun.Eng.Sun.COM> <25FE034C.24559@paris.ics.uci.edu>
Reply-To: bsw@ccird1.UUCP (Brad Werner)
Distribution: na
Organization: Computer Consoles Inc. an STC Company, Rochester, NY
Lines: 53

rfg@ics.uci.edu (Ronald Guilmette) writes:
*ram@sun.UUCP (Renu Raman) writes:
*>rfg@paris.ics.uci.edu (Ronald Guilmette) writes:
*>>from the March 1990 issue of Unix Review (page 26):
*>> In December 1989, Dolphin { Server Technology } announced the Orion,
*>> a project that is to take performance far beyond today's 88k
*>> processors by building a processor using the 88k instruction set
*>> that is capable of executing up to eight instructions in parallel
*>> to achieve a theoretical peak performance of 1000 MIPS.
*>
*> If an instruction does not occupy a pipeline slot, then one says "
*> instruction is folded",
*
*Also, what the heck has "instruction folding" got to do with VLIW machines?
*Hummm... maybe these fellows at Dolphin are saying that they have a VLIW
*machine when in fact they have a machine that batches up groups of instruction
*decodes but which still has to stuff the instructions down a single FP pipe one
*at a time. Anybody else wanna partake of this wild speculation?
*

Yes. Wild speculation follows:

"...executing eight instructions in parallel..." -> could mean that
they have four FP and four IP units fed by the kind of high-volume
pre-fetch which rfg describes later.
Or maybe a 6/2 IP/FP ratio is more suited to their needs; I just
picked a symmetric P unit division assuming that they might want to
prototype with CMOS or whatever existing 88100s before getting deep
into the ECL concerns. Eight in parallel could mean eight of each,
but I assumed marketing types would have insisted on calling that
16 in ||.

So the actual IP/FP units would be 88k code compatible, yet the VLIW
fed into the pre-decode could be multi-slot 88k or whatever is
convenient. Some information could be included to specify P unit
scheduling, which gets into folding and related issues if the VLIW
does not just specify eight 88k instructions. That would get into one
class of scheduling issues in the compiler in order to deal with the
multiple P pipes. If the pre-fetcher+ does some scheduling, this
speculation is more interesting.

This may be a side issue, but while I'm in speculation mode I don't
want to forget it. Say a 'bcnd' comes down the pipe. The N field
currently specifies some pipe guidelines--whether to execute the next
instruction unconditionally or not. The VLIW containing a bcnd could
have additional information specifying which slots of this VLIW and
the next W to execute unconditionally (too CISCy for you?).

For either case (the two previous paragraphs), classic 88k
instructions don't all enter the same one of the eight pipes, so I
believe the general term instruction folding is appropriate.

Moot side note: Didn't Norsk come up with the term (and patent)?

I'd better stop now before I get too far down the pipe and the queues
have to be flushed.

-Brad Werner; USENET: ...!cci632!ccird1!bsw; these are my speculations.