Path: utzoo!utgpu!water!watmath!clyde!ima!necntc!ames!lamaster
From: lamaster@ames.arpa (Hugh LaMaster)
Newsgroups: comp.arch
Subject: Re: Motorola 88000 and others
Keywords: mc88000 RISC parallel-units
Message-ID: <7016@ames.arpa>
Date: 6 Apr 88 23:29:23 GMT
References: <1168@csun.UUCP> <323@m3.mfci.UUCP>
Reply-To: lamaster@ames.arc.nasa.gov.UUCP (Hugh LaMaster)
Organization: NASA Ames Research Center, Moffett Field, Calif.
Lines: 36

In article <323@m3.mfci.UUCP> mfci!colwell@uunet.UUCP (Robert Colwell) writes:
>In article <1168@csun.UUCP] sef@csun.UUCP (Sean Fagan) writes:
>]I was reading in some trade rag (sorry, forget which) about the MC88000,
>]which uses a 'scoreboard' to have up to 3 instructions executing
>]simultaneously (plus whatever pipelining the thing has). Also, someone
>achieves the performance you require. Vector machines like those you
>mention above do indeed have parallelism built into their architecture,
>but the way they invoke it at run time is to execute vector instructions,
>which can only do repetitive operations on aggregate data sets. The

Not to beat a dead horse too hard, but these are two different kinds of
parallelism in the hardware, and some machines, such as the Cyber 205,
use both simultaneously. Specifically, vector instructions are
memory-to-memory instructions on the 205, and scalar instruction issue
continues as long as there are no conflicts with vector instructions
caused by memory references AND as long as there are no conflicts with
previously issued scalar instructions because of register references.
So these two different kinds of parallelism are taking place
simultaneously, and, in fact, performance on vectorized code with short
vectors depends on the hardware preparing additional descriptors for the
vector units while they are operating on a previously issued instruction.
The ETA-10 is the same, and the Cray machines, which use vector
registers, are similar.
Register-to-register instructions continue to be issued until there is a
register conflict.

This is not to knock some other architectures such as Multiflow. The
Multiflow machine has a certain elegant simplicity because it doesn't
need the vector part - it vectorizes using parallel functional units,
which also work with scalar operands. Architectures like Multiflow
require a very sophisticated compiler, and Multiflow takes it even
further by optimizing outside of basic blocks, something neither the
Cray nor CDC/ETA compilers do.
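
For readers who want the scalar issue rule above in concrete form, here is
a minimal sketch of in-order issue gated by a register scoreboard. It is
an illustration only, not the actual logic of the 205, the Cray machines,
or the MC88000. The names (Instr, issue_schedule), the register names, and
the latencies are all hypothetical, and memory conflicts with vector
instructions are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str       # unique label, used as the key in the schedule
    dest: str       # destination register
    srcs: tuple     # source registers
    latency: int    # cycles until dest is written (made-up values)


def issue_schedule(program):
    """Return {name: issue cycle} for in-order issue with a scoreboard."""
    ready_at = {}   # register -> cycle at which its pending write completes
    cycle = 0       # earliest cycle at which the next instruction may issue
    schedule = {}
    for ins in program:
        # A register conflict exists while any source or the destination
        # still has a write in flight; issue stalls until it clears.
        regs = ins.srcs + (ins.dest,)
        issue = max([cycle] + [ready_at.get(r, 0) for r in regs])
        schedule[ins.name] = issue
        ready_at[ins.dest] = issue + ins.latency
        cycle = issue + 1   # assume at most one instruction issues per cycle
    return schedule


PROGRAM = [
    Instr("load", "r1", ("r9",), 4),
    Instr("add", "r2", ("r3", "r4"), 1),   # independent of the load
    Instr("mul", "r5", ("r1", "r2"), 3),   # must wait for r1 from the load
    Instr("sub", "r6", ("r7", "r8"), 1),   # independent, but behind the mul
]

SCHEDULE = issue_schedule(PROGRAM)
print(SCHEDULE)
```

In this sketch the add issues right behind the load because it touches no
pending register, while the mul stalls until r1 is written. The sub has no
conflict of its own but still waits, because issue is in order.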