Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!sundc!pitstop!sun!decwrl!pyramid!prls!mips!earl
From: earl@mips.UUCP (Earl Killian)
Newsgroups: comp.arch
Subject: Re: Horizontal pipelining
Message-ID: <862@gumby.UUCP>
Date: Sun, 1-Nov-87 06:08:21 EST
Article-I.D.: gumby.862
Posted: Sun Nov  1 06:08:21 1987
Date-Received: Thu, 5-Nov-87 07:47:18 EST
References: <201@PT.CS.CMU.EDU> <8801@utzoo.UUCP> <8758@shemp.UCLA.EDU> <2525@mmintl.UUCP>
Lines: 20
Keywords: multiple users

In article <2525@mmintl.UUCP>, franka@mmintl.UUCP (Frank Adams) writes:
> Suppose, for example, that your instruction processor has four stages.  With
> conventional pipelining, that means that four consecutive instructions from
> the same program are at some stage of execution at the same time.
> 
> Instead, why not have four different execution threads being performed
> simultaneously?

This is an old idea; it was done in the CDC 6600 I/O (peripheral)
processors.  More recently it was tried in the Denelcor HEP, which
wasn't very successful.

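One of the problems is that you end up with a multiprocessor built out
of slow uniprocessors: each thread gets only one issue slot in N, so
it runs at roughly 1/N of the pipeline's peak rate.  Such machines are
rarely successful when uniprocessors of equivalent power are available.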
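
To make the "slow uniprocessors" point concrete, here is a rough sketch
of the issue schedule.  It assumes an idealized machine that issues one
instruction per cycle and rotates strictly round-robin among the
threads, with no stalls; the function names are made up for
illustration and do not describe any particular machine.

```python
def barrel_schedule(num_threads, cycles):
    """Thread id that issues on each cycle under strict round-robin."""
    return [cycle % num_threads for cycle in range(cycles)]


def per_thread_rate(num_threads, cycles):
    """Instructions issued per cycle, as seen by each individual thread."""
    schedule = barrel_schedule(num_threads, cycles)
    return {t: schedule.count(t) / cycles for t in range(num_threads)}


# Four threads sharing a four-stage pipeline (idealized assumption).
rates = per_thread_rate(4, 1000)
aggregate = sum(rates.values())
```

Under these assumptions each thread sees a quarter of the issue slots,
while the aggregate rate is the same as one fully pipelined thread
would get in the ideal case.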

Besides N register sets, you also need caches N times larger to hold
N simultaneous working sets.  This is usually more expensive than
conventional pipelining.  Or you can eliminate the cache on the theory
that you'll run other tasks while you wait for memory, thereby
providing even slower uniprocessors (but perhaps more of them).
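
The cache-size argument can be illustrated with a toy model.  This is
only a sketch under strong assumptions: a fully associative LRU cache,
and threads that each sweep cyclically over their own working set,
which is a worst case for LRU.  The function name and parameters are
hypothetical.

```python
from collections import OrderedDict


def miss_rate(num_threads, working_set, capacity, accesses):
    """Miss rate of a shared LRU cache under round-robin interleaving.

    Each thread cycles through `working_set` private lines; threads
    take turns, one access per cycle.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for i in range(accesses):
        thread = i % num_threads
        line = (i // num_threads) % working_set
        key = (thread, line)
        if key in cache:
            cache.move_to_end(key)         # mark as most recently used
        else:
            misses += 1
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return misses / accesses


one_thread = miss_rate(1, 64, 64, 6400)     # cache sized for one working set
four_small = miss_rate(4, 64, 64, 6400)     # same cache shared by 4 threads
four_large = miss_rate(4, 64, 256, 6400)    # cache made 4 times larger
```

In this model the shared cache thrashes unless it grows to N times the
single-thread size; real access patterns are less extreme, but the
pressure is in the same direction.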