Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!sundc!pitstop!sun!decwrl!labrea!jade!ucbvax!husc6!ut-sally!im4u!rajiv
From: rajiv@im4u.UUCP (Rajiv N. Patel)
Newsgroups: comp.arch
Subject: Re: Horizontal pipelining
Message-ID: <2389@im4u.UUCP>
Date: Tue, 3-Nov-87 17:00:04 EST
Article-I.D.: im4u.2389
Posted: Tue Nov  3 17:00:04 1987
Date-Received: Sat, 7-Nov-87 06:35:25 EST
References: <201@PT.CS.CMU.EDU> <8801@utzoo.UUCP> <8758@shemp.UCLA.EDU> <2525@mmintl.UUCP>
Reply-To: rajiv@im4u.UUCP (Rajiv N. Patel)
Organization: U. Texas CS Dept., Austin, Texas
Lines: 61
Keywords: multiple users, hardware utilization.

Summary:Horizontal Pipelining may be good after all.

Distribution:World


In article <2525@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>I had an idea some time ago that I'm surprised I've never seen discussed.
>Suppose, for example, that your instruction processor has four stages.  With
>conventional pipelining, that means that four consecutive instructions from
>the same program are at some stage of execution at the same time.
>
>Instead, why not have four different execution threads being performed
>simultaneously?  This eliminates the dependency checks and latency delays
>inherent in "vertical" pipelining.  (Many RISCs put these into the compiler
>instead of the architecture, but they're still there).  On a multi-user
>system with a reasonable load level, it seems to me that this should
>represent a performance improvement.  Of course, it won't look good on the
>standard benchmarks.
>
>Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
>Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108


I agree with Frank's reasoning. Many RISCs place great emphasis on software
issues such as an efficient compiler, yet it is still common to see 30-50%
(a debatable figure) of the pipeline stages doing no fruitful work (pipeline
bubbles) because of data dependencies, latency delays, and probably branches
that cause a pipeline flush. If one were to introduce horizontal pipelining,
running more than one process in order to fill these pipeline bubbles, I feel
that although the execution rate of a single process may go down a little,
the overall throughput of the processor would increase dramatically. Think
about the hardware utilization such a concept makes available (I cannot quote
a source, but I have been told that hardware utilization, in terms of active
logic circuits per cycle on a chip, is pretty low).

This concept may not appeal to those designers who want maximum throughput
for a single process, as in supercomputing problems, but it definitely would
appeal to designers of general-purpose chips that could be used efficiently
for anything from control applications to workstations, which tend to have
applications requiring many processes to be run.

Benchmarking such architectures and comparing them to normal RISC/CISC
architectures is another big, controversial issue. I still have not been able
to figure out how to compare the two, but one way to make the horizontally
pipelined architecture look damn good is to compare hardware utilization
ratios, or to compare raw instructions executed per (some million) cycles,
counting instructions from any process available to be executed.
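
A minimal sketch of those two figures, in Python. The function name
throughput_metrics and the instruction counts are hypothetical, chosen only
to show how the aggregate numbers and the per-process rate can point in
opposite directions.

```python
def throughput_metrics(issued_per_process, cycles):
    """Summarize a run from per-process instruction counts (assumed inputs)."""
    total = sum(issued_per_process)
    return {
        # fraction of issue slots that did useful work
        "utilization": total / cycles,
        # raw instructions executed per million cycles, over all processes
        "instr_per_million_cycles": total * 1_000_000 // cycles,
        # what each individual process actually experienced
        "per_process_rate": [n / cycles for n in issued_per_process],
    }


# Made-up counts: one process alone vs. two processes sharing the pipeline.
single = throughput_metrics([360], 630)
shared = throughput_metrics([280, 280], 630)
print(single)
print(shared)
```

In this made-up case the aggregate figures favor the shared pipeline, while
the per-process rate shows the hit each single process takes. Which number a
benchmark reports decides which architecture looks better.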

Most of the comments I have made here are based on our studies at UT Austin
on a computer architecture project that combines the RISC philosophy with
the concept of horizontal pipelining to give a hardware-efficient processor
with only marginal hits on single-process execution rates. As someone
mentioned earlier on the net, cache design is the problem for such an
architecture, as it has to cache multiple instruction streams. I feel that
with progress in VLSI technology this problem will not prove to be as
serious. The Fairchild CLIPPER already has a 4K byte cache chip, and an 8K
byte cache chip should probably be a decent starting point for a processor
with, say, 2-4 processes able to execute concurrently.

Well, I have certainly raised a lot of issues here that many would like to
criticize or comment on. Please feel free to do so; it may help us with our
research work here.


Rajiv Patel.
(rajiv@im4u.utexas.edu)