Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!sundc!pitstop!sun!decwrl!labrea!jade!ucbvax!husc6!ut-sally!im4u!rajiv
From: rajiv@im4u.UUCP (Rajiv N. Patel)
Newsgroups: comp.arch
Subject: Re: Horizontal pipelining
Message-ID: <2389@im4u.UUCP>
Date: Tue, 3-Nov-87 17:00:04 EST
Article-I.D.: im4u.2389
Posted: Tue Nov 3 17:00:04 1987
Date-Received: Sat, 7-Nov-87 06:35:25 EST
References: <201@PT.CS.CMU.EDU> <8801@utzoo.UUCP> <8758@shemp.UCLA.EDU> <2525@mmintl.UUCP>
Reply-To: rajiv@im4u.UUCP (Rajiv N. Patel)
Organization: U. Texas CS Dept., Austin, Texas
Lines: 61
Keywords: multiple users, hardware utilization.
Summary: Horizontal Pipelining may be good after all.
Distribution: World

In article <2525@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>I had an idea some time ago that I'm surprised I've never seen discussed.
>Suppose, for example, that your instruction processor has four stages. With
>conventional pipelining, that means that four consecutive instructions from
>the same program are at some stage of execution at the same time.
>
>Instead, why not have four different execution threads being performed
>simultaneously? This eliminates the dependency checks and latency delays
>inherent in "vertical" pipelining. (Many RISCs put these into the compiler
>instead of the architecture, but they're still there). On a multi-user
>system with a reasonable load level, it seems to me that this should
>represent a performance improvement. Of course, it won't look good on the
>standard benchmarks.
>
>Frank Adams ihnp4!philabs!pwa-b!mmintl!franka
>Ashton-Tate 52 Oakland Ave North E. Hartford, CT 06108

I seem to agree with the reasoning given by Frank.
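Frank's four-stage example can be made concrete with a toy model. What follows is only a sketch under stated assumptions, not a description of any real processor: the names `simulate`, `STAGES` and `streams` are made up for illustration, there is no forwarding (a dependent instruction waits until its predecessor has left the pipeline), and issue slots are handed out in a fixed round-robin order. With a single stream this behaves like "vertical" pipelining with stalls; with as many streams as stages it behaves like the horizontal scheme.

```python
STAGES = 4  # pipeline depth, taken from Frank's four-stage example


def simulate(streams, stages=STAGES):
    """Issue instructions from `streams` in fixed round-robin order.

    streams[t][i] is True when instruction i of thread t needs the result
    of instruction i-1 of the same thread. Assumption: no forwarding, so a
    dependent instruction cannot issue until `stages` cycles after its
    predecessor issued. Returns (issue_cycles, instructions_issued).
    """
    n_threads = len(streams)
    next_idx = [0] * n_threads        # next instruction to issue, per thread
    last_issue = [None] * n_threads   # cycle of each thread's last issue
    total = sum(len(s) for s in streams)
    issued = 0
    cycle = 0
    while issued < total:
        t = cycle % n_threads         # this cycle's issue slot belongs to thread t
        i = next_idx[t]
        if i < len(streams[t]):
            ready = (
                last_issue[t] is None
                or not streams[t][i]
                or cycle - last_issue[t] >= stages
            )
            if ready:
                next_idx[t] += 1
                last_issue[t] = cycle
                issued += 1
            # otherwise the slot is left empty: a pipeline bubble
        cycle += 1
    return cycle, issued


if __name__ == "__main__":
    # Hypothetical workload: every other instruction depends on its predecessor.
    stream = [i % 2 == 1 for i in range(1000)]
    for label, work in (("vertical", [stream]), ("horizontal", [stream] * STAGES)):
        cycles, done = simulate(work)
        print("%-10s %d instructions in %d issue cycles (%.2f per cycle)"
              % (label, done, cycles, done / cycles))
```

In this model, when the number of threads equals the number of stages, a thread's previous instruction has always left the pipeline by the time its next slot comes around, so no dependency check ever fails and no slot is wasted. The flip side is that each thread gets at most one slot in every `STAGES` cycles, so a stream with few dependencies runs slower than it would alone; a policy that fills only the bubbles would need something smarter than fixed round-robin, which this sketch does not model.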
Many RISCs place great emphasis on software issues such as an efficient compiler, yet it is still common to see 30-50% (a debatable figure) of the pipeline stages doing no fruitful work (pipeline bubbles) because of data dependencies, latency delays, and probably branches that flush the pipeline. If one were to introduce horizontal pipelining, running more than one process in order to fill these pipeline bubbles, I feel that although the execution rate of a single process may go down a little, the overall throughput of the processor would increase dramatically. Think about the hardware utilization available with such a concept. (I cannot quote a source, but I have been told that hardware utilization, in terms of active logic circuits per cycle on a chip, is pretty low.)

This concept may not appeal to designers who want the maximum throughput for a single process, as in supercomputing problems, but it would definitely appeal to designers of general-purpose chips that could be used efficiently for anything from control applications to workstations, which tend to have applications requiring many processes to be run.

Benchmarking such architectures and comparing them with normal RISC/CISC architectures is another big, controversial issue. I have still not been able to figure out how to compare the two, but one way to make the horizontally pipelined architecture look damn good is to compare hardware utilization ratios, or to compare raw instructions executed per (some million) cycles for any process available to be executed.

Most of the comments I have made here are based on our studies here at UT Austin on a computer architecture project that combines the RISC philosophy with the concept of horizontal pipelining to give a hardware-efficient processor with marginal hits on single-process execution rates.

As mentioned by someone earlier on the net, the cache design for such an architecture is the problem, as it has to cache multiple instruction streams.
I feel that with progress in VLSI technology this problem will not be as serious. The Fairchild CLIPPER already has a 4K-byte cache chip, and an 8K-byte cache chip should probably be a decent starting point for a processor with, say, 2-4 processes able to execute concurrently.

Well, I have certainly raised a lot of issues here that many would like to criticize or comment on. Please feel free to do so; this may help us here in our research work.

Rajiv Patel. (rajiv@im4u.utexas.edu)