Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!zaphod.mps.ohio-state.edu!mips!orac!rex
From: rex@mips.COM (Rex Di Bona)
Newsgroups: comp.arch
Subject: Re: Bus Partitioning?
Keywords: network async hypercube
Message-ID: <35399@mips.mips.COM>
Date: 2 Feb 90 19:16:35 GMT
References: <1990Jan30.174807.14657@ncsuvx.ncsu.edu> <6960003@hp-and.HP.COM>
Sender: news@mips.COM
Reply-To: rex@mips.COM (Rex Di Bona)
Organization: Basser Department of Computer Science
Lines: 57

In article <6960003@hp-and.HP.COM> panek@hp-and.HP.COM (Jon Panek) writes:
>I think there might be an advantage in taking the inherently simpler
>approach proposed in the basenote.  So far, most of the responses have
>quickly extrapolated to the NxM cross-bar architecture.  While this is
>obviously the most general-purpose and most flexible one, it also incurs
>the highest implementation cost.

Quite true. There are other ways of connecting, such as the perfect shuffle
or its topological equivalents.
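
For anyone who has not met the perfect shuffle, here is a minimal sketch of
the wiring rule. It assumes 2**bits nodes with binary addresses; the function
names are mine, purely for illustration. Node i connects to the node whose
address is i rotated left by one bit.

```python
def perfect_shuffle(node: int, bits: int) -> int:
    """Return the shuffle successor of `node`: its `bits`-bit address
    rotated left by one position (illustrative helper, not from any library)."""
    mask = (1 << bits) - 1
    # Shift left, wrap the old top bit around to the bottom, trim to width.
    return ((node << 1) | (node >> (bits - 1))) & mask


def shuffle_exchange_neighbours(node: int, bits: int) -> tuple[int, int]:
    """Shuffle link plus the usual 'exchange' link (flip the low address bit)."""
    return perfect_shuffle(node, bits), node ^ 1


if __name__ == "__main__":
    for n in range(8):
        print(n, "->", shuffle_exchange_neighbours(n, 3))
```

Following the shuffle link `bits` times brings you back to the starting
node, which is why log2(N) stages are enough to reach any address.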
>
>By having a single linear bus with CPUs and Memory distributed along it
>in sections which can either be connected to the segments on either side
>of it or not, the implementation aspect becomes much more tractable.  One
>obvious result of this is that the scheduler must become much smarter;
>assigning tightly-coupled tasks to physically proximate CPUs.
> 
I have been working on a similar system (not here, but for my PhD
at The University of Sydney), and it is possible, though there are some
problems (of course :-). If you are careful, the bus can be reduced to
a single combinatorial circuit, which is really nice.

>Rather than having single on/off switches to connect segments of busses,
>perhaps a dedicated limited-function CPU could also straddle the boundaries
>and serve as a message-transmitter/receiver across otherwise disconnected
>segments of the bus.  It would grab bus cycles during dead time of the
>main CPUs.  In this way, any CPU could talk with any other CPU, and the
>only penalty would be longer latency for physically disparate boxes.

In this case, why not try to improve the own/release times for the bus, so
that a CPU can talk to others by just grabbing the required segment(s) of
the bus?
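
To make "grabbing the required segment(s)" concrete, here is a rough sketch.
It assumes a linear bus whose segments are numbered 0..n-1, and that a
transfer between segments src and dst needs every segment in between. The
class and method names are hypothetical, and the all-or-nothing grab is just
one possible arbitration policy, not a description of any real hardware.

```python
class SegmentedBus:
    """Toy model of a linear bus split into independently owned segments."""

    def __init__(self, n_segments: int) -> None:
        # owner[s] is the CPU currently holding segment s, or None if free.
        self.owner = [None] * n_segments

    @staticmethod
    def span(src: int, dst: int) -> range:
        """All segments a transfer between src and dst must hold."""
        lo, hi = sorted((src, dst))
        return range(lo, hi + 1)

    def acquire(self, cpu: str, src: int, dst: int) -> bool:
        """Grab the whole span or nothing (assumed policy, avoids partial holds)."""
        needed = self.span(src, dst)
        if any(self.owner[s] is not None for s in needed):
            return False
        for s in needed:
            self.owner[s] = cpu
        return True

    def release(self, cpu: str) -> None:
        """Free every segment held by `cpu`."""
        self.owner = [None if o == cpu else o for o in self.owner]


if __name__ == "__main__":
    bus = SegmentedBus(8)
    print(bus.acquire("A", 0, 2))  # left end of the bus
    print(bus.acquire("B", 5, 7))  # right end, disjoint from A's span
    print(bus.acquire("C", 2, 5))  # overlaps both, so it has to wait
```

Transfers on disjoint spans proceed in parallel, and the cost of a long
transfer is that it locks out everything in between. That is why the
own/release time matters so much.
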
If you are talking about having this limited-function CPU do store and
forward, then you end up either with "async cycles", which raise all the
problems of store-and-forward networks (acknowledgments, lost signals,
etc.; see networking texts for a good list of these problems), or with
long (and I mean long) delays in completing a cycle.
In any case, you will eventually want to make these interconnect
CPUs as powerful as the real CPUs (why waste that board/system space? "We
can just run a small async job" is usually the first argument that will
be raised), and you will end up with either the transputer
array/hypercube (only CPUs talking) or (and this one IS interesting)
a network of nodes (maybe hypercubed), with each node being similar
in design to a Sequent-type multi-CPU backplaned machine.
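
As a back-of-the-envelope illustration of the "long delays" point, here is a
deliberately crude cost model. Everything in it is an assumption of mine: the
parameter names, the idea that each segment costs a fixed time to own, and
the idea that each store-and-forward hop waits a fixed time for a dead bus
cycle before repeating the whole transfer. The numbers are made up.

```python
def circuit_latency(hops: int, acquire: float, transfer: float) -> float:
    """Own every segment on the path (assumed `acquire` each), then do
    a single end-to-end transfer."""
    return hops * acquire + transfer


def store_and_forward_latency(hops: int, wait: float, transfer: float) -> float:
    """Each intermediate CPU waits for a dead cycle (assumed `wait`),
    then repeats the full transfer on its own segment."""
    return hops * (wait + transfer)


if __name__ == "__main__":
    # Illustrative figures only, in arbitrary bus-cycle units.
    for hops in (1, 2, 4, 8):
        print(hops,
              circuit_latency(hops, acquire=1, transfer=10),
              store_and_forward_latency(hops, wait=3, transfer=10))
```

Under these assumptions the store-and-forward cost multiplies the transfer
time by the hop count, while the segment-grabbing cost only adds the
ownership overhead per hop.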
>
>Any Master's candidates looking for a research topic???
>
>Jon P
>panek@hp-and.HP.COM
----
DISCLAIMER: this article concerns work that I have done at The
University of Sydney, Australia. It does NOT refer to any work
that I am doing at MIPS, and should not be taken as an indication
that MIPS is either involved, or not involved, in this area.
(I just wanted to make this clear).  Rex.
-- 
Rex di Bona		Penguin Lust is NOT immoral!
rex@mips.com		apply STD disclaimers here.