Path: utzoo!attcan!uunet!husc6!bbn!bbn.com!emiller
From: emiller@bbn.com (ethan miller)
Newsgroups: comp.arch
Subject: Re: Is Shared Memory Necessary?
Message-ID: <24877@bbn.COM>
Date: 23 May 88 15:09:24 GMT
References: <685@thalia.rice.edu> <43700039@uicsrd.csrd.uiuc.edu> <501@cmx.npac.syr.edu>
Sender: news@bbn.COM
Reply-To: emiller@bbn.com (ethan miller)
Organization: BBN Labs (Cambridge, MA)
Lines: 46

In article <501@cmx.npac.syr.edu> billo@cmx.npac.syr.edu (Bill O'Farrell) writes:
->In article <43700039@uicsrd.csrd.uiuc.edu> turner@uicsrd.csrd.uiuc.edu writes:
->
->>Why is it that everyone seems to assume that machines must either have
->>shared memory OR distributed memory, never a little of both?  From my
->>point of view I would like to see a machine with: fast local memory;
->>slower, but deeply pipelined, shared memory; and a thin global sync
->>bus (for barrier sync).  We have had memory hierarchies for decades
->>now, why should they cease to be useful now?
->
->I know of one architecture that has both local memory and globally
->shared memory -- the BBN Butterfly.  As I understand it (correct me if
->I'm wrong), the processing nodes on the Butterfly each have memory
->which is local in the sense that it can be accessed very quickly by
->the local processor, but which is globally shared, in the sense that
->all the other processors can access it too, via the big global switch.
->Architecturally, each node "thinks" it has a very big globally shared
->memory, but software can take advantage of fast local access by
->arranging for each processor's most frequently accessed data to be in
->that portion of memory which is truly local.
->
->Hmm... I hope I've got that right.
->
->Bill O'Farrell, Northeast Parallel Architectures Center at Syracuse University
->(billo@cmx.npac.syr.edu)

The description of the Butterfly is essentially correct.  One thing the
machine lacks is memory that is not associated with any processor node.  A
single longword (32-bit) read or write to another processor's memory takes
about 4 microseconds, while a local access is much faster (~250 ns (?)).
A nice addition, which I think "turner" is referring to, is an intermediate
level: a memory accessible with the same delay from all nodes, alongside the
local memory on each node.  The Butterfly could do this with memory-only
nodes, though I don't know of any plans for those (I work for a different
part of BBN).
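
To get a feel for how much data placement matters with those two delays,
here is a rough sketch of the mean access time as a function of the fraction
of references that stay local.  The two constants are the approximate figures
quoted above (the local one is uncertain); the function name and the simple
weighted-average model, which ignores switch contention, are assumptions made
for illustration only.

```python
# Approximate figures from the post; the local value was marked uncertain there.
LOCAL_NS = 250.0    # local memory access, nanoseconds
REMOTE_NS = 4000.0  # remote longword access through the switch, nanoseconds


def mean_access_ns(local_fraction, local_ns=LOCAL_NS, remote_ns=REMOTE_NS):
    """Weighted-average access time when `local_fraction` of references are local.

    Hypothetical model: no contention, every access is a single longword.
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


if __name__ == "__main__":
    for f in (0.0, 0.5, 0.9, 0.99, 1.0):
        print(f"{f:4.0%} local -> {mean_access_ns(f):7.1f} ns")
```

Under this model, even with 90% of references local the mean is roughly
625 ns, about two and a half times the local delay, which is why arranging
for frequently used data to sit in the node's own memory pays off.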

Essentially, then, the Butterfly has a distributed-only memory model,
although accesses may SEEM to go to a globally shared memory.  Memory
transfers between nodes are handled by the PNC, which is a microcoded
processor.  Global sharing is an illusion created by this processor.
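
The idea can be pictured with a toy model: one flat address space in which
every word actually lives on some node, and each access is either served
locally or forwarded to the owning node.  This is only a sketch of the
concept.  The class names, the address split (node number = address divided
by words per node), and the "local"/"remote" labels are all hypothetical and
do not describe the real PNC microcode or the Butterfly's address format.

```python
class Node:
    """One processor node with its own block of memory (hypothetical model)."""

    def __init__(self, node_id, words):
        self.node_id = node_id
        self.memory = [0] * words


class Machine:
    """Flat global address space built entirely from per-node memories."""

    def __init__(self, n_nodes, words_per_node):
        self.words_per_node = words_per_node
        self.nodes = [Node(i, words_per_node) for i in range(n_nodes)]

    def _locate(self, global_addr):
        # Assumed layout: consecutive blocks of addresses belong to consecutive nodes.
        owner, offset = divmod(global_addr, self.words_per_node)
        if not 0 <= owner < len(self.nodes):
            raise IndexError("address outside the global address space")
        return owner, offset

    def read(self, from_node, global_addr):
        """Return (value, kind); kind says whether the access stayed on from_node."""
        owner, offset = self._locate(global_addr)
        kind = "local" if owner == from_node else "remote"
        return self.nodes[owner].memory[offset], kind

    def write(self, from_node, global_addr, value):
        """Store value at global_addr and return 'local' or 'remote'."""
        owner, offset = self._locate(global_addr)
        self.nodes[owner].memory[offset] = value
        return "local" if owner == from_node else "remote"


if __name__ == "__main__":
    m = Machine(n_nodes=4, words_per_node=1024)
    addr = 3 * 1024 + 5               # a word that lives on node 3
    print(m.write(0, addr, 42))       # node 0 writes it through the switch
    print(m.read(3, addr))            # node 3 reads the same word from its own memory
```

The program sees a single address space in both calls; only the cost of the
access differs, depending on which node issues it.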

ethan
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
ethan miller              | "Quod erat demonstrandum, baby."
BBN Laboratories          | "Oooh, you speak French!"
ARPAnet : emiller@bbn.com |
PHONEnet: (617) 873-3091  | Disclaimer: It's MY opinion, not BBN's.