Path: utzoo!attcan!uunet!husc6!bbn!bbn.com!emiller
From: emiller@bbn.com (ethan miller)
Newsgroups: comp.arch
Subject: Re: Is Shared Memory Necessary?
Message-ID: <24877@bbn.COM>
Date: 23 May 88 15:09:24 GMT
References: <685@thalia.rice.edu> <43700039@uicsrd.csrd.uiuc.edu> <501@cmx.npac.syr.edu>
Sender: news@bbn.COM
Reply-To: emiller@bbn.com (ethan miller)
Organization: BBN Labs (Cambridge, MA)
Lines: 46

In article <501@cmx.npac.syr.edu> billo@cmx.npac.syr.edu (Bill O'Farrell) writes:
->In article <43700039@uicsrd.csrd.uiuc.edu> turner@uicsrd.csrd.uiuc.edu writes:
->
->>Why is it that everyone seems to assume that machines must either have
->>shared memory OR distributed memory, never a little of both? From my
->>point of view I would like to see a machine with: fast local memory;
->>slower, but deeply pipelined, shared memory; and a thin global sync
->>bus (for barrier sync). We have had memory hierarchies for decades
->>now, why should they cease to be useful now?
->
->I know of one architecture that has both local memory and globally
->shared memory -- the BBN Butterfly. As I understand it (correct me if
->I'm wrong), the processing nodes on the Butterfly each have memory
->which is local in the sense that it can be accessed very quickly by
->the local processor, but which is globally shared, in the sense that
->all the other processors can access it too, via the big global switch.
->Architecturally, each node "thinks" it has a very big globally shared
->memory, but software can take advantage of fast local access by
->arranging for each processor's most frequently accessed data to be in
->that portion of memory which is truly local.
->
->Hmm... I hope I've got that right.
->
->Bill O'Farrell, Northeast Parallel Architectures Center at Syracuse University
->(billo@cmx.npac.syr.edu)

The description of the Butterfly is essentially correct. One element
that it lacks is memory that is not associated with any processor node.
The delay for a single longword (32-bit) read or write to another
processor's memory is about 4 microseconds, while access to local
memory is very fast (~250 ns (?)).

A nice addition, which I think "turner" is referring to, is an
intermediate level: a memory accessible with the same delay from all
nodes, along with local memory for each node. The Butterfly could do
this with memory-only nodes, though I don't know of any plans for those
(I work for a different part of BBN).

Essentially, then, the Butterfly has a distributed-only memory model,
but accesses may SEEM as if the memory is globally shared. Memory
transfer between nodes is handled by the PNC, which is a microcoded
processor. Global sharing is an illusion created by this processor.

ethan
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
ethan miller                | "Quod erat demonstrandum, baby."
BBN Laboratories            | "Oooh, you speak French!"
ARPAnet : emiller@bbn.com   |
PHONEnet: (617) 873-3091    | Disclaimer: It's MY opinion, not BBN's.
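
To put rough numbers on the local/remote gap described in the post, here
is a minimal sketch in Python (not Butterfly code). It assumes a simple
weighted-average model with no switch contention. The names
mean_access_ns and local_fraction are made up for illustration, and the
250 ns local figure is the one the post itself marks with "(?)".

```python
# Rough figures taken from the post; the local number is uncertain there.
REMOTE_NS = 4000.0  # about 4 microseconds for a remote longword access
LOCAL_NS = 250.0    # roughly 250 ns for a local access (marked "(?)")


def mean_access_ns(local_fraction, local_ns=LOCAL_NS, remote_ns=REMOTE_NS):
    """Average time per longword access when `local_fraction` of all
    accesses go to the node's own memory and the rest cross the switch.

    Hypothetical model: a plain weighted average, ignoring contention.
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


if __name__ == "__main__":
    # Show how the average moves as more of the working set is kept local.
    for f in (0.0, 0.5, 0.9, 0.99, 1.0):
        print(f"{f:4.0%} local -> {mean_access_ns(f):7.1f} ns per access")
```

Under these assumptions, even with 90% of accesses local the average is
about 625 ns, or 2.5 times the local figure, which is why arranging for
frequently accessed data to be local matters so much on a machine like
this.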