Path: utzoo!attcan!uunet!husc6!rice!titan!retrac
From: retrac@titan.rice.edu (John Carter)
Newsgroups: comp.arch
Subject: Re: Is Shared Memory Necessary?
Message-ID: <685@thalia.rice.edu>
Date: 10 May 88 16:31:01 GMT
References: <503@xios.XIOS.UUCP> <2676@pdn.UUCP> <674@cernvax.UUCP> <9559@sol.ARPA>
Sender: usenet@rice.edu
Reply-To: retrac@rice.edu (John Carter)
Organization: Rice University, Houston
Lines: 39

In article <9559@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:
>In article <674@cernvax.UUCP> hjm@cernvax.UUCP (Hubert Matthews) writes:
>>Surely the highest bandwidth is achieved
>>when each processor has its own memory which it shares with no one else?  It
>>also makes the hardware a lot smaller. ... shared-memory is not necessary;
>>it's a software issue that shouldn't be solved in hardware.
>
>Yes, the highest bandwidth is achieved when each processor has exclusive
>memory.  However, processes on different processors must still communicate with
>each other.  Non-shared memory communication typically costs two orders of
>magnitude more than shared memory communication.  What's worse, even when
>processes are on the same processor, software engineering issues often require
>that they communicate via the same slow inter-processor mechanism.  So the
>fastest time to solve a problem may well lie with shared memory even though its
>bandwidth is lower.  Communication is a software issue.  Making it cheap
>requires hardware.  My conclusion is that shared-memory is not necessary, but
>that it is worth its cost.

I think that it's currently worth the cost, but I doubt seriously that
it'll be a realistic approach as we try to increase the number of
processors beyond several dozen.  A shared (hardware) memory that needed
to handle thousands of processors would be much slower than a conventional
memory (how much slower would depend on the actual architecture and what
decisions you made about the semantics of memory access).
The RP-3 project at IBM is doing some interesting work on large shared
memory architectures.  Their work aside, I think that you should be able
to get "optimal" performance by providing very fast distributed memory
communication, hardware support for IPC (interprocess[or] communication),
intelligent/clever communication protocols, and intelligent compilers that
can schedule the IPC so as to remove or reduce the delay associated with
waiting for remote memory access.

Kai Li (at Princeton, I believe) has done some very good work on
implementing shared memory on top of a distributed memory machine/system
(i.e., the architecture is distributed memory, but the programmer's view
is of shared memory).

John Carter               Internet: retrac@rice.edu
Dept of Computer Science  CSNET:    retrac@rice.edu
Rice University           UUCP:     {internet node or backbone}!rice!retrac
Houston, TX