Path: utzoo!mnetor!uunet!lll-winken!lll-tis!ames!nrl-cmf!cmcl2!yale!lisper
From: lisper@yale.UUCP (Bjorn Lisper)
Newsgroups: comp.arch
Subject: Re: Task synchronization in multiprocessors?
Message-ID: <24462@yale-celray.yale.UUCP>
Date: 4 Mar 88 01:39:31 GMT
References: <317@amelia.nas.nasa.gov> <4530@lll-winken.llnl.gov>
Reply-To: lisper@yale-celray.UUCP (Bjorn Lisper)
Organization: Yale University Computer Science Dept, New Haven CT
Lines: 33

In article <4530@lll-winken.llnl.gov> brooks@lll-crg.llnl.gov.UUCP (Eugene D. Brooks III) writes:
>In article <317@amelia.nas.nasa.gov> fouts@orville.nas.nasa.gov (Marty
>Fouts) writes:
>>I've just finished Stones recent book (High Performance Computer
>>Architecture) which has left me wanting to know more about
>>architectural support for task synchronization in multiprocessors.
>>
>>I'm particularly interested in production MIMD shared memory systems
>>and alternatives to semaphore critical region or barrier
>>synchronization. A large part of the problem with using a semaphore
>>for synchronization is that it introduces a hot spot in the memory
>>system and forces some potentially parallel activities to be serial.
>For a bottleneck free barrier synchronization algorithm, which is heavily
>used on several existing multiprocessors, see "The Butterfly Barrier",
>International Journal of Parallel Programming, Vol. 15, No. 4, Aug 86,
>pp. 295-307. This software algorithm, which is bottleneck free and runs
>in logarithmic time, is critically compared to direct hardware barrier
....

If you are interested in what might come in the future, check Ranade, Bhatt, Johnsson: "The Fluent Abstract Machine", technical report YALEU/DCS/TR-573, CS Dept., Yale University, January 1988.
Abhiram (Ranade) has found a simple way of routing in a distributed system so that memory accesses will, with overwhelming probability, take logarithmic time (in the number of processors) no matter how far the requesting processor is from the processor holding the referenced memory. This makes the machine look like a shared-memory architecture without the congestion bottlenecks such an architecture has if implemented directly.

I think copies can be requested from Ms Susan McBride, mcbride-susan@cs.yale.edu.

Bjorn Lisper
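
PS. For readers who do not have the barrier paper at hand, the general idea of a logarithmic-time barrier with no central counter can be sketched roughly as below. This is only my illustration of the pairwise-exchange pattern, not code from the cited article: the function name butterfly_barrier_demo, the use of Python threads and Event objects, the single-use (non-reusable) flags, and the power-of-two thread count are all assumptions made to keep the sketch short.

```python
import threading


def butterfly_barrier_demo(n_threads=8):
    """Run n_threads workers through one pairwise-exchange barrier.

    Illustrative sketch only; names and structure are assumptions,
    not taken from the cited paper.
    """
    # Assumption: the thread count is a power of two.
    if n_threads < 1 or n_threads & (n_threads - 1):
        raise ValueError("this sketch assumes a power-of-two thread count")

    rounds = n_threads.bit_length() - 1  # log2(n_threads)

    # flags[r][i] is set by thread i when it reaches round r and is
    # waited on only by thread i's partner in that round, so no single
    # location is touched by every thread (no hot spot).
    flags = [[threading.Event() for _ in range(n_threads)]
             for _ in range(rounds)]

    arrived = [False] * n_threads       # written before entering the barrier
    saw_everyone = [False] * n_threads  # checked after leaving the barrier

    def worker(i):
        arrived[i] = True
        for r in range(rounds):
            partner = i ^ (1 << r)    # flip bit r of the thread index
            flags[r][i].set()         # announce arrival at round r
            flags[r][partner].wait()  # wait for the partner's announcement
        # After round r, thread i knows its block of 2**(r+1) threads has
        # arrived; after the last round that block is the whole machine.
        saw_everyone[i] = all(arrived)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return saw_everyone
```

Each thread does log2(n) set/wait pairs, and every flag has exactly one writer and one waiter, which is the property that avoids the semaphore-style hot spot discussed in the quoted text. A reusable barrier would need flags that can be reset safely between episodes, which this sketch leaves out.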