Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!wuarchive!emory!hubcap!toulouse From: toulouse@iro.umontreal.ca (Michel Toulouse) Newsgroups: comp.parallel Subject: superlinear speedup Message-ID: <11767@hubcap.clemson.edu> Date: 21 Nov 90 14:01:18 GMT Sender: fpst@hubcap.clemson.edu Organization: Universite de Montreal Lines: 431 Approved: parallel@hubcap.clemson.edu Two weeks ago, I sent a message on this news group asking for references about papers claiming superlinear speedup. Since then, I have received many references and also enjoy the reading of very interesting comments. My first plan was to take a close look to every papers and add my owns comments. But this may take a lost of time and many of those who send me references and comments did manifest their interest in receiving this list of references. So I do it now with this simple comment. The traditional argument against superlinear speedup is that "if a processus Q is running on p processors for t time, then there is a sequential machine that will simulate Q in at most (kpt) + k' time steps for k and k' being constants. So the speedup is by a factor of p only". As it was mention in many comments, there is superlinear based on avantages like more physical memory in the parallel machine or obtained by heuristically guided search. They are not in my point of view convincing arguments against the traditional scepticism. In the first case it is clearly not convincing since you can always add more memory to the sequential machine. In the second case if you partition the search space equally among the p processors, you can achieve very big speedup (as mention in one of the comments I received). But I think this, from a theoretical point of view, is meaningless since the speedup depend only on factors like the partition of the search space, the heuristic used, your random function, etc. However, relate to heuristics in search space, this paper: "The performance of cooperative processes", B.A. Huberman, Physica D42, 1990 38--47, claim superlinear speedup not based on partition of search space but on "hints" that cooperative processes might used to eliminate ininteresting subspaces. The paper give a formal explanation to this possible superlinear speedup. Will it be a counter-argument to the argument above, I don't know for the moment. The volume 42 of Physica D concentrate only on nonlinear phenomena, the discussion about emergent computation is traited extensively, I found many interesting papers, one about a complete characterizations of Turing machines with a restrained number of states (no problem you still can teach that the halting problem is undecidable), also many papers on neural computation. I think this review of Physica over emergent computation is a very serious work (as it is almost always the case in this Journal). At least (for me) this number introduce an approach to parallel computation that might challenge more seriously the argument against superlinear speedup. As I said before, I didn't have time to look at every paper, I'm anxious to find challenger to the traditional argument. I thank's very much every one who send me references and comments, I'm still very interesting to received any comments on this subject or relate one like emergent computation. Michel Toulouse Universite de Montreal email:toulouse@iro.umontreal.ca ------------------------------------------------------------------------- @inproceedings{RAO88fsttcs, AUTHOR = "Nageshwara Rao, V. and Kumar, Vipin", TITLE = "Superlinear Speedup in State-Space Search", BOOKTITLE = "Proceedings of the 1988 Foundation of Software Technology and Theoretcal Computer Science", YEAR = "December 1988", Note = "Lecture Notes in Computer Science number 338, Springer Verlag" } @techreport{RAO90suplin, AUTHOR = "V. Nageshwara Rao and Vipin Kumar", TITLE = "On the Efficicency of Parallel Depth-First Search", YEAR = "1990", INSTITUTION = "Tech Report 89-41, Computer Science Department, University of Minnesota", NOTE = "Submitted for publication" } ------------------------------------------------------------------------- @article{lihudak89 ,key={LiKai} ,author={Li, K. and Hudak, P.} ,title={Memory Coherence in Shared Virtual Memory Systems} ,journal={ACM Transactions on Computer Systems} ,volume=7 ,number=4 ,month=Nov ,year=1989 ,pages={321-359} } @phdthesis{liphd ,key={li} ,author={Li, K.} ,title={Shared Virtual Memory on Loosely Coupled Multiprocessors} ,school={Yale University, Department of Computer Science} ,year=1986 } ------------------------------------------------------------------------- Charlie McDowell and myself (David Helmbold) have a paper in the IEEE transactions on Parallel and Distributed Systems, Vol 1, No. 2 (april 1990) titled "Modeling Speedup (n) greater than n. Our references include: @techreport{sandia, AUTHOR = "Benner, R.E. and J.L. Gustafson and R.E. Montry", TITLE = "Development and analysis of scientific application programs on a 1024-processor hypercube", INSTITUTION = "Sandia National Laboratories", YEAR = "1988", MONTH = "February", NUMBER = "SAND 88-0317", ADDRESS = "Albuquerque, N.M.", note = ""} @article{Gus88, AUTHOR = "Gustafson, J.L.", TITLE = "Reevaluating {A}mdahl's Law", JOURNAL = "CACM", YEAR = "1988", volume = "31", number = "5", pages = "532-533", month = "May", note = ""} @article{EZL89, AUTHOR = "D.L. Eager and J. Zahorjan and E.D. Lazowska", TITLE = "Speedup versus Efficiency in Parallel Systems", JOURNAL = "IEEE Trans. on Comp.", YEAR = "1989", volume = "38", number = "3", pages = "408-423", month = "March", note = ""} @inproceedings{SF89, AUTHOR = "C.B. Stunkel and W.K. Fuchs", TITLE = "Analysis of Hypercube cache performance using address traces generated by {TRAPEDS}", BOOKTITLE = "Proc. 1989 International Conference on Parallel Processing", YEAR = "1989", pages = "I-33--I-40", note = ""} % @article{LS84, AUTHOR = "T. Lai and S. Sahni", TITLE = "Anomalies in parallel branch-and-bound algorithms", JOURNAL = "CACM", YEAR = "1984", volume = "27", number = "6", pages = "594-602", month = "June", note = ""} % Ohio State and Univ of Minnesota @article{Wei82, AUTHOR = "B. W. Weide", TITLE = "Modeling unusual behavior of parallel algorithms", JOURNAL = "IEEE Trans. on Computers", YEAR = "1982", volume = "C-31", number = "11", pages = "1126-1130", month = "November", note = ""} % Ohio State @article{LW86, AUTHOR = "G.-J. Li and B. W. Wah", TITLE = "Coping with anomalies in parallel branch-and-bound algorithms", JOURNAL = "IEEE Trans. on Comp.", YEAR = "1986", volume = "C-35", number = "6", pages = "568-573", month = "June", note = ""} % was Purdue now Univ of Illinois @inproceedings{PH88, AUTHOR = "B. R. Preiss and V. C. Hamacher", TITLE = "Semi-static dataflow", BOOKTITLE = "Proc. 1988 Inter. Conf. on Parallel Processing", YEAR = "1988", pages = "127-134", note = ""} % Univ of Waterloo and Univ of Toronto @article{FLW86, AUTHOR = "V. Faber and O. M. Lubeck and A. B. White Jr.", TITLE = "Superlinear speedup of an efficient sequential algorithm is not possible", JOURNAL = "Parallel Computing", YEAR = "1986", volume = "3", pages = "259-260", note = ""} % LANL @article{Par86, AUTHOR = "D. Parkinson", TITLE = "Parallel efficiency can be greater than unity", JOURNAL = "Parallel Computing", YEAR = "1986", volume = "3", pages = "261-262", note = ""} % DAP support unit, Queen Mary College, London @book{GSS87, AUTHOR = "E. F. Gehringer and D. P Siewiorek and Z. Segall", TITLE = "Parallel processing, the {Cm}* experience", PUBLISHER = "Digital press", pages = "227-231", YEAR = "1987"} @inproceedings{Li88, AUTHOR = "K. Li", TITLE = "{IVY}: a shared virtual memory system for parallel computing", BOOKTITLE = "Proc. 1988 Inter. Conf. on Parallel Processing", YEAR = "1988", pages = "94-101", note = ""} % Princeton @inproceedings{MG85, AUTHOR = "R. Mehrotra and E. F. Gehringer", TITLE = "Superlinear speedup through randomized algorithms", BOOKTITLE = "Proc. 1985 Inter. Conf. on Parallel Processing", YEAR = "1985", pages = "291-300", note = ""} % North Carolina State @article{Kor82, AUTHOR = "W. A. Kornfeld", TITLE = "Combinatorially implosive algorithms", JOURNAL = "CACM", YEAR = "1982", volume = "25", number = "10", pages = "734-738", month = "October", note = ""} % MIT @techreport{Fla87, AUTHOR = "H. P. Flatt", TITLE = "Further comments on a model for parallel processing", INSTITUTION = "IBM PASC G320-3503", YEAR = "1987", note = ""} @article{Miy88, AUTHOR = "E. Miya", TITLE = "Suggestion on Superlinear speed up terminology", JOURNAL = "Network news posting", YEAR = "1988", volume = "comp.parallel", number = "318", month = "December", note = ""} % NASA AMES @inproceedings{Smi81, AUTHOR = "B. J. Smith", TITLE = "Architecture and applications of the {HEP} multiprocessor computer", BOOKTITLE = "Real-Time signal processing IV", YEAR = "1981", volume = "298", pages = "241-248", note = ""} % @article{San86, AUTHOR = "J. Sanguinetti", TITLE = "Performance of a message-based multiprocessor", JOURNAL = "Computer", YEAR = "1986", volume = "19", number = "9", month = "September", pages = "47-55", note = ""} % just slightly superunity on an {ELXSI}, due to reduced {OS} overhead @article{Min70, AUTHOR = "M. Minsky", TITLE = "Form and Content in Computer Science, {ACM Turing Lecture}", JOURNAL = "J. ACM", YEAR = "1970", volume = "17", pages = "197-215", note = ""} % the log(n) Minsky conjecture @article{Lip88, AUTHOR = "J. Lipkis", TITLE = "Superlinear performance of a monte carlo code on the {NYU} {Ultracomputer}", JOURNAL = "Personal communication", YEAR = "1988", note = ""} % @article{Bar88, AUTHOR = "J. Barton", TITLE = "Superlinear performance of the {SAXPY} kernel on the {Silicon} {Graphics} {4D120/GTX}", JOURNAL = "Personal communication", YEAR = "1988", note = ""} % @article{GM85, AUTHOR = "D. H. Grit and J. R. McGraw", TITLE = "Programming Divide and Conquer for a {MIMD} Machine", JOURNAL = "Software Practice and Experience", YEAR = "1985", month = " January", volume = "15", number = "1", pages = "41-53", note = ""} @article{Amd67, AUTHOR = "G. Amdahl", TITLE = "Validity of the single processor approach to achieving large scale computing capabilities", JOURNAL = "Proceedings AFIPS Comp. Conf.", YEAR = "1967", volume = "30", pages = "483-485", note = ""} ------------------------------------------------------------------------- \bibitem{Kuma88c} Kumar, Vipin, Ramesh, K., and Nageshwara, Rao V., ``Parallel Best-First Search of State-Space Graphs: A Summary of Results", {\em AAAI 1988} : 122-127. \bibitem{Kuma89a} Kumar, Vipin, and Nageshwara, Rao V., ``Load Balancing on the Hypercube Architecture", {\em Proc. 4th Conf. on Hypercubes, Concurrent Computers and Applications}, March 1989. \bibitem{Nels90b} Nelson, Peter C., and Toptsis, Anestis A., ``Artificial Intelligence, Parallel Processing, and Bidirectional Heuristic Search", Submitted \bibitem{Nels90c} Nelson, Peter C., and Toptsis, Anestis A., ``Superlinear Speedup in Multiprocessor Bidirectional Heuristic State-Space Search", Submitted \bibitem{Nels90d} Nelson, Peter C., and Toptsis, Anestis A., ``Wave-Shaping in Parallel Bidirectional Heuristic State-Space Search", Submitted \bibitem{Rao87a} Rao, Nageshwara, V., Kumar, Vipin, and Ramesh, K., ``A Parallel Implementation of Iterative-Deepening-A*", {\em AAAI 1987}, 178-182. \bibitem{Rao87d} Rao, Nageshwara, V., and Kumar, Vipin, and Ramesh, K., ``Parallel Heuristic Search on a Shared Memory Multiprocessor", {\em Tech. Report. AI Lab, TR-87-45}, Univ. of Texas at Austin, January 1987. \bibitem{Rao88b} Rao, Nageshwara, V., and Kumar, Vipin, ``Concurrent Insertions and Deletions in a Priority Queue", {\em 1988 Inter. Conf. on Parallel Processing}, Vol. 3, (1988) : 207-210. ------------------------------------------------------------------------- In the Conpar90-Vapp 4 Proceedings you'll find M.Cosnard, J.L-Philippe Achieving Superlinear Speedups for the Multiple Polinomial Quadratic Sieve Factoring Algorithm on a Distributed Memory Multiprocessor p.863 - 874 The Proceedings have been printed at Springer Verlag, Germany and may be ordered by bookshops or with a 30% reduction directly at ********************************************************* * Institut fuer Informatik | phone : +41 61 321 99 67 * * Mittlere Strasse 142 | fax : +41 61 321 99 15 * * 4000 Basel | email : waser@urz.unibas.ch * * Switzerland * *************************************************************** ------------------------------------------------------------------------- Li & Wah "How to cope with anomalies in parallel approximate branch-andf-bound search". AAAI-84 (National Conf. on AI). McBurney & Sleep "Transputer based experimanets with the ZAPP architecture" In PARLE-87 (Parallel Architectures and Languages, Europe) conference, Springer LNCS 258. The former is a theoretical analysis, the latter gives some practical results. In both cases the superlinear is speedup is due to a single processor doing a long search of a tree with no solutions, while in a multiprocessor system this tree is searched in parallel with another in which a solution is easily found. ------------------------------------------------------------------------- David P. Helmbold, and Charles E. McDowell,"Modeling Speedup (n) Greater than n", IEEE Trans. on Parallel and Distributed Sys., Apr. 1990, pp. 250 - 256.