Path: utzoo!utgpu!news-server.csri.toronto.edu!clyde.concordia.ca!mcgill-vision!snorkelwacker!apple!usc!samsung!munnari.oz.au!goanna!ok From: ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) Newsgroups: comp.lang.prolog Subject: Re: Prolog's Pitfalls Revisited Message-ID: <3367@goanna.cs.rmit.oz.au> Date: 3 Jul 90 07:09:39 GMT References: <1990Jun26.170835.11037@chaos.cs.brandeis.edu> <3309@usceast.UUCP> Organization: Comp Sci, RMIT, Melbourne, Australia Lines: 53 In article <3309@usceast.UUCP>, mgv@usceast.UUCP (Marco Valtorta) writes: > It seems that Harel does know that quicksort has been known to be inferior > for nearly 30 years :-). More seriously, I do not think that either > Knuth or Sedgewick support the view that quicksort is "inferior." > All experimental data (on collections of "random" arrays) I know > of (admittedly not recent data) points > to quicksort being faster that mergesort on average. The *worst* case behaviour of quicksort is not what I was referring to. What I was referring to is the well-known fact that the average case of quicksort does about 30% more comparisons than the *worst* case of merge sort. This information *IS* in Knuth Volume 3 and in Sedgewick's "Algorithms" (at least the 1st edition, I haven't seen the second). My reference to the known inferiority of quicksort was of course to Hoare's 1961 paper in Computer Journal, where quicksort was first published. He gave the formula there for the number of comparisons done by quicksort, and the number of comparisons done by mergesort was known when the paper was submitted in 1960 to be LESS. Unfortunately, there is a tradition of performing sorting experiments on arrays of integers. This is certainly convenient for coding, because then you can use the language's built-in "<" operator. It is, however, wildly unrealistic if you are NOT sorting things known at compile time to be integers. Now, if you KNOW at compile time that you are sorting integers, then there are sorting algorithms with O(N) average cost (e.g. radix sort). If you don't know that you have integers, e.g. you are sorting strings, then the cost of comparisons dominates, and quick sort loses *badly*. Sedgewick's paper in Acta Informatica contains detailed cost formulas. It turns out that to get the favourable result for quicksort that everyone has taken as Gospel, he assumes that a comparison costs 1/4 as much as a memory reference. In his dissertation, he was looking at code on a /370 or that kind of machine, and charging a CR instruction as 1/4 of an L instruction is not unreasonable. But that ONLY applies when you know at compile time that you are sorting integers. It does not apply when you are sorting strings or atoms or doing a procedure call for your comparison. > I would like to know of references to data that supports the opposite view. Look at any paper that compares quicksort and merge sort for anything but integers. It's worth pointing out that in a language which lacks arrays (such as most Prologs and many functional languages) the space advantage of quick sort does not exist in any form at all, and if you are sorting variable- length records, there is no space advantage for quicksort in conventional languages either. -- Science is all about asking the right questions. | ok@goanna.cs.rmit.oz.au I'm afraid you just asked one of the wrong ones. | (quote from Playfair)