Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Path: utzoo!mnetor!uunet!husc6!sri-unix!quintus!ok From: ok@quintus.UUCP (Richard A. O'Keefe) Newsgroups: comp.lang.c Subject: Re: sorting (qsort) Message-ID: <265@cresswell.quintus.UUCP> Date: Mon, 23-Nov-87 20:19:48 EST Article-I.D.: cresswel.265 Posted: Mon Nov 23 20:19:48 1987 Date-Received: Thu, 26-Nov-87 20:16:40 EST References: <10491@brl-adm.ARPA> Organization: Quintus Computer Systems, Mountain View, CA Lines: 136 Summary: Party line In article <10491@brl-adm.ARPA>, bard@THEORY.LCS.MIT.EDU (Bard Bloom) replied to my question about the wide-spread (mis)use of "quick"sort. He starts by saying that > Quicksort -- if you code it just right -- has a very nice inner loop: This is wildly, ludicrously, hilariously IRRELEVANT to generic sorting algorithms. What do I mean by a "generic" sorting algorithm? I mean one like the qsort() function in the ANSCI C draft which takes a user- supplied function, or one like the sort(BU_CMD) command in the SVID which takes a user-supplied sort key specification. In fact, even if you are sorting strings, the book-keeping in the inner loop is not of much interest. There are about five different flavours of "oh" notation. The one I meant was the one where f(n) = o(g(n)) if lim[n->infinity] ((f(n)-g(n))/g(n)) = 0 that is, where f(n) is approximately equal to g(n) WHERE THE CONSTANT FACTOR IS SIGNIFICANT. I'm afraid this keyboard hasn't got thetas or omegas, so I left it to common sense to work out what I meant. Let me give you part of a table from Kurt Melhorn's "Data Structures and Algorithms 1: Sorting and Searching", Springer-Verlag. The table is on page 66. I have omitted the column for A-sort, and the row for run-time (because it is expressed in rather odd units, and makes the inappropriate assumption that comparisons are cheap). Single Maximum HeapSort QuickSort MergeSort selection # of comparisons+------------+------------+-------------+----------+ worst case | 0.5 n**2 | 2 n lg n | 0.5 n**2 | n lg n | average case | 0.5 n**2 |~ 2 n lg n | 1.44 n lg n | n lg n | +------------+------------+-------------+----------+ storage | n | n | n + lg n | 2n | requirement | + const | + const | + const | + const | +------------+------------+-------------+----------+ access | random | random | random | sq'ntial | +------------+------------+-------------+----------+ stable? | no | no | no | yes | +------------+------------+-------------+----------+ The constant factors MATTER here. The average case of heapsort, quicksort, and merge sort is O(NlgN), BUT heapsort is twice as costly as merge sort and quicksort is 1.44 times as costly. The "storage requirement" line is misleading. Suppose you are sorting /etc/passwd on a PDP-11. This is a typical application of a sorting routine; I can count the number of times I have sorted a sequence of numbers -- other than to test a sorting routine -- in the last eight years on the fingers of no hands, but I sort files of text every day. The /etc/passwd file here has 76 lines for a total of 4458 bytes. The overhead to enable merge-sort (on a PDP-11) is thus 76*sizeof (short) bytes, which comes to 3%. Trading 3% more space for 30% to 40% less time seems like a good bargain to me. More generally, if you are sorting a sequence of varying-length strings, you have to sort an array of pointers if you want to use qsort(), and that's N locations of overhead, whereas storing links with the strings so you can sort a list costs you, oh my goodness, exactly N locations of overhead. At least for applications where you are sorting variable-sized objects, the alleged space advantage of "quick" sort turns out to be bogus too. Bloom quotes Sedgewick as advising that "if one is not willing to invest the effort to be sure that a Quicksort implementation is not flawed, Shellsort is a much safer choice." True. But also according to Sedgewick {page 98} "The functional form of the running time of Shellsort is not even known .... two conjectures are (lg N)**2 * N and N**1.25." I don't know what YOU think about a book which recommends a complicated unanalysed method like Shellsort when there is a trivially simple method (mergesort) which outperforms it -- not only in order of magnitude but in constant factor -- but I know what *I* think. Bloom says > Most people don't worry about the constant factors, because they > vary a lot with the model of computation. When you are counting the number of times that some operation is done, the model of computation has nothing to do with the case, tra-la. This may be the heart of the matter: if "most people don't worry about the constant factors" that may explain why the qsort() routine in SunOS 3.2 was (31/19 - 1) = 63% slower at sorting an array of strings than a naive implementation of merge sort in one particular test (using strcmp() as the comparison function). I expect the factor to be higher on VAXen, where procedure calls are much more expensive, and less on SPARCs, where procedure calls should be cheaper. I have no reason to believe that the qsort() routine in SunOS 3.2 is not identical to either the 4.2BSD or the SYS V.2 one, and I doubt that Bloom has either. If the sorting routine in UNIX isn't "carefully tweaked", it ---- well should be after all this time. But perhaps nobody cared about constant factors. [I hasten to add that in general I have the greatest admiration for the SUN operating system and for their machines, and that I expect that they copied what they were told was the best code.] The fact that the AVERAGE case of "quick" sort does about 1.4 times as many comparisons as the WORST case of merge sort has been known ever since quicksort was analysed. The fact that the cost of calling a user-supplied comparison routine completely swamps any reasonable book-keeping has been obvious since the beginning. The conclusion which follows from this is that if the comparisons done by a sorting routine are single in-line machine instructions comparable in cost to an add or move, then and only then quicksort is a good method otherwise merge sort beats the pants off quicksort However, this conclusion seems not to have been drawn. Over the years I have measured the speeds of heapsort, quicksort, and merge sort on DEC-10s (using Simula, Pop-2, and Prolog), PDP-11s and VAX-11s (using C) and 680x0s (using C and Prolog). There are wide variations between languages and machines, but the ratio heapsort : quicksort : mergesort :: 2 : 1.4 : 1 holds reasonably well within each machine/language combination. I'm afraid that Bard Bloom has not provided evidence to refute this conclusion, but has merely repeated the Party line. The questions remain: (1) does anyone in net-land know of any references which purport to show that quick-sort does fewer comparisons than merge-sort? (2) does anyone in net-land know of any studies comparing actual implementations of quick-sort and merge-sort where the comparisons were not single in-line machine instructions but were procedure calls in which quick-sort outperformed merge-sort? (3) has anyone got an ancedote about an application which was originally coded using merge-sort but had to be recoded usin quick-sort (or any other in-place sort) to fit into a small machine? (4) does anyone know of a general-purpose sorting algorithm which consistently out-performs merge-sort? If anyone wants to conduct their own studies, DON'T try your sorting routine out on an array of numbers. I suggest random permutations of random subsets of /usr/dict/words as being more realistic. I think this discussion does belong in *.lang.c, because presumably people writing in C care about efficiency and portability, and if it turns out (a) that qsort() is often implemented as some version of quick-sort and (b) that quick-sort is as bad as I claim, then a lot of us will be looking for something better to use.