Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Path: utzoo!mnetor!uunet!seismo!sundc!pitstop!sun!quintus!ok From: ok@quintus.UUCP (Richard A. O'Keefe) Newsgroups: comp.lang.c Subject: sorting (qsort) Message-ID: <187@cresswell.quintus.UUCP> Date: Sun, 15-Nov-87 20:04:32 EST Article-I.D.: cresswel.187 Posted: Sun Nov 15 20:04:32 1987 Date-Received: Tue, 17-Nov-87 01:55:15 EST Organization: Quintus Computer Systems, Mountain View, CA Lines: 45 Keywords: sorting library The UNIX library includes qsort(3), and explicitly says that "qsort is an implementation of the quicker-sort algorithm". SAS Lattice C for IBM VM/CMS and MVS also claims to use "quicksort". Some microcomputer Cs also use some version of quicksort. My question is: do these people know something about quicksort that I don't? Contrast merge sort and quicksort: (1') merge sort does o(NlgN) comparisons in the WORST case. (2') merge sort can be coded to take o(N) comparisons when the input is already in order. (3') merge sort is stable, so it is good for multi-key sorting. (4') merge sort can be coded easily in a non-recursive language, using a stack of o(lgN) pointers. (5') merge sort is well adapted to sorting lists or permutation vectors. (6') to sort N elements, merge sort needs N additional words for links, but if you have allocated your records dynamically you probably want the links anyway. (1") quick sort does o(1.4NlgN) comparisons on the AVERAGE. In the worst case it does O(N**2) comparisons. (2") quick sort cannot exploit existing order; in fact it has to go to some trouble to ignore it. (3") quick sort is not stable, so multi-key sorting has to be done by writing a combined comparison. (4") quick sort can be coded easily in a non-recursive language, using a stack of o(2lgN) pointers. (5") quick sort is not well adapted to sorting lists. (6") quick sort doesn't need the links. On the other hand, if the data items are large, you would be silly to sort the array in place, and would sort an array of pointers instead, so you are generally paying for the links anyway. Actual measurements seem to bear this out: a naive implementation of merge sort on a SUN-3/50 running SunOs 3.2 took 19NlgN usec to sort an array of N strings using strcmp(), whereas qsort took 31NlgN usec: this is roughly the 1:1.4 ratio predicted by the number of comparisons. Have C implementors been hypnotised by the magic word "quick" in the name of "quick" sort, or am I missing something? 30% overhead seems a lot to pay just for a name.