Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Path: utzoo!mnetor!uunet!husc6!cca!g-rh From: g-rh@cca.CCA.COM (Richard Harter) Newsgroups: comp.misc,comp.lang.c Subject: Re: Fast, stable, free, small sorts? Message-ID: <20050@cca.CCA.COM> Date: Mon, 21-Sep-87 17:18:45 EDT Article-I.D.: cca.20050 Posted: Mon Sep 21 17:18:45 1987 Date-Received: Wed, 23-Sep-87 00:37:45 EDT References: <460@naucse.UUCP> <19959@cca.CCA.COM> <461@naucse.UUCP> Reply-To: g-rh@CCA.CCA.COM.UUCP (Richard Harter) Organization: Computer Corp. of America, Cambridge, MA Lines: 79 Summary: Can be done, not easy. Xref: mnetor comp.misc:1273 comp.lang.c:4469 In article <461@naucse.UUCP> sbw@naucse.UUCP (Steve Wampler) writes: > >It was requested that I provide more information on my needs, so: > >I want a stable, minimum storage sort that takes exactly the >same arguments as qsort (section 3 of the UNIX man pages). >It should be reasonably fast (i.e. not N^2). I wish I could help you more. I don't have any references at hand that meet your needs. I could write such a sort, but I won't do it for free, which doesn't help you. The following may be of use: If I'm not missing anything these are the requirments that you want met: (1) Stable (2) At least O(N log^2(N)) in time and preferably better (3) Fixed internal storage usage (O(1) additional storage) (4) Fixed record size specified in input. (5) Comparison function passed through calling sequence. (6) Data in memory, referenced by calling sequence (7) Sort in place. I can think of two ways to meet them, offhand. One is a modified quicksort, sketched out below. The other is a modified mergesort. There is a trick for doing a mergesort in place. I haven't thought it through thoroughly so I will defer that one. Quicksort can be converted into a stable, O(N log N) time, O(1) space sort. The price is added complexity and a larger time factor. The essential elements are (a) altering the partition algorithm to be stable, and (b) forcing the partition element to be the median. Stabilizing the quicksort partition The quicksort partition algorithm is not stable. To make it stable we make a first pass over the data and count the number of elements less than, equal to, and greater than the test value. We now know the boundaries of the three segments. We make a second pass and count the number in each category within each segment. This gives us a total of 9 sub-segments. Each sub-segment is filled from the front, ping-pong style. I.e. there is a separate index into each sub-segment; when we are placing an element we put into the sub-segment corresponding to its segment origin and destination and pop the element contained there-in. This is a stable, O(N) partition. 1) The fact that it takes three passes rather than one does not alter the fact that it is O(N). 2) This is the general partition algorithm for arbitrary test value. 3) Coding this is not trivial. Forcing the median as partition element Quicksort, in recursive form, is O(1) in space for each invocation; however the number of invocations is log N, whereby the frame space ends up being O(log N). When we convert quicksort to nonrecursive form we end up with a stack of begin-end indices of length log(N). We get rid of this stack by forcing the partition element to be the median of the data set. We no longer need the begin-end stack because all begin-end indices may be calculated explicitly. [This is not entirely trivial -- there is some nasty fiddling with odds and evens to do, but it can be done.] We may use the standard 'find-median' algorithm (with the stabilized partition algorithm). This is still O(N) but has a bad worst case. We can improve the worst case by a variant of the median of medians trick (which can't be used directly because of the storage constraint.) Summary: Quicksort can be converted into a stable partition sort which is time O(N log N) on average and additional space O(1). -- In the fields of Hell where the grass grows high Are the graves of dreams allowed to die. Richard Harter, SMDS Inc.