Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!apple!apple.com!desnoyer
From: desnoyer@apple.com (Peter Desnoyers)
Newsgroups: comp.lang.c
Subject: Re: sort
Message-ID: <2567@internal.Apple.COM>
Date: 29 Jun 89 19:47:41 GMT
Sender: usenet@Apple.COM
Organization: Apple Computer, Inc.
Lines: 49
References:<166@stsim.UUCP> <4700040@m.cs.uiuc.edu> <1989Jun29.155648.28819@utzoo.uucp>

In article <1989Jun29.155648.28819@utzoo.uucp> henry@utzoo.uucp (Henry 
Spencer) writes:
> In article <4700040@m.cs.uiuc.edu> kenny@m.cs.uiuc.edu writes:
> >>Does anyone have a sort routine (in memory sort) faster than 
quicksort, of
> >>so please e-mail it to me, thanks.
> >
> >Generally speaking, there isn't any; at least, no sort can be faster
> >than quicksort by more than a constant factor on a von Neumann machine. 
 
> 
> There is an important phrase missing here:  "in the worst case".  There
> are sort algorithms which are considerably better (linear rather than
> linear-log) in the *typical* case.  Or so I am told.

Actually, quicksort is O(N**2) worst case and O(NlogN) typical case. It can
be modified to be O(NlogN) in all cases, at the cost of a steep (but
constant) performance hit. (This modified qsort is only of academic 
interest.)

Depending on how you define "sort", the theoretical bound is either 
O(NlogN) or O(N). If you can only compare two elements and determine 
<, =, or >, then the limit is O(NlogN), which is achieved by quicksort. 
If you can express each element as a string of symbols in a finite 
alphabet you can do a radix sort - e.g. (for purely alphabetic data)

   rsort(list)
     if length of list == 1 return list
     else
        look at first letter of each element, put into 1 of 26 buckets
        rsort each bucket
        concatenate each sorted bucket, return resulting list
   end rsort

You can determine that the running time of this algorithm is proportional
to the sum of the lengths of the strings to be sorted - e.g. O(N) instead
of O(NlogN). However, you need to know the data representation - you 
can't write a general purpose radix sort.

The last thing to remember is that O(N**2) isn't always worse than 
O(NlogN). A good practical hash algorithm may take time kN**2, where 
k is really small, compared to a theoretically "better" algorithm 
that takes time KNlogN, where K is really big. Unless the size of your
data set approaches infinity (e.g. census data) the practical algorithm
will be better.

                                      Peter Desnoyers
                                      Apple ATG
                                      (408) 974-4469