Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!apple!apple.com!desnoyer From: desnoyer@apple.com (Peter Desnoyers) Newsgroups: comp.lang.c Subject: Re: sort Message-ID: <2567@internal.Apple.COM> Date: 29 Jun 89 19:47:41 GMT Sender: usenet@Apple.COM Organization: Apple Computer, Inc. Lines: 49 References:<166@stsim.UUCP> <4700040@m.cs.uiuc.edu> <1989Jun29.155648.28819@utzoo.uucp> In article <1989Jun29.155648.28819@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes: > In article <4700040@m.cs.uiuc.edu> kenny@m.cs.uiuc.edu writes: > >>Does anyone have a sort routine (in memory sort) faster than quicksort, of > >>so please e-mail it to me, thanks. > > > >Generally speaking, there isn't any; at least, no sort can be faster > >than quicksort by more than a constant factor on a von Neumann machine. > > There is an important phrase missing here: "in the worst case". There > are sort algorithms which are considerably better (linear rather than > linear-log) in the *typical* case. Or so I am told. Actually, quicksort is O(N**2) worst case and O(NlogN) typical case. It can be modified to be O(NlogN) in all cases, at the cost of a steep (but constant) performance hit. (This modified qsort is only of academic interest.) Depending on how you define "sort", the theoretical bound is either O(NlogN) or O(N). If you can only compare two elements and determine <, =, or >, then the limit is O(NlogN), which is achieved by quicksort. If you can express each element as a string of symbols in a finite alphabet you can do a radix sort - e.g. (for purely alphabetic data) rsort(list) if length of list == 1 return list else look at first letter of each element, put into 1 of 26 buckets rsort each bucket concatenate each sorted bucket, return resulting list end rsort You can determine that the running time of this algorithm is proportional to the sum of the lengths of the strings to be sorted - e.g. O(N) instead of O(NlogN). However, you need to know the data representation - you can't write a general purpose radix sort. The last thing to remember is that O(N**2) isn't always worse than O(NlogN). A good practical hash algorithm may take time kN**2, where k is really small, compared to a theoretically "better" algorithm that takes time KNlogN, where K is really big. Unless the size of your data set approaches infinity (e.g. census data) the practical algorithm will be better. Peter Desnoyers Apple ATG (408) 974-4469