Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!udel!haven!adm!cmcl2!kramden.acf.nyu.edu!brnstnd From: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) Newsgroups: comp.lang.misc Subject: Re: Sorting bounds Message-ID: <6327:Nov2022:30:2490@kramden.acf.nyu.edu> Date: 20 Nov 90 22:30:24 GMT References: <1990Nov20.202143.10746@craycos.com> Organization: IR Lines: 40 Skip if you know why radix sort does not observe the n log n bound. In article <1990Nov20.202143.10746@craycos.com> jrbd@craycos.com (James Davies) writes: > Sorry to inject some mathematics into this lovely flamefest, but... > The bound on sorting is O(N log N) comparisons because you must be able to > generate all of the N! possible permutations of the N inputs. That bound is only valid if the only permitted operation is comparison. > N! grows approximately as N**N (Stirling's approximation, I believe it's > called). No, n does not grow as n^n. Stirling's (first) approximation is that n! is about n^n/e^n times the square root of 2 pi n. > Radix or bucket sorts may appear to be linear, but in fact they are also > N log N in comparisons. (Picking one of K buckets requires a K-way > comparison.) Picking one of K buckets does not require a K-way comparison. Even if it did, that would not be relevant. Radix sort is linear in the number of bytes, which can be o(n log n) if there are many duplicate keys. You said you would ``introduce some mathematics into this lovely flamefest.'' But mathematics is a tautology. Asserting that a false statement is ``in fact'' true does not make it mathematics. [ on the supposed fact that radix sorting is n log n ] > This is most evident when you try to make your bucket sort scale up to > arbitrary sizes (like, say, sorting a trainload of tapes...) The scale factor between sorting 100 of Ritchie's boxcars and just 1 of them is almost exactly 100---internal sorting by far dominates any merging. So for practical purposes (even in such a ridiculously large situation) sorting is linear in the number of bytes. And for theoretical purposes, since any immediately accessible byte of memory is accessible in constant time under any realistic computational model, sorting is also linear in the number of bytes. ---Dan