Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!yale!cmcl2!kramden.acf.nyu.edu!brnstnd From: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) Newsgroups: comp.lang.misc Subject: Re: Lying Message-ID: <26463:Nov1919:54:1690@kramden.acf.nyu.edu> Date: 19 Nov 90 19:54:16 GMT References: <6097@lanl.gov> <4298@goanna.cs.rmit.oz.au> <246@smds.UUCP> Organization: IR Lines: 22 In article <246@smds.UUCP> rh@smds.UUCP (Richard Harter) writes: > The good news is that you don't need uniformity - almost any relatively > smooth distribution will do. The bad news is that many real life > sorting problems are distinctly not smooth. The point of what I call ``adaptive distribution sort'' is that MSD radix sorting can take advantage of all the same tricks as adaptive quadrature. But any MSD radix sort is linear in the number of bytes. > It should also be noted that bucket sorts, distribution sorts, math sorts, > et. al. are linear only within a range. [If the file size is large enough > the implicit assumptions about short data access times fail.] All immediately accessible storage, whether internal or external, is accessible within a fixed time. Any computational model that accurately reflects this fact will admit sorting linear in the number of bytes. In practice the large constant factor makes record movement desirable, so a properly designed merge-radix sort is one of the best choices for external data sets. ---Dan