Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!yale!cmcl2!kramden.acf.nyu.edu!brnstnd From: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) Newsgroups: comp.lang.misc Subject: Re: A very brief history of optimal sorting methods Message-ID: <5293:Nov1518:36:0490@kramden.acf.nyu.edu> Date: 15 Nov 90 18:36:04 GMT References: <4440:Nov1405:06:2590@kramden.acf.nyu.edu> <5948@lanl.gov> Organization: IR Lines: 48 In article <5948@lanl.gov> jlg@lanl.gov (Jim Giles) writes: > From article <4440:Nov1405:06:2590@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein): > > [...] > > Sorting is linear in the number of bytes being sorted. The length of the > > keys is absolutely, entirely irrelevant to this conclusion. > Radix sorting is linear in the number of bytes perhaps. No, not ``perhaps.'' Sorting is linear in the number of bytes being sorted. > It is also > linear in the number of keys to be sorted _times_ the length of each > key. No, it isn't. That is an upper bound which is rarely true in practice. Please don't say anything if you don't know what you're talking about. > The speed of sorting techniques is _uniformly_ given by _all_ > the literature as a function of the number of _keys_ to be sorted, > _not_ as a function of the number of bytes sorted. You are wrong. Here's a counterexample: In 1987, in the article by Tarjan and somebody else ``inventing'' MSD radix sort, the time bounds were given in terms of the number of bytes. Less precise bounds were given in terms of the number of keys under various restrictions. The references in that article contain lots more counterexamples. Please don't say anything if you don't know what you're talking about. > When the number > of _bytes_ sorted is greater than N log N (where N is the number of > _keys_), it is often better to use some other sort. Rarely. The number of bytes in the termcap file here is about five times bigger than n lg n, where n is the number of lines. It is forty times bigger than n log_256 n. Yet my radix sort takes less time on it than the supposedly optimized merge sort that appeared in the source groups a couple of years ago. Please don't say anything if you don't know what you're talking about. > If a whole key (like a 64-bit integer for example) can be compared in > the same time as it takes to compare your little 8-bit chunk of it, Radix sorts do not compare the bytes they are sorting. Please don't say anything if you don't know what you're talking about. ---Dan