Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!yale!cmcl2!lanl!jlg From: jlg@lanl.gov (Jim Giles) Newsgroups: comp.lang.misc Subject: Re: A very brief history of optimal sorting methods Message-ID: <6056@lanl.gov> Date: 15 Nov 90 20:11:56 GMT References: <5293:Nov1518:36:0490@kramden.acf.nyu.edu> Organization: Los Alamos Natl Lab, Los Alamos, N.M. Lines: 82 From article <5293:Nov1518:36:0490@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein): > [... ] > Please don't say anything if you don't know what you're talking about. A statement that applies to _you_ far more often than to me. > [...] >> The speed of sorting techniques is _uniformly_ given by _all_ >> the literature as a function of the number of _keys_ to be sorted, >> _not_ as a function of the number of bytes sorted. > > You are wrong. Here's a counterexample: In 1987, in the article by > Tarjan and somebody else ``inventing'' MSD radix sort, [...] Oh wow! A _single_ paper (which you can't even give a proper citation of) which goes against the grain of the _whole_ rest of the industry. Big deal. > [...] Less precise bounds were > given in terms of the number of keys under various restrictions. [...] So, even in your supposed counter-example paper the authors _do_ give the speed in terms of the number of _keys_. This, as I pointed out before, _is_ the standard practice of the industry. Everyone in the industry will _assume_ (correctly - for all but Dan) that a quoted sort speen is in terms of the number of _keys_ - unless _explicitly_ stated otherwise. I suspect that your cited paper _does_ make such an explicit caveat. _YOUR_ original claims _did_not_. > [...] > Please don't say anything if you don't know what you're talking about. Again, heed your own advice. > [...] > Rarely. The number of bytes in the termcap file here is about five times > bigger than n lg n, where n is the number of lines. It is forty times > bigger than n log_256 n. Yet my radix sort takes less time on it than > the supposedly optimized merge sort that appeared in the source groups a > couple of years ago. Woopee! It is my experience with _all_ "supposedly optimized" UNIX utilities that they are incredibly slow. The fact that you can use a slow technique and still beat a UNIX utility is hardly a surprise. Further, sorting a termcap file is hardly a very interesting test: text sorting of that kind is one of the _few_ places where I would have agreed with you about the value of radix sorts. You did not suggest radix sorts in that context. Instead, you recommended the sort for a task which it is _not_ suited to. You need to re-read Knuth: he discusses the trade-offs of different sorting techniques. He does _not_ find (as you seem to have concluded) that _one_ technique is _always_ the best choice. > [...] > Please don't say anything if you don't know what you're talking about. Again, please follow this advice yourself. > [...] >> If a whole key (like a 64-bit integer for example) can be compared in >> the same time as it takes to compare your little 8-bit chunk of it, > > Radix sorts do not compare the bytes they are sorting. [...] Yes, I figured you'd seize on this as soon as I posted it. I typed in a hurry and should have said "index with" rather than "compare." It is interesting that you should consider this to be an important point since it doesn't help your case at all. The fact remains that comparing multi-byte objects is, on most machines, just as fast as address calculations using one byte indices. Indeed, the address calculation uses the same functional unit as the multi-byte compare on most machines. Addresses are multi-byte on most machines. > [...] Please don't say > anything if you don't know what you're talking about. Again, take your own advice. Or better yet, I don't care about whether you follow this advice or not. Just QUIT making insulting remarks about others when those _same_ remarks apply more appropriately to _you_. J. Giles