Path: utzoo!censor!geac!torsqnt!news-server.csri.toronto.edu!rutgers!uwm.edu!rpi!sci.ccny.cuny.edu!phri!cmcl2!kramden.acf.nyu.edu!brnstnd
From: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)
Newsgroups: comp.lang.misc
Subject: Re: Sorting bounds
Message-ID: <6337:Nov2913:02:3690@kramden.acf.nyu.edu>
Date: 29 Nov 90 13:02:36 GMT
References: <37256@nigel.ee.udel.edu>
Organization: IR
Lines: 29

In article peter@ficc.ferranti.com (Peter da Silva) writes:
> In article rang@cs.wisc.edu (Anton Rang) writes:
> > In article <37256@nigel.ee.udel.edu> carroll@cis.udel.edu (Mark Carroll) writes:
> > >Saying that sorting is "linear in the number of bytes" is truly foolish.
> > >It's nothing but a deceptive way of saying "Sorting takes time O(n lg n)
> > >in the number of keys".
> > This assumes that all keys are distinct.
> If they're not you can take an initial O(n) pass sticking all the equal
> keys together.

How, pray tell, do you plan to do that? The only way I can think of is
to sort the keys (which takes time O(b), b the number of bytes).

> Now your problem becomes sorting (n') new keys, which is
> again O(n' log n').

Ah. I understand. It's heretical to take bytes as the unit of measure,
because the bound O(b) on sorting time is too precise, and computer
scientists hate precision.

Instead, it's better to write sorting in terms of a different
quantity---as Jim says, n, the number of keys. Of course, sorting may be
faster than n log n, or slower; and n has no simple relation to b in
general.

But that's still not imprecise enough for you. You want to take n', the
number of distinct keys. Now you have an n' log n' bound which is
correct even less often than n log n, and besides that n' is a pain to
compute.

This seems to be typical of progress in computer ``science.''

---Dan
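
P.S. For concreteness, here is roughly what an O(b) bound looks like in
code: a minimal sketch of an LSD radix sort, assuming fixed-width keys
that fit in 32 bits. The key width and the names are illustrative
choices, not anything from the thread. It makes one stable counting
pass per byte position, and each pass touches every key once, so the
total work is proportional to b, the number of key bytes. Note that
equal keys come out adjacent as a side effect of the sort itself.

  /* Sketch: LSD radix sort on fixed-width keys.  The 4-byte key
     width and all names here are illustrative assumptions, not
     anything from the thread above. */

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  #define KEY_BYTES 4   /* assume each key fits in 4 bytes */

  /* Sort n keys: one stable counting pass per byte position, least
     significant byte first.  Total work is proportional to
     n * KEY_BYTES = b, the number of key bytes, plus a constant
     256-counter overhead per pass. */
  void radix_sort(unsigned long *a, size_t n)
  {
      unsigned long *tmp = malloc(n * sizeof *a);
      size_t count[256];
      size_t i, total, old;
      int pass, c;

      if (tmp == NULL)
          return;

      for (pass = 0; pass < KEY_BYTES; pass++) {
          int shift = 8 * pass;

          /* count how many keys have each value in this byte */
          memset(count, 0, sizeof count);
          for (i = 0; i < n; i++)
              count[(a[i] >> shift) & 0xff]++;

          /* turn counts into starting offsets */
          total = 0;
          for (c = 0; c < 256; c++) {
              old = count[c];
              count[c] = total;
              total += old;
          }

          /* stable scatter into tmp, then copy back */
          for (i = 0; i < n; i++)
              tmp[count[(a[i] >> shift) & 0xff]++] = a[i];
          memcpy(a, tmp, n * sizeof *a);
      }
      free(tmp);
  }

  int main(void)
  {
      unsigned long a[] = { 90, 13, 2, 36, 29, 11, 13, 2 };
      size_t n = sizeof a / sizeof a[0], i;

      radix_sort(a, n);   /* duplicates (2, 13) end up adjacent, free */
      for (i = 0; i < n; i++)
          printf("%lu ", a[i]);
      putchar('\n');
      return 0;
  }

The fixed-width case is just the simplest illustration; for
variable-length keys an MSD variant is the usual route, and the bound
stays proportional to the bytes actually examined.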