Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!uwm.edu!psuvax1!psuvm!awiwuw11!nn1 From: NN1@awiwuw11.wu-wien.ac.at Newsgroups: comp.theory Subject: Re: lower-bound for searching sorted sequences (was: binary search) Message-ID: <91069.155112NN1@awiwuw11.wu-wien.ac.at> Date: 11 Mar 91 00:34:16 GMT References: <56531@eerie.acsu.Buffalo.EDU> <56707@eerie.acsu.Buffalo.EDU> <8274@ucdavis.ucdavis.edu> Organization: Wirtschaftsuniversitaet Wien, Vienna, Austria Lines: 44 It was already pointed out that the best lower bound for the complexity depends on the model of computation used. One possibility is to regard the array components as abstract atomic objects, and that the only possible operations are comparisons and assignments of objects as a whole; in this case the lower bound for the number of comparisons is O(log n). The algorithm of average complexity O(log log n) assumes some additional structure (arithmetic for integers or accessing individual components of lexically ordered strings). In this case it may be more natural, however, to count the number of bit operations rather than the number of comparisons, and then the lower bound, even for the average complexity, will again be O(log n), unless the number of DIFFERENT array entries is of much lower order than n. Assume, for instance, that the array entries are strings of length l, ordered lexicographically, and that there only two different symbols. (This includes also the case of integers in binary representation, or strings over a larger alphabet, provided that the ordering of the symbols coincides with lexicographical one of their representation as bit combinations.) Now, the comparison of two such strings can be done within constant average complexity, but the same is not longer true for the arithmetical operations required for interpolation. This complexity depends also on l, and l must be >= log n, if all array entries are different.The proof that the number of bit operations cannot be smaller than O(log n) is the same as in the case where the array entries are considered unstructured. It must be admitted, though, that it depends on the assumption that strings are distributed evenly. In fact, the O(log n) barrier can be broken, if the probability distribution of the inputs is rather peculiar, for instance, if the element to be searched for is almost always smaller than the smallest array element. The whole situation is analogous to that for sorting: In this case the lower bound is O(n log n), if the array elements are assumed to be unstructured. Without this assumption the RADIXSORT-algorithm has linear complexity, but only if the number of bits required to represent the array entries is considered constant. The latter assumption cannot be maintained if we let n approach infinity, and if we assume that all (or most) entries are different. All this does not mean that the algorithms requiring an asymptotically low number of comparisons can not be superior in practice to those where the number of comparisons is O(log n) or O(n log n) respectively. My only point is that from a theoretical point of view the bounds log n or n log n respectively do remain valid as far as the number of bit operations is concerned. Heinrich Rolletschek RISC-Linz K310670@AEARN