Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!elroy.jpl.nasa.gov!jarthur!nntp-server.caltech.edu!laguna.ccsf.caltech.edu!daveg From: daveg@near.cs.caltech.edu (Dave Gillespie) Newsgroups: alt.sources Subject: Re: Sorting Algorithm Message-ID: Date: 25 Aug 90 00:57:26 GMT References: <15890@oolong.la.locus.com> Sender: news@laguna.ccsf.caltech.edu Followup-To: alt.sources.d Organization: California Institute of Technology Lines: 36 The binary-searching insertion sort is one of those things that seem like they ought to work, but somehow the data structures invoke Murphy's Law no matter which way you turn. Knuth discusses it in the case of sorting arrays (_Art_of_Computer_Programming_, Volume III, section 5.2.1) and notes that even though the search time is reduced to O(N log N), the time to shift around the array elements still dominates at O(N^2). Your suggestion of using linked lists eliminates the time to shift the array, but now the search is at least O(N^2) due to the list traversals. Both the array shifting and the pointer traversal overheads are likely to be either negliglble or (more likely) not negligible in the same situations. My guess is that Victor Kan's fix will indeed turn out to require shuffling O(log N) pointers at each step, but at enough cost per pointer that the overall method is again O(N^2). The problem is that, if the input is randomly distributed, each binary-search path may be wildly different from the previous one, so the list must still be traversed a significant distance most of the time. I hope you can prove me wrong, Victor! I remember seeing a clever technique in the June 1990 issue of "Communications of the ACM" for doing binary-search-like things with linked lists. They augmented the linked list structure with long-distance links. So every element points to the next element, some also point to the second element after them, fewer still also point to the fourth following element, and so on. They showed that if you build this structure probabilistically, it's still relatively efficient in both time and space (which can easily be traded against each other) and also easy to manipulate. Pretty cool stuff. Anyhow, I've addressed followups to alt.sources.d because that seemed a more appropriate place for the discussion. -- Dave -- Dave Gillespie 256-80 Caltech Pasadena CA USA 91125 daveg@csvax.cs.caltech.edu, ...!cit-vax!daveg