Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!cs.utexas.edu!samsung!rex!wuarchive!uunet!isc-br!hawk!wddami!wayned
From: wayned@wddami.spoami.com (Wayne Diener)
Newsgroups: comp.lang.c
Subject: Re:  Sorting Algorithm
Message-ID: <wayned.0936@wddami.spoami.com>
Date: 23 Aug 90 15:28:11 GMT
References: <wayned.0932@wddami.spoami.com> <3612@goanna.cs.rmit.oz.au>
Sender: uueagle@hawk.isc-br.com (Eagle Proj UUCP login)
Lines: 51

>In article <3612@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>In article <wayned.0932@wddami.spoami.com>,
>wayned@wddami.spoami.com (Wayne Diener) writes:

[ text deleted ]

>The C code presented is inordinately complicated, which makes it hard to
>see what is going on.  Basically, methods of inserting something into a
>sequence using binary search fall foul of one of two problems:
>	- if you use an array, you can *get* anywhere fast,
>	  but it takes O(N) time to move other stuff out of the
>	  way to *insert* anything
>	- if you use a linked list, you can *insert* fast,
>	  but it takes O(N) to *get* anywhere by following links.
>(There is a compromise position:  use a binary tree.  Binary trees make
>an excellent representation of sequences, and handle insertion well.
>That would yield a variant of TreeSort.)

Sorry if the code seems complicated.  As I mentioned, I was a _rank_ amateur
at the time.  (Hopefully, I've moved up to _full_ amateur status by now.)
I find the code easy to understand, but perhaps that's because I wrote it.

>
>Wayne Diener's version of the algorithm runs into the second problem.
>To be sure, it is doing O(NlgN) *comparisons*, but in order to get anywhere
>it has to follow pointers O(N**2) times.  The bottle-neck is the code
>
>	for (i = 0; i < middle; i++) {
>	    current = up ? current->prev_word : current->next_word;
>	    if (current == sortbound) { test = 1; break; }
>	}
>
>So it is an O(N**2) member of the family of insertion sorts.

A couple of points here:

1)  The _worst_ case number of pointer movements for each search is:
    N/2 + N/4 + N/8 + .... = N, and it has to do it N times, yielding
    O(N**2).  However, I believe you'll find that the average case
    will yield O(K * log base 2 (N)) (where K is probably about 1.4)
    number of pointer movements per element.

2) Although it yields only a constant improvement (and doesn't change
   the Big O class), the time cost of a pointer movement is (as a 
   normal rule) _substantially_ less than the time cost of the comparison. 
   If the comparison is quite costly, then for reasonable numbers of N
   (say 10K to 50K max?) the efficiency of the sort would still be
   approximately O(N * Log(N)) _even IF the pointer movement is O(N**2)_.

 
--