Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!zephyr.ens.tek.com!tekig5!philj
From: philj@tekig5.PEN.TEK.COM (Phil K Jansen)
Newsgroups: comp.sys.mac.programmer
Subject: Re: fastqsort.c -- 3 times faster than THINK C's qsort()
Message-ID: <7004@tekig5.PEN.TEK.COM>
Date: 26 Sep 90 00:46:41 GMT
References: <60123@iuvax.cs.indiana.edu> <1990Sep25.044455.3039@Neon.Stanford.EDU> <7739@ucdavis.ucdavis.edu>
Reply-To: philj@tekig5.PEN.TEK.COM (Phil K Jansen)
Organization: Tektronix, Inc., Beaverton,  OR.
Lines: 54

Haydn Huntley writes:
>One problem with qsort is that when the data is not in random order ...
>... worst case behavior of N*N/2 operations.  For N=1024 this can 
>take about 30 seconds, instead of the usual 0.8 seconds.
>[fastqsort doesn't do this because of random pivot]...

Marc T. Kaufman writes:
>... I really have to object [to the comment that random selection eliminates
> the 'worst case' run time.  It just changes where worst-case will be.]

A simple 'proof' that worst-case is still O(N^2) is to assume that random
pivots happen to be chosen in exactly the same place as before.  It *can*
happen, you know.

Lloyd Lim writes:
>Marc is correct ... [justification for some improvement with a random pivot]

>A better way is to pick the median of three elements ...

Haydn actually does use "Median of three" for his pivot (check the #ifdefs).
His documentation is a little out of date, though... :-).

Our Algorithms class demonstrated that the insertion sort was faster
for less than ~25 elements.  This is because you start to notice quickSort's
subroutine overhead for each element actually moved.

So what you really want to do is CHANGE THE SORT ALGORITHM for small lists.
This will improve worst-case behavior to near the insertion-sort level.

hyperQuickSort(arguments) {
	if(numElements > 25)
	    quickSort(arguments);
	        /* recursively calls hyperQuickSort() -- twice */
	else
	    insertionSort(arguments); /* good ol' dumb insertion sort */
}

I expect most of the speedup came from the asm{} statements for the swaps.

As Haydn said, he did get 3x improvement from optimizing the algorithm.  But
he also made it non-portable.  Of course we're all 68000 Mac users here, so
maybe portability is less of an issue.  I need it, though.

It's always worth verifying the algorithm you used is the one you want before
you start optimizing.  I should know -- I've written REALLY SLOW assembler
stuff.

Good luck.

Phil Jansen
-- 
                          If you repeat things often enough, they become true.
Phil Jansen               If you repeat things often enough, they become true.
philj@tekig5.pen.tek.com  If you repeat things often enough, they become true.