Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!sdd.hp.com!hplabs!nsc!pyramid!infmx!cortesi
From: cortesi@infmx.UUCP (David Cortesi)
Newsgroups: comp.databases
Subject: Re: Clarion Professional Developer
Keywords: Clarion review database new speed
Message-ID: <5051@infmx.UUCP>
Date: 23 Aug 90 15:40:05 GMT
References: <1562@wiznet.UUCP>
Reply-To: cortesi@infmx.UUCP (David Cortesi)
Organization: Informix, Menlo Park, Ca. U.S.A.
Lines: 53

In article <1562@wiznet.UUCP> wiz@wiznet.UUCP (Kean Johnston) writes:
>The product I wish to draw your attention to is the Clarion Professional
>Developer, by the Clarion Software Corporation, 150 East Sample Road, Pompano
>Beach, Florida 33064, Tel 305-785-4555, Fax 305-946-1650.
> [ ... ]
>As regards speed, well, Clarion REALLY impressed me here [...] The way we
>did the test was to create, in the respective languages, a program which would
>generate a database of 100 000 records, each of which consisted of 1 six
>character, randomly generated surname, and a 4 character first name field.
>The databases were indexed on the surname. To test the speed of the database
>indexing, we searched for the first surname beginning with an A, then the first
>Z, then the first N. This gives the two extremes of the index, and the middle.
>Paradox took over 13 seconds to get to the Z, and about 7 seconds to get to
>the first N. (I don't know how to time these things exactly). Clarion was ...
>wait for it ... KEYPRESS TIME! [...]
>I then created a 250 000 record database and the speed was NO DIFFERENT!

Seems to me you are making a lot out of a rather undemanding performance
test.  As usually implemented, an inverted index is a B-tree, and looking
up ANY given key should entail reading no more than d+1 disk pages,
where d is the depth of the tree and the "+1" is for reading the page
that contains the target row.

Lessee: speculate that a disk page is 512 bytes (common in DOS) and that
allowing for overhead there are roughly 40 six-byte keys per leaf page.
There would be about 2,500 leaf pages in your index. The next level would
have about 65 pages, then two, then the root, so d=4. The first probe for
an arbitrary row should take 5 disk reads, and all subsequent ones should
take either 3 or 2 (assuming the root and second level stay cached),
depending on whether the third-level index page is cached as well.
So the response time should consistently be the time to do 3 longish
seeks on your hard disk, or roughly "chk-chk-chk-blink."  I cannot imagine
what Paradox is doing that takes multiple seconds.
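
The page-count arithmetic above can be checked with a short Python sketch.
It assumes, as the estimate does, a uniform fanout of 40 keys per page at
every level; the function name btree_levels and its parameters are made up
for illustration and do not describe how any particular product lays out
its index.

```python
import math

def btree_levels(n_keys, fanout):
    """Return the page count at each index level, leaf level first, root last.

    Assumes every page (leaf or interior) holds exactly `fanout` entries,
    which is a simplification of how a real B-tree fills its pages.
    """
    pages = math.ceil(n_keys / fanout)
    levels = [pages]
    while pages > 1:
        pages = math.ceil(pages / fanout)
        levels.append(pages)
    return levels

levels = btree_levels(100_000, 40)
depth = len(levels)          # d, the number of index levels
cold_reads = depth + 1       # d index pages plus the page holding the row

print(levels)                # [2500, 63, 2, 1]
print(depth, cold_reads)     # 4 5
```

Under the same assumptions, 250,000 keys give levels of 6250, 157, 4 and 1
pages, so d is still 4, which is consistent with the quoted observation
that the larger database was no slower.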

To verify my guesses, I ran a comparable test using Informix OnLine
running on a Sun 3/80 (M68030, not a SPARC) with a local SCSI drive.
I made up a dbload file to your recipe (actually I used Unix tools to
make up the file from the contents of /usr/dict/words), 100572 unique
rows in all.

You didn't mention how long it took to load this table.  In my setup,
it took 4 min 15 sec to load the table without an index in place, and
an additional 3 min 40 sec to create the index afterward.

As I expected, the execution time for
	SELECT * FROM test_table WHERE lname MATCHES "A*"
(or "Z*" or any other initial letter) was between 1 and 2 seconds.
This is the time to find and display the first screen of 20 matching rows.
I do not intend this as bragging; I would have serious reservations about
any database that didn't do as well on this not-very-difficult test.
This Clarion sounds nice, but with respect to performance you should
maybe test it more thoroughly and compare it to better competitors.
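
For anyone who wants to repeat the quoted recipe, here is a minimal sketch
using Python's built-in sqlite3 module. This is a swapped-in stand-in, not
Informix OnLine, Clarion or Paradox: the database lives in memory, so it
involves no disk seeks and its timings are not comparable to the figures
above. The index name ix_lname, the helper names, and the range predicate
used in place of MATCHES are all assumptions made for illustration.

```python
import random
import sqlite3
import string
import time

random.seed(1990)  # fixed seed so repeated runs build the same table

def random_name(length):
    """Random uppercase string, standing in for the generated names."""
    return "".join(random.choices(string.ascii_uppercase, k=length))

conn = sqlite3.connect(":memory:")   # in-memory: no disk I/O is measured
conn.execute("CREATE TABLE test_table (lname TEXT, fname TEXT)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?)",
    ((random_name(6), random_name(4)) for _ in range(100_000)),
)
conn.execute("CREATE INDEX ix_lname ON test_table (lname)")

def first_screen(letter, rows=20):
    """Fetch the first `rows` rows whose lname starts with `letter`."""
    upper = chr(ord(letter) + 1)     # exclusive upper bound of the range
    cur = conn.execute(
        "SELECT lname, fname FROM test_table "
        "WHERE lname >= ? AND lname < ? ORDER BY lname LIMIT ?",
        (letter, upper, rows),
    )
    return cur.fetchall()

for letter in "AZN":                 # the two extremes, then the middle
    start = time.perf_counter()
    screen = first_screen(letter)
    elapsed = time.perf_counter() - start
    print(letter, len(screen), f"{elapsed:.6f} s")
```

The range predicate (lname >= 'Z' AND lname < '[') is used because it lets
the index serve the prefix search directly; a pattern-matching operator may
or may not use the index, depending on the engine and its settings.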