Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!sdd.hp.com!hplabs!nsc!pyramid!infmx!cortesi
From: cortesi@infmx.UUCP (David Cortesi)
Newsgroups: comp.databases
Subject: Re: Clarion Professional Developer
Keywords: Clarion review database new speed
Message-ID: <5051@infmx.UUCP>
Date: 23 Aug 90 15:40:05 GMT
References: <1562@wiznet.UUCP>
Reply-To: cortesi@infmx.UUCP (David Cortesi)
Organization: Informix, Menlo Park, Ca. U.S.A.
Lines: 53

In article <1562@wiznet.UUCP> wiz@wiznet.UUCP (Kean Johnston) writes:
>The product I wish to draw your attention to is the Clarion Professional
>Developer, by the Clarion Software Corporation, 150 East Sample Road, Pompano
>Beach, Florida 33064, Tel 305-785-4555, Fax 305-946-1650.
> [ ... ]
>As regards speed, well, Clarion REALLY impressed me here [...] The way we
>did the test was to create, in the respective languages, a program which would
>generate a database of 100 000 records, each of which consisted of 1 six
>character, randomly generated surname, and a 4 character first name field.
>The databases were indexed on the surname. To test the speed of the database
>indexing, we searched for the first surname beginning with an A, then the first
>Z, then the first N. This gives the two extremes of the index, and the middle.
>Paradox took over 13 seconds to get to the Z, and about 7 seconds to get to
>the first N. (I don't know how to time these things exactly). Clarion was ...
>wait for it ... KEYPRESS TIME! [...]
>I then created a 250 000 record database and the speed was NO DIFFERENT!

Seems to me you are making a lot out of a rather undemanding performance
test. As usually implemented, an inverted index is a B-tree, and to look
up ANY given key should entail reading no more than d+1 disk pages, where
d is the depth of the tree and the "+1" is for reading the page that
contains the target row.

Lessee: speculate that a disk page is 512 bytes (common in DOS) and that
allowing for overhead there are roughly 40, 6-byte keys per leaf page.
There would be about 2,500 leaf pages in your index. The next level would
have about 65 pages, then two, then the root, so d=4. The first probe for
an arbitrary row should take 5 disk reads, and all subsequent ones should
take either 3 or 2 depending on whether an intermediate page is cached.
So the response time should consistently be the time to do 3 longish seeks
on your hard disk, or roughly "chk-chk-chk-blink." I cannot imagine what
Paradox is doing that takes multiple seconds.

To verify my guesses, I ran a comparable test using Informix OnLine
executing in a Sun 3/80 (M68030, not a SPARC) with a local SCSI drive.
I made up a dbload file to your recipe (actually I used unix tools to
make up the file from the contents of /usr/dict/words), 100572 unique
rows in all.

You didn't mention how long it took to load this table. In my setup, it
took 4 min 15 sec to load the table without an index in place, and an
additional 3 min 40 sec to create the index afterward.

As I expected, the execution time for
	SELECT * FROM test_table WHERE lname MATCHES "A*"
(or "Z*" or any other initial letter) was between 1 and 2 seconds. This
is the time to find and display the first screen of 20 matching rows.

I do not intend this as bragging; I would have serious reservations about
any database that didn't do as well on this not-very-difficult test.

This Clarion sounds nice, but with respect to performance you should maybe
test it more thoroughly and compare to better competitors.
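
For anyone who wants to check the page arithmetic above, here is a minimal
Python sketch. It is an illustration only: btree_levels is a hypothetical
helper, not anything from Informix, Clarion, or Paradox, and it assumes the
same fanout of 40 entries per page at every level (the speculated figure for
leaf pages, reused for interior pages as the estimate above implicitly does).

```python
import math


def btree_levels(n_keys, entries_per_page):
    """Return the page count at each index level, leaf level first, root last.

    Assumption: every level packs the same number of entries per page.
    """
    pages = math.ceil(n_keys / entries_per_page)
    levels = [pages]
    # Keep adding parent levels until a single root page remains.
    while pages > 1:
        pages = math.ceil(pages / entries_per_page)
        levels.append(pages)
    return levels


for n_keys in (100_000, 250_000):
    levels = btree_levels(n_keys, 40)
    depth = len(levels)
    # Worst case for a cold lookup: one read per index level, plus one
    # read for the data page that holds the target row.
    print(n_keys, levels, "d =", depth, "cold reads =", depth + 1)

# 100000 [2500, 63, 2, 1] d = 4 cold reads = 5
# 250000 [6250, 157, 4, 1] d = 4 cold reads = 5
```

Under these assumptions the 250 000 record index is still four levels deep,
which would be consistent with the lookup speed being "NO DIFFERENT" at the
larger size.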