Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!usc!jarthur!elroy.jpl.nasa.gov!hacgate!ashtate!dbase!awd
From: awd@dbase.A-T.COM (Alastair Dallas)
Newsgroups: comp.databases
Subject: Re: clipper internals
Summary: Smaller keys
Keywords: clipper dbase index
Message-ID: <441@dbase.A-T.COM>
Date: 26 Feb 90 17:49:41 GMT
References: <13221@cbnewse.ATT.COM>
Distribution: na
Organization: Ashton Tate Development Center Glendale, Calif.
Lines: 26

Congratulations! You've managed to hit on precisely what my management's
lawyers mean when they speak of "proprietary information." Sorry for being
flip, but this is in reply to mail that asked question after question
pertaining to the exact nature of Clipper (and by extension dBASE)
operations and there's just no way I can be forthcoming.

I can say that the main speed cost in any PC database system is reading
the disk. Nothing else (string compare vs numeric compare) comes close
to affecting the bottom line speed so profoundly as being able to avoid
"hitting the disk" even once. Therefore, by keeping your index keys small
you allow the system to pack more of them into a fixed-length block
(dBASE IV supports adjustable block sizes), which ultimately reduces the
number of disk reads (especially for SKIP operations). If you want to get
really tricky, write code that hashes your key values into a 4-byte long
and index on a UDF that uses this value to build a 4-byte Character string.
That'll let you SKIP 40 times or so without reading another index node.

The other thing I _can_ say is that you might look at Knuth's "Art of
Computer Programming," Vol. 3: Sorting and Searching. It describes the
operation of Clipper's and dBASE's indexing in sufficient abstraction so
as not to perturb the lawyers.

Hope it helps.

/alastair/
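
To put rough numbers on the small-key advice in the post above, here is a minimal Python sketch (Python rather than dBASE or Clipper code, purely for illustration). The 512-byte block size and the 8 bytes of per-key overhead are assumptions chosen for the arithmetic, not the actual Clipper or dBASE index layout, which the post deliberately does not describe. The names keys_per_block and hash_key are hypothetical, and CRC-32 is just one convenient way to get a 4-byte hash; the post does not name a hash function.

```python
import struct
import zlib

# Assumed figures for illustration only; not the real index file layout.
BLOCK_SIZE = 512      # assumed size of one index block, in bytes
ENTRY_OVERHEAD = 8    # assumed bookkeeping bytes stored alongside each key


def keys_per_block(key_len, block_size=BLOCK_SIZE, overhead=ENTRY_OVERHEAD):
    """Return how many keys of key_len bytes fit in one index block."""
    return block_size // (key_len + overhead)


def hash_key(value):
    """Hash a key string to a fixed 4-byte value (CRC-32, big-endian)."""
    checksum = zlib.crc32(value.encode("latin-1")) & 0xFFFFFFFF
    return struct.pack(">I", checksum)


if __name__ == "__main__":
    # Compare a 4-byte hashed key with a 30-byte character key.
    for key_len in (4, 30):
        print(key_len, "byte key:", keys_per_block(key_len), "keys per block")
    print("hashed key bytes:", hash_key("SMITH, JOHN").hex())
```

Under these assumed numbers a 4-byte key fits 42 times in a block against 13 times for a 30-byte key, which is in line with the "40 times or so" figure. Two caveats apply to any hash of this kind: distinct keys can collide, and the index ends up in hash order rather than the original key order, so the trick fits exact-match lookups better than ordered browsing.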