Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!rutgers!apple!voder!cullsj!gupta From: gupta@cullsj.UUCP (Yogesh Gupta) Newsgroups: comp.databases Subject: Re: How do you... Summary: Use an RDBMS with a decent optimizer. Use cursors. Message-ID: <513@cullsj.UUCP> Date: 29 Mar 89 03:35:55 GMT References: <1707@anasaz.UUCP> Organization: Cullinet Software, San Jose, CA Lines: 64 In article <1707@anasaz.UUCP>, john@anasaz.UUCP (John Moore) writes: > Context: Relational Database in Transaction Processing (high performance) > > I have a relation that consists of a large (>500,000) set of last and > first names, and an application that needs to do partial searches of > the list. For example, the query might be to find names starting with > "mo". A further restriction is that it must be in alphabetical order. > Finally (and the only tough one for me) is that I only want the first > 20 names (for display on a screen). There may be 10,000 names that > meet the criteria. Also, the list is dynamic - names are continuously > being added and deleted by multiple processes. > > If the relation is indexed on last name, it should be possible to > do this, but how do I (in SQL in general, and Informix-Turbo in particular): > > (1) Make sure that the set of all tuples matching the query not be sorted > before I get the records (i.e. it is prohibitively expensive if the > database engine extracts a temporary file of the 10,000 > names, sorts them, and then presents them to my process). If your partial search on names contains known prefixes* (like the example you gave) and there is an index on that column, and the index is worth using**, any good optimizer will figure out that the sort is unnecessary. e.g. Table t1 has index on last_name. select * from t1 where last_name like 'smi%' order by last_name; should do the trick. > (2) Limit the query to the first 20 names - it is also too expensive > if I get 10,000 names and just throw away the last 9980 of them. In the C program, just declare a cursor for the above select, open it, and fetch the first 20 rows only. > (3) Later on, without holding open a cursor, get the NEXT 20 names. Why not keep the cursor open? After all, that is what lets the DBMS know where you were in your select. If you are concerned about locking the data that has been read, note that not keeping the cursor open implies that the first 20 rows may have changed by the time you look for the "next 20". If you are willing to live with this, you need a DBMS that supports the Cursor Stability mode of consistency. Then, only the last row fetched by this cursor will be locked (if the DBMS supports row-level locking, otherwise the page containing the last row fetched will be locked). > particular case of Informix-Turbo with C embedded SQL. Don't know whether Informix-Turbo optimizer eliminates the sort or not when it is not necessary. Also, do not know whether it supports cursor stability, row-level locking ... -- * If the partial searches are of the form (name = "%son") (known suffixes), I do not know of any system that will help you. ** An index may not be worth using given the statistics about the information, the physical clustering of the data, and the given query. -- Yogesh Gupta | If you think my company will let me Cullinet Software, Inc. | speak for them, you must be joking.