Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uflorida!haven!vrdxhq!daitc!daitc.daitc.mil From: jkrueger@daitc.daitc.mil (Jonathan Krueger) Newsgroups: comp.databases Subject: Re: How do you... Message-ID: <466@daitc.daitc.mil> Date: 28 Mar 89 21:13:52 GMT References: <1707@anasaz.UUCP> Sender: jkrueger@daitc.daitc.mil Reply-To: jkrueger@daitc.daitc.mil (Jonathan Krueger) Organization: Defense Applied Information Technology Center Lines: 60 In-reply-to: john@anasaz.UUCP (John Moore) In article <1707@anasaz.UUCP>, john@anasaz (John Moore) writes: >I have a relation that consists of a large (>500,000) set of last and >first names, and an application that needs to do partial searches of >the list. For example, the query might be to find names starting with >"mo". A further restriction is that it must be in alphabetical order. >Finally (and the only tough one for me) is that I only want the first >20 names (for display on a screen). There may be 10,000 names that >meet the criteria. Also, the list is dynamic - names are continuously >being added and deleted by multiple processes. > >If the relation is indexed on last name, it should be possible to >do this, but how do I (in SQL in general, and Informix-Turbo in particular): > >(1) Make sure that the set of all tuples matching the query not be sorted > before I get the records (i.e. it is prohibitively expensive if the > database engine extracts a temporary file of the 10,000 > names, sorts them, and then presents them to my process). No problem, just don't ask for the sort. The relational model specifically does not specify top-to-bottom ordering of rows. If a particular implementation guarentees retrieval in a default sorting as a side effect of a storage structure, retrievals may return rows in a sort order with no extra overhead But this is non-portable, of course. But you said you didn't care if they were sorted, you just wanted no extra cost for it. BTW, the database engine that can't sort 10K names efficiently is not worth bothering about. >(2) Limit the query to the first 20 names - it is also too expensive > if I get 10,000 names and just throw away the last 9980 of them. >(3) Later on, without holding open a cursor, get the NEXT 20 names. No way to express "first n matches" in relational calculus. Practically, when embedded query languages define how retrievals connect to the semantics of host language loops, there may be a safe way to break off the query, or to perform other work for each row retrieved, or for each group of rows retrieved. For example: #define BUNCH 20 /* how many rows to get at a time */ #define LIMIT 200 /* if this many or more returned, abort */ ## char last[20], first[20]; int ctr = 0; ## range of n is names ## retrieve (last = n.lname, first = n.fname) ## { if (++ctr > LIMIT) ## endretrieve if (ctr%BUNCH == 0) /* do stuff */ ## } Note that this works only if the "stuff" can be embedded inline. If you need instead the semantics of a function that gets called externally and returns the next 20 rows each time, this syntax isn't powerful enough. You need something that "saves context". In SQL such context savers are known as cursors. -- Jon --