Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!asuvax!anasaz!john
From: john@anasaz.UUCP (John Moore)
Newsgroups: comp.databases
Subject: Re: Simple SQL question
Keywords: SQL
Message-ID: <926@anasaz.UUCP>
Date: 22 Nov 89 05:05:45 GMT
References: <98813@ti-csl.csc.ti.com> <1989Nov19.200834.266@mrspoc.Transact.COM> <1217@hpuviea.UUCP>
Reply-To: john@anasaz.UUCP (John Moore)
Organization: Anasazi Inc, Phoenix AZ
Lines: 81

In article <1217@hpuviea.UUCP> gernot@hpuviea.UUCP (Gernot Kunz) writes:
]itkin@mrspoc.Transact.COM (Steven M. List) writes:
]
]>hemphill@csc000.csc.ti.com (Charles Hemphill) writes:
]
]>>	List the 10 oldest employees.
]
]>	select name, birthdate
]>	from employee
]>	order by birthdate desc
]>and pipe the result through "head -10"?
]>Otherwise, there's no way to set a limit in SQL.
]
]Oops, what a tricky UNIX hack to circumvent an obviously
]missing SQL functionality. In ORACLE you could do THIS:
]
]       create table temptable(name ..., birthdate ...);
]
]       insert into temptable
]       select name, birthdate
]       from employee
]       order by birthdate desc;
]
]       select name,birthdate
]       from temptable
]       where rownum <= 10 ;
]
]and remain in pure SQL. It requires a temporary table, though.

In practical terms, however, this means that all rows were
read from the database, inserted into the temptable, after which
ten rows were extracted. If one is concerned about per-query
performance (as we are when we are doing 20 queries
per second!), this doesn't do the job.

Try this for an even harder one:

Table Frod:
    Location:
    Name:
    Day_of_Birth:

Query: Give me the first ten names by date of birth where
  location is one of a set of locations which is a small subset
  of all the locations.

Does anyone know how do do this one (you may choose any index you want)
in either SQL or embedded SQL without the RDBMS having to do a lot of
extra work? One way I know is to have N cursors open, one per
location, and then do a "merge" on them. This, however, is hard to
generalize in embedded C (especially in Informix, which we are using).
Even then, RDBMS' tend to do read-ahead on cursors, causing extra I/O. 
The other approach I thought of is to write a query like the one below, 
and hope the optimizer is smart enough to make it efficient.

Does someone have an optimizer which would generate a GOOD strategy for:
  SELECT Name, Location 
    FROM Frod 
    WHERE Location in ( "Phoenix", "Atlanta", "Denver", "Seattle" )
     AND  Day_of_Birth >= 12/31/47
    ORDER BY Day_of_Birth ;

Note: 500 locations, 1000 names at each, days distributed unevenly
  over 1 year from 6/30/46 to 6/30/47.

CREATE INDEX ...any way you want, but keep in mind that this data is 
   dynamic, with lots of insertions to the table.

Definition of good: performance >= what could be achieved by doing it
  by hand using low level operations.

If your optimizer generates a nice strategy for this, please
post what the strategy would be - especially if there are
neat tricks. I am assuming that the RDBMS has RAM cache,
logging etc. as are used in OLTP versions such as Oracle 6
or Informix Turbo or others.
-- 
John Moore (NJ7E)           mcdphx!anasaz!john asuvax!anasaz!john
(602) 861-7607 (day or eve) long palladium, short petroleum
7525 Clearwater Pkwy, Scottsdale, AZ 85253
The 2nd amendment is about military weapons, NOT JUST hunting weapons!