Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!lll-winken!ames!pasteur!stinson!mao
From: mao@stinson (Mike Olson)
Newsgroups: comp.databases
Subject: Re: INFORMIX SQL functions
Keywords: informix sql
Message-ID: <9227@pasteur.Berkeley.EDU>
Date: 27 Nov 90 02:15:24 GMT
References: <1990Nov20.221658.19496@informix.com> <8335@pbhyf.PacBell.COM> <1990Nov22.210621.7930@tfic.bc.ca> <19917@oolong.la.locus.com>
Sender: news@pasteur.Berkeley.EDU
Reply-To: mao@postgres.Berkeley.EDU (Mike Olson)
Followup-To: comp.databases
Organization: University of California, Berkeley
Lines: 64


In <19917@oolong.la.locus.com>, jfr@locus.com (Jon Rosen) discusses the
need for complex functions generally, and transcendentals specifically,
inside his database engine.  he points out that sql doesn't provide much
that's useful for serious number-crunchers.  He then says

> Is there hope?  Maybe... Mike Stonebraker at Berkeley has proposed
> object-oriented extensions to SQL (no, not making SQL a full OODMBS,
> just some simple extensions) that would allow user defined data types
> and user defined methods or functions on those data types.

which is right; stonebraker's postgres system uses the postquel query
language, which supports user-defined types and procedures.  we've gone
to a fair amount of trouble to guarantee closure, so functions can take
tuples and return tuples (or sets of tuples, in a future release).  this
is to keep the theoreticians out there from foaming at the mouth.  similar
extensions are possible in other query languages.

it isn't very hard to come up with syntax to permit invocations of user
functions.  what is hard (and what i suspect tripped up the sql committee)
is how to store functions, load them, provide strict access controls (very
hard if the user's function runs in the engine's address space), check
types rigorously, and so on.  letting the user write code that the db
engine executes is a major value to add to a system, and that sort of
value doesn't come cheaply.

consider this example from rosen's message:

>    SELECT * FROM BLIVET
>     GROUP BY X
>     HAVING SUM(1 - EXP(-0.023 * A)) > .5  (this may not make sense but 
>                                           it is a good example anyway)

this is a monster to plan and optimize.  you have no idea what this
mysterious EXP function is going to do, assuming that a user wrote
it for you.  it may do a sequential scan of the ten biggest tables
in your db for every tuple you call it with; it may just return
immediately.  it may compute ackerman's function on tuesdays.  you'd
like to know whether you can cache the return value, so you don't
have to call it every time.  does it have side effects?  obviously,
if you're going to do this right, you have to solve a lot of programming
language problems, and a lot of database system problems.
 
now it's soapbox time, so you might want to skip this paragraph:  the
reason that sql isn't extensible is that sql is designed to be the least
common denominator among query languages.  its syntax and semantics were
designed by a committee made up of the major players in the market.  if
some set of them didn't feel they could implement a feature (or didn't
want to give their competitors, who could implement it faster, any
competitive advantage), then the feature was excluded from the language.
those of us in academia can all wear white dresses when we get married,
because we choose our query languages for their power and expressiveness,
and not for the fact that they come in blue boxes.  all those files out
there with the group by's and the having's and the correlated subqueries
drive me nuts.

of course, i understand the trade-off.  i like postquel a lot, but that
means that i have a whole lot of scripts that aren't going to run on
anybody else's database system.

					mike olson
					postgres research group
					uc berkeley
					mao@postgres.berkeley.edu