Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!meaddata!gordon
From: gordon@meaddata.com (Gordon Edwards)
Newsgroups: comp.databases
Subject: Re: Outerjoin implementation?
Message-ID: <1422@meaddata.meaddata.com>
Date: 21 Sep 90 13:17:59 GMT
Sender: usenet@meaddata.com
Reply-To: meaddata!gordon@uunet.uu.net
Followup-To: comp.databases
Distribution: comp
Organization: Mead Data Central, Dayton OH
Lines: 102


In article <1990Sep20.161204.13014@tfic.bc.ca>, clh@tfic.bc.ca (Chris
Hermansen) writes:
|> In article <CIMSHOP!DAVIDM.90Sep19100915@uunet.UU.NET>
cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
|> >In article <PCG.90Sep18192004@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk
|> >(Piercarlo Grandi) writes:
|> >
|> >   It may be preferable to avoid the difficult issues posed by null values,
|> 
|> Yay!  Someone else who hates nulls!!
|> 
|> >   and the definition of an outer join operation, by attaching instead to
|> >   each relation (or to the underlying domains) a default value chosen by
|> >   the schema designer, and/or to make an outerjoin just a join where a
|> >   suitable matching default value is automatically supplied.


IF default values are appropriate, THEN they should be used.  If no default
exists or is inappropriate, then you are compromising the accuracy of your
model.


|> >
|> >This method *may* obviate the need for null records that occur in
outer joins,
|> >but it doesn't obviate the need for NULLs as attribute values.  In
fact, this
|> >is the method documented by Ingres for doing outer joins in Quel.  Using
|> >default values, though, has always had the problem of finding a
default value
|> >that will *never* be used as a live value.


Absolutely, as a matter of fact, I think the NULL is completely necessary for
the accurate representation of data.  If, for example, you have a integer
attribute called status, and you don't currently know the value, what
do you assign to status?  I guess you could look through the possible domain
of status and then pick a value outside that domain to represent unknown,
but then you have to heavily document this so maintenance people know what
you did, don't you think?


|> 
|> NULLs almost guarantee strange `gotchas', don't you think?  For instance,
|> instead of writing a statement like
|> 
|> 	update foo set bar = 37 where qqsv != 0
|>
|> the SQLer has to decide if
|> 
|> 	update foo set bar = 37 where qqsv != 0 or qqsv is null
|> 
|> is what s/he REALLY means.  Put another way,
|> 
|> 	update foo set bar = 37 where qqsv != 0 and qqsv isnt null
|> 
|> provides no extra degree of refinement over the first update
|> statement, even though
|> it appears to (for the casual observer, at least).


Well the above is really placing two distinct constraints on the update.
Constraint 1:  qqsv cannot have a known value of 0.
Constraint 2:  qqsv cannot have an unknown value.

If qqsv was guaranteed to be known, you can always declare it as *not null*.
Then, Constraint 2 would become unecessary.  If you suggest an arbitrary value
for unknowns, what would the statement be?

	ubdate foo set bar = 37 where qqsv != 0 and qqsv != -99999999

This is easier than using null?  I don't think so.


|> 
|> [text deleted...]
|> 
|> Note that I use Informix SQL, so your lack of mileage may vary on the above


????? The above statement makes no sense. ?????

Maybe you have difficulty undersanding the proper use of null.  It means, "I
don't know what this value is."  I'll use an example from Date, "... a 
shipment record might contain a null quantity value (we know that the 
shipment exists, but we do not know the quantity shipped)..."

If you find that you have lots of nulls, then maybe your schema is not
properly 
normalized.  Try checking for transitive dependencies, I have found that they
can contribute to an excessive amount of nulls.

Null is a very important semantic concept.  There have been many times I would
have loved to have null when writing 3GL applications (I had to rely on -999).
As for additional constraints, anytime you deal with an uknown you have take
extra measures to account for it.

What do you think?

--
Gordon S. Edwards             gordon@meaddata.com             Mead Data Central