Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!meaddata!gordon From: gordon@meaddata.com (Gordon Edwards) Newsgroups: comp.databases Subject: Re: Outerjoin implementation? Message-ID: <1422@meaddata.meaddata.com> Date: 21 Sep 90 13:17:59 GMT Sender: usenet@meaddata.com Reply-To: meaddata!gordon@uunet.uu.net Followup-To: comp.databases Distribution: comp Organization: Mead Data Central, Dayton OH Lines: 102 In article <1990Sep20.161204.13014@tfic.bc.ca>, clh@tfic.bc.ca (Chris Hermansen) writes: |> In article cimshop!davidm@uunet.UU.NET (David S. Masterson) writes: |> >In article pcg@cs.aber.ac.uk |> >(Piercarlo Grandi) writes: |> > |> > It may be preferable to avoid the difficult issues posed by null values, |> |> Yay! Someone else who hates nulls!! |> |> > and the definition of an outer join operation, by attaching instead to |> > each relation (or to the underlying domains) a default value chosen by |> > the schema designer, and/or to make an outerjoin just a join where a |> > suitable matching default value is automatically supplied. IF default values are appropriate, THEN they should be used. If no default exists or is inappropriate, then you are compromising the accuracy of your model. |> > |> >This method *may* obviate the need for null records that occur in outer joins, |> >but it doesn't obviate the need for NULLs as attribute values. In fact, this |> >is the method documented by Ingres for doing outer joins in Quel. Using |> >default values, though, has always had the problem of finding a default value |> >that will *never* be used as a live value. Absolutely, as a matter of fact, I think the NULL is completely necessary for the accurate representation of data. If, for example, you have a integer attribute called status, and you don't currently know the value, what do you assign to status? I guess you could look through the possible domain of status and then pick a value outside that domain to represent unknown, but then you have to heavily document this so maintenance people know what you did, don't you think? |> |> NULLs almost guarantee strange `gotchas', don't you think? For instance, |> instead of writing a statement like |> |> update foo set bar = 37 where qqsv != 0 |> |> the SQLer has to decide if |> |> update foo set bar = 37 where qqsv != 0 or qqsv is null |> |> is what s/he REALLY means. Put another way, |> |> update foo set bar = 37 where qqsv != 0 and qqsv isnt null |> |> provides no extra degree of refinement over the first update |> statement, even though |> it appears to (for the casual observer, at least). Well the above is really placing two distinct constraints on the update. Constraint 1: qqsv cannot have a known value of 0. Constraint 2: qqsv cannot have an unknown value. If qqsv was guaranteed to be known, you can always declare it as *not null*. Then, Constraint 2 would become unecessary. If you suggest an arbitrary value for unknowns, what would the statement be? ubdate foo set bar = 37 where qqsv != 0 and qqsv != -99999999 This is easier than using null? I don't think so. |> |> [text deleted...] |> |> Note that I use Informix SQL, so your lack of mileage may vary on the above ????? The above statement makes no sense. ????? Maybe you have difficulty undersanding the proper use of null. It means, "I don't know what this value is." I'll use an example from Date, "... a shipment record might contain a null quantity value (we know that the shipment exists, but we do not know the quantity shipped)..." If you find that you have lots of nulls, then maybe your schema is not properly normalized. Try checking for transitive dependencies, I have found that they can contribute to an excessive amount of nulls. Null is a very important semantic concept. There have been many times I would have loved to have null when writing 3GL applications (I had to rely on -999). As for additional constraints, anytime you deal with an uknown you have take extra measures to account for it. What do you think? -- Gordon S. Edwards gordon@meaddata.com Mead Data Central