Path: utzoo!attcan!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!cimshop!davidm From: cimshop!davidm@uunet.UU.NET (David S. Masterson) Newsgroups: comp.databases Subject: Re: Outerjoin implementation? Message-ID: Date: 30 Sep 90 09:16:25 GMT References: <1422@meaddata.meaddata.com> <1990Sep28.225442.4995@tfic.bc.ca> Sender: davidm@cimshop.UUCP Distribution: comp Organization: Consilium Inc., Mountain View, California. Lines: 71 In-reply-to: clh@tfic.bc.ca's message of 28 Sep 90 22:54:42 GMT In article <1990Sep28.225442.4995@tfic.bc.ca> clh@tfic.bc.ca (Chris Hermansen) writes: In article <1422@meaddata.meaddata.com> meaddata!gordon@uunet.uu.net writes: >Absolutely, as a matter of fact, I think the NULL is completely necessary >for the accurate representation of data. If, for example, you have a >integer attribute called status, and you don't currently know the value, >what do you assign to status? I guess you could look through the possible >domain of status and then pick a value outside that domain to represent >unknown, but then you have to heavily document this so maintenance people >know what you did, don't you think? OK, I'll bite... suppose I have a variable called status. Why is NULL better than a value of "unknown" (assuming it's a character value)? I would claim that in fact "unknown" is superior, as it is at least somewhat self documenting (NULL, by definition, is NOT); furthermore, I can easily spot the "unknown" rows in any printout or report; finally, I can easily count the rows of particular status in the table with one select statement (whereas I need two with NULL values allowed, as rows with NULL values do not group together). Wait a sec. Think about that a little. NULL, by definition, is an "unknown" value, so it is self-documenting. Going back to the example of an unknown integer (where 3-valued logic plays an even more important role) in the role of quantity shipped (just hasn't been entered yet), what would you do with a question like "What is the sum of the known amounts shipped?". With the database understanding what a NULL is, the answer can easily be arrived at with just "SELECT SUM(qty) FROM table" with no special checking for -999. I'd be the first to admit that I may have difficulty understanding the use of NULL. But think about your example; why should the input screen have let someone enter the fact that a shipment was made without knowing the number of items shipped? How would you propose to send the customer a bill, if you don't know whether you sent her/him 3 SPARCStations or 3000? Now suppose that you're the boss of the shipping department, and you find out that your programmer decided to allow NULLs in the amount_shipped field by casually noting that your warehouse contained no more SPARCStations because someone on the loading dock didn't bother to fill in the waybill properly and the fast-knuckled keypuncher blithely skipped over the amount_shipped field, so not only are you going to have to call 200 customers to find out how many you sent them, but you're going to have to call head office and tell them that your order for more stock should have had more than zero items on it. In some places, this can (and is expected to) happen on a temporary basis. Usually, though, there are physical (ie. non-computer) checks to make sure that this is only a temporary phenomena. In such organizations, its better to allow them to enter "No Value" than to force them to reinvent the way they do business. And in any case, I would claim that a zero would do as well as a NULL here, since most shipping departments would hardly bother to ship zero units (let alone generate a shipping report with zero units on it). Placing a zero here would confuse Accounting, though, as it would look like a fulfilled shipment. Point being, zero (or any "in-phase" data) has other connotations that programs outside the shipping department might treat differently. Everyone, though, would view a NULL as a NULL. Also, don't forget to factor in the idea of constraints into the use of NULLs. With the single value, all referential constraints remain in balance ("I am NULL, therefore I don't refer to anything"), so there is no special processing for semantics ("I'm a status value, therefore if I am 'unknown', I don't refer to the list of known, possible statuses"). BTW, constraints are the way you should ensure that the wrong value is never entered (as in "NOT NULL"). -- ==================================================================== David Masterson Consilium, Inc. uunet!cimshop!davidm Mtn. View, CA 94043 ==================================================================== "If someone thinks they know what I said, then I didn't say it!"