Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!dimacs.rutgers.edu!seismo!uunet!cimshop!davidm
From: cimshop!davidm@uunet.UU.NET (David S. Masterson)
Newsgroups: comp.databases
Subject: Re: Join Contest
Message-ID: <CIMSHOP!DAVIDM.90Jul18104049@uunet.UU.NET>
Date: 18 Jul 90 17:40:49 GMT
References: <49001@seismo.CSS.GOV>
Sender: davidm@cimshop.UUCP
Distribution: usa
Organization: Consilium Inc., Mountain View, California.
Lines: 55
In-reply-to: jean@beno.CSS.GOV's message of 18 Jul 90 01:23:21 GMT

In article <49001@seismo.CSS.GOV> jean@beno.CSS.GOV (Jean Anderson) writes:

   I think "well normalized" is a matter or style and frequently gets confused
   with excessive decomposition.

   [...line eater was here...]

   I may get flamed off the net, but I try to normalize to third normal form, 
   then denormalize for performance.

Well, I'm not going to flame, but some of your definitions have me intrigued.

   Here are several ways I back off:

      1	If there is a one to one relationship between two tables, they
	   share the same primary key and are frequently retrieved together,
	   I consider making them a single table so long as the table doesn't 
	   get too wide. "Too wide" is a matter of style; I start getting 
	   uncomfortable with tables that have more than 20-25 columns.

If the two tables share the same primary key, are they not representative of
the same entity and, therefore, should be one table after normalization?
Especially since they are frequently retrieved together?  What would lead to
separating them?

      2	If a join is used frequently to do a simple lookup, I include
	   that field as redundant data and eliminate the join.  I take a
	   serious look at it to see if this will introduce update anomalies.

To include the data in the primary table, the lookup data must be one-to-one
related with the primary key of the primary table, correct?  If data from a
secondary table is used for a simple lookup on the primary table and that data
is 1-1 related to the primary table, it sounds like the tables were overly
decomposed in the first place.

      3	I use repeating columns to store aggregate information rather than 
	   have it be computed on the fly.

This is a typical solution to a perceived problem.  You've obviously taken
into account the problems of update and the "window of vulnerability" between
change to information and recomputing the aggregate.  I imagine Sybase style
triggers can make this automatic.

   I realize that breaking rules like this would drive a purist nuts, but I 
   answer to a user community that daily wants to see it all go faster with
   the current product on the current platform.

I'm not sure whether your breaking the rules in the second place or going too
far in the first place.
--
===================================================================
David Masterson					Consilium, Inc.
uunet!cimshop!davidm				Mt. View, CA  94043
===================================================================
"If someone thinks they know what I said, then I didn't say it!"