Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!dimacs.rutgers.edu!seismo!uunet!cimshop!davidm From: cimshop!davidm@uunet.UU.NET (David S. Masterson) Newsgroups: comp.databases Subject: Re: Join Contest Message-ID: Date: 18 Jul 90 17:40:49 GMT References: <49001@seismo.CSS.GOV> Sender: davidm@cimshop.UUCP Distribution: usa Organization: Consilium Inc., Mountain View, California. Lines: 55 In-reply-to: jean@beno.CSS.GOV's message of 18 Jul 90 01:23:21 GMT In article <49001@seismo.CSS.GOV> jean@beno.CSS.GOV (Jean Anderson) writes: I think "well normalized" is a matter or style and frequently gets confused with excessive decomposition. [...line eater was here...] I may get flamed off the net, but I try to normalize to third normal form, then denormalize for performance. Well, I'm not going to flame, but some of your definitions have me intrigued. Here are several ways I back off: 1 If there is a one to one relationship between two tables, they share the same primary key and are frequently retrieved together, I consider making them a single table so long as the table doesn't get too wide. "Too wide" is a matter of style; I start getting uncomfortable with tables that have more than 20-25 columns. If the two tables share the same primary key, are they not representative of the same entity and, therefore, should be one table after normalization? Especially since they are frequently retrieved together? What would lead to separating them? 2 If a join is used frequently to do a simple lookup, I include that field as redundant data and eliminate the join. I take a serious look at it to see if this will introduce update anomalies. To include the data in the primary table, the lookup data must be one-to-one related with the primary key of the primary table, correct? If data from a secondary table is used for a simple lookup on the primary table and that data is 1-1 related to the primary table, it sounds like the tables were overly decomposed in the first place. 3 I use repeating columns to store aggregate information rather than have it be computed on the fly. This is a typical solution to a perceived problem. You've obviously taken into account the problems of update and the "window of vulnerability" between change to information and recomputing the aggregate. I imagine Sybase style triggers can make this automatic. I realize that breaking rules like this would drive a purist nuts, but I answer to a user community that daily wants to see it all go faster with the current product on the current platform. I'm not sure whether your breaking the rules in the second place or going too far in the first place. -- =================================================================== David Masterson Consilium, Inc. uunet!cimshop!davidm Mt. View, CA 94043 =================================================================== "If someone thinks they know what I said, then I didn't say it!"