Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!dimacs.rutgers.edu!seismo!beno.CSS.GOV From: jean@beno.CSS.GOV (Jean Anderson) Newsgroups: comp.databases Subject: Re: Join Contest Message-ID: <49001@seismo.CSS.GOV> Date: 18 Jul 90 01:23:21 GMT Sender: usenet@seismo.CSS.GOV Distribution: usa Organization: Center for Seismic Studies, Arlington, VA Lines: 50 Mike Tossy writes: > In article , cimshop!davidm@uunet.UU.NET (David S. Masterson) writes: > > In article <12072@blia.BLI.COM> miket@blia.BLI.COM (Mike Tossy) writes: > > > > My observation is that the number of tables used by an application > > increases dramaticly if the database design is well normalized. > > > > Is that good or bad? (Speaking from your observations that is :-) > > I think it is good. The normal reason to denormalize is to gain > performance. Assuming that your RDBMS is good at joins this is not > necesary, and you can even gain speed by normalizing because the resulting I think "well normalized" is a matter or style and frequently gets confused with excessive decomposition. I have seen tables that were so decomposed they consisted of little more than primary and foreign keys. This may be OK for an application which is entirely canned, but it makes the logical design artificially complex for any database which has to support interactive SQL users. Also, my experience with two different products so far tells me that the higher the number of joins, the longer the query takes to run. With a large amount of data or complex queries, this can become prohibitive. I may get flamed off the net, but I try to normalize to third normal form, then denormalize for performance. Here are several ways I back off: 1 If there is a one to one relationship between two tables, they share the same primary key and are frequently retrieved together, I consider making them a single table so long as the table doesn't get too wide. "Too wide" is a matter of style; I start getting uncomfortable with tables that have more than 20-25 columns. 2 If a join is used frequently to do a simple lookup, I include that field as redundant data and eliminate the join. I take a serious look at it to see if this will introduce update anomalies. 3 I use repeating columns to store aggregate information rather than have it be computed on the fly. I realize that breaking rules like this would drive a purist nuts, but I answer to a user community that daily wants to see it all go faster with the current product on the current platform. - Jean Anderson SAIC, Geophysics Division jean@seismo.css.gov +++++++++++++++++++++++++++++++++++++++++++++++++ ++++++++++++++++++ Any opinions are mine, not my employer's. +++++++++++++++++ +++++++++++++++++++++++++++++++++++++++++++++++++