Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!apple!voder!blia!miket From: miket@blia.BLI.COM (Mike Tossy) Newsgroups: comp.databases Subject: Re: Balancing Normalization with Performance Message-ID: <13007@blia.BLI.COM> Date: 26 Nov 90 22:37:52 GMT References: <1092@cortex.med.jhu.edu> <1990Nov20.190315.13505@informix.com> Distribution: usa Organization: Britton Lee, Los Gatos, CA Lines: 47 In article <1990Nov20.190315.13505@informix.com>, randall@informix.com (Randall Rhea) writes: > In article <1092@cortex.med.jhu.edu> jamesr@cortex.med.jhu.edu (Jim Rosenberg) writes: >>Hello database gurus. >> >>I am building my first database, and am trying to understand how to balance >>the benefits of normalization with performance requirements. I am >>interested in any heuristics or literate on deciding when to scrap >>normalization to get a performance kick. * * * >>An example issue I am trying to answer: >> I have a domain that can have one of a given set of character string values. >> I am thinking of creating a relation that equates each string with >> an integer ID. The benefits would be less memory used in the relation with >> that domain (since an integer would be stored for each tuple rather than >> an n byte length string), faster comparisons in a qualification (since >> comparisons on that domain would be between integers and not strings), and >> less chance of typo's in the database (since could not enter a string that >> cannot be translated to an ID by a subquery into the ID relation). The >> cost, of course, is the subquery required whenever you want to access the >> domain. Does the answer depend on the length of the strings, etc ... > > A great example! It shows how tricky the whole thing can be. If there were > only 10 possible values for your string, then I would certainly > normalize and use the integer ID that you describe. * * * > If there were only 10 possible string values, I WOULD normalize because: * * * Please, please, please! Don't call this normalization! You are encoding! Encoding is a very useful database techinque which has all the advantages and disadvanteges you discuss; but it is unrelated to the process of normalization. Also, normalization does not always (or even normally) result in performance degradation. Frequently normalization can improve system performance. -- Mike Tossy ShareBase Coropration miket@blia.bli.com 14600 Wichester Blvd (408) 378-7575 ext2200 Los Gatos, CA 95030 (ShareBase is a subsidiary of Teradata Corportation)