Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!rochester!cornell!uw-beaver!mit-eddie!wuarchive!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!aplcen!boingo.med.jhu.edu!cortex!jamesr From: jamesr@cortex.med.jhu.edu (Jim Rosenberg) Newsgroups: comp.databases Subject: Balancing Normalization with Performance Message-ID: <1092@cortex.med.jhu.edu> Date: 19 Nov 90 17:12:57 GMT Distribution: usa Organization: Psych. Neuro-Imaging, Johns Hopkins Lines: 37 Hello database gurus. I am building my first database, and am trying to understand how to balance the benefits of normalization with performance requirements. I am interested in any heuristics or literate on deciding when to scrap normalization to get a performance kick. The database will reside on a DECSystem 5400, and will be built using Ingres (whichever version DEC is supplying run-time libraries for in Ultrix 4.x). We have ample memory on the system for a large database, so wasting some memory with empty domains, for example, really isn't a problem. I am able to create the database in what-I-understand-to-be fourth normal form, and I am familiar with the implementation of Ingres (University Ingres at least). "Rules" regarding trade - offs between relation size and performance, number of relations and performance, subqueries vs. searches through large relations, etc., would be a great help. An example issue I am trying to answer: I have a domain that can have one of a given set of character string values. I am thinking of creating a relation that equates each string with an integer ID. The benefits would be less memory used in the relation with that domain (since an integer would be stored for each tuple rather than an n byte length string), faster comparisons in a qualification (since comparisons on that domain would be between integers and not strings), and less chance of typo's in the database (since could not enter a string that cannot be translated to an ID by a subquery into the ID relation). The cost, of course, is the subquery required whenever you want to access the domain. Does the answer depend on the length of the strings, etc ... Thanks in advance for all the help. Replies by email or to this news group are just fine. James Rosenberg (jamesr@cortex.med.jhu.edu) Div. of Psych. Neuro-Imaging Johns Hopkins University