Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!think.com!hsdndev!dartvax!L.Carl.Pedersen From: L.Carl.Pedersen@dartmouth.edu (L. Carl Pedersen) Newsgroups: comp.databases Subject: Re: Relational Databases (and why they don't exist) Message-ID: <1991Jun11.143708.15641@dartvax.dartmouth.edu> Date: 11 Jun 91 14:37:08 GMT References: <1991Jun07.063749.7904@chinet.chi.il.us> <1991Jun08.124337.29813@uvmark.uucp> <2421@ccadfa.adfa.oz.au> <25200@oolong.la.locus.com> Sender: news@dartvax.dartmouth.edu (The News Manager) Organization: Dartmouth College, Hanover, NH Lines: 44 In article <25200@oolong.la.locus.com> jfr@locus.com (Jon Rosen) writes: > In article <2421@ccadfa.adfa.oz.au> ghm@ccadfa.adfa.oz.au (Geoff Miller) writes: (stuff deleted) > >Normalisation does have advantages for more that the database theorist, > >though. Remember that there is a basic law of computing that, if you > >give a user the chance to enter the same piece of information in two > >different places you will inevitably end up with two different versions. > >Normalisation is largely about making sure that each piece of information > >is stored once and once only. > > Not true... normalization is mostly about reducing a data model to its > component parts... what this does is generally eliminate redundant > ATTRIBUTEs, but KEYS are not only repeated, but often repeated > many times... In fact, I have done a comparison of several large > systems that were implemented in RAMIS, a mostly hierarchical 4GL > database and then reimplemented in DB2... The use of a hierarchy was > far more capable of eliminating redundancies than the relational > version... I agree with the Geoff. Normalization is primarily intended to eliminate update anomolies. In fact, the various normal forms all have associated update anomolies. Jon is certainly right, though, that keys end up being repeated all over the place. However, I think a relational purist would argue that keys should be totally meaningless - internally generated surrogate keys, so that you would never have to change them. You would not, for example, use a Social Security Number as a key, since it might turn out to be wrong. An SSN would be just another attribute of a person. Some people actually implement things this way, but, of course, this approach has its own problems. I'm not certain about this, but I think Codd has published a proposal for a relational system that essentially requires the use of surrogate keys, but also manages them more or less automatically, making them a more palatable. And, of course, some of the newer object-oriented systems address this issue by allowing one to reference the object directly rather than a key pointing to the object. I'd be interested in people's experience with using surrogate keys, i.e., internally generated keys that are never revealed to the user, instead of real-world keys like SSN's.