Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!ucbvax!bloom-beacon!eru!hagbard!sunic!mcsun!ukc!keele!csa18 From: csa18@seq1.keele.ac.uk (R.J. Husmo) Newsgroups: comp.databases Subject: Re: Duplicated tuples problem (Oracle/SQL) Message-ID: <775@keele.keele.ac.uk> Date: 2 Jan 91 12:14:14 GMT References: <772@keele.keele.ac.uk> <12224@sybase.sybase.com> Reply-To: csa18@seq1.kl.ac.uk (R.J. Husmo) Organization: /usr/local/lib/news/organization Lines: 53 In article <12224@sybase.sybase.com> you write: >In article cimshop!davidm@uunet.UU.NET (David S. Masterson) writes: >>>>>>> On 21 Dec 90 10:13:14 GMT, csa18@seq1.keele.ac.uk (R.J. Husmo) said: >> >>R.J.> Q: Is there a way to automatically delete duplicate tuples from a >>R.J.> table/all tables in a db using SQL? >> >>The typical way is to put a unique index on the primary key of each table. > >i don't think this answers his/her question. it sounds to me like s/he has >duplicates in his/her tables and needs to get rid of them. [...] Indeed. Spot on. Let me describe the scenario, as there may be a clever way of solving a specific problem which does not have a general solution. Our DBA, who has only been a DBA for a few months, was given the task of enlarging the storage space for our DB. Having looked in the DB manuals, he decided that the easiest/safest way to do this would be to export all the tables and their associated data, and then replace the old DB with a new and larger version. When setting up the new DB, he included all the system tables and data. He then imported the old tables and data. This meant that some of the system tuples were loaded twice. So when doing, for example, a 'help' each help topic appears twice. At this point our DBA asked for help. He spent three days going through the process above, and does not wish to do so again. So I posted the query here, as it seemed like an interesting problem. We are using ORACLE RDBMS V6.0.27.8.2, by the way. >|> >|> 1. Create a temporary table with the same structure as the >|> one containing the duplicates. >|> >|> 2. INSERT INTO new_tab >|> SELECT DISTINCT * >|> FROM dup_tab; >|> >|> 3. Delete all records from dup_tab and refill it from new_tab. >|> We thought of doing this, but there are some problems: We do not know the format of the tables involved Nor do we know the name of all the tables involved, although I suspect that we can quite easily find out There are rather a lot of tables involved (61 candidates) We thought of some other ways of doing it, too, some of which included some rather creative shell script writing, but none seemed to automate the process sufficiently (As you pointed out in your posting). The more I think about it, the more I think I'll convince our DBA to put off writing his thesis and do it again. Properly, this time. Reidar.