Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!att!tut.cis.ohio-state.edu!ucbvax!mtxinu!sybase!vamp!ben
From: ben@vamp.sybase.com (ben ullrich)
Newsgroups: comp.databases
Subject: Re: Duplicated tuples problem (Oracle/SQL)
Message-ID: <12224@sybase.sybase.com>
Date: 30 Dec 90 00:48:05 GMT
References: <772@keele.keele.ac.uk> <CIMSHOP!DAVIDM.90Dec21095239@uunet.UU.NET>
Sender: news@Sybase.COM
Organization: sybase, inc., emeryville, ca
Lines: 64

In article <CIMSHOP!DAVIDM.90Dec21095239@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>>>>>> On 21 Dec 90 10:13:14 GMT, csa18@seq1.keele.ac.uk (R.J. Husmo) said:
>
>R.J.> Q: Is there a way to automatically delete duplicate tuples from a
>R.J.> table/all tables in a db using SQL?
>
>The typical way is to put a unique index on the primary key of each table.

i don't think this answers his/her question.  it sounds to me like s/he has
duplicates in his/her tables and needs to get rid of them.  trying to add
a unique index won't do it, since the duplicates have to be gone before the
index may be successfully built.  i agree that using a unique index on a key in
any given table is the best way to go, as it then also conforms to normal form
rules.

in short, there is no *automatic* method to remove duplicates, short of
programming your database engine to silently reject duplicates as they are
inserted, or by letting a unique index raise an exception in the same
instance.  you can, however, run queries against your database to expose and/or
clean out the duplicates as you see fit.

the following article was posted just about a year ago, when this question was
last asked in this forum.  i hope mr. may doesn't mind its reposting without his
permission.

|>Article 3144 of comp.databases:
|>Path: sybase!mtxinu!ucbvax!tut.cis.ohio-state.edu!purdue!decwrl!shlump.nac.dec.com!arkham.enet.dec.com!nmeser!may
|>From: may@28182.dec.com (Patrick May)
|>Newsgroups: comp.databases
|>Subject: Re: SQL query for duplicates
|>Message-ID: <36@arkham.enet.dec.com>
|>Date: 22 Dec 89 17:16:58 GMT
|>Sender: news@arkham.enet.dec.com
|>Organization: Digital Equipment Corporation
|>
|>     I can think of two ways to remove duplicates from an Oracle database
|>table.  The first will work with any SQL-based database:
|>
|>          1.  Create a temporary table with the same structure as the
|>              one containing the duplicates.
|>
|>          2.  INSERT INTO new_tab
|>              SELECT DISTINCT *
|>              FROM dup_tab;
|>
|>          3.  Delete all records from dup_tab and refill it from new_tab.
|>
|>     This works if entire rows are duplicated.  If you just want to find
|>duplicate values in a particular field, the following should be helpful:
|>
|>          SELECT DISTINCT col_name, count(*)
|>          FROM dup_tab
|>          GROUP BY col_name
|>          HAVING count(*) > 1;
|>
|>     This will display all duplicate values for col_name and the number of
|>times that value appears in dup_tab.


..ben
------
ben ullrich			 only i do the talking here -- not my employer.
ben@sybase.com			       {pyramid,pacbell,sun,lll-tis}!sybase!ben
  "why waste time learning when ignorance is instantaneous?"-hobbes on calvin