Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!src.honeywell.com!cim-vax.honeywell.com!tdoyle
From: tdoyle@cim-vax.honeywell.com
Newsgroups: comp.databases
Subject: Re: SQL Duplicate Row Deletion ???
Message-ID: <1991Apr1.163615.56@cim-vax.honeywell.com>
Date: 1 Apr 91 16:36:15 CST
References: <91091.141528SYSPMZT@GECRDVM1.BITNET>
Distribution: world
Organization: Honeywell CIS
Lines: 21

In article <91091.141528SYSPMZT@GECRDVM1.BITNET>, SYSPMZT@gecrdvm1.crd.ge.com writes:
> I've made a nice mistake loading data twice over several days into a table, and
> would like to delete just one of the duplicate rows.  The number of duplicates
> is on the order of 10,000 long rows, and the table has 250,000 rows total, so
> unloading, manipulating the data, and reloading the data is out of the
> question.  I can identify the duplicate rows with a select joining the table
> to itself, but I can't use the same technique to delete the rows.
> 
> Anyone have a nifty solution to this problem?  The database manager is DB2, but
> I'd think that any SQL based language would have the same problem.
> 
> Thanks in advance,
> 
> Phil Zampino

Select identified dupicates into a temporary table (SELECT UNIQUE INTO TEMP).
Then delete from the original where the record is the same as in TEMP.
This will rid ALL instances of the duplicates (including the original).
Then add records from TEMP into ORIG to restore one copy.

Bipin Prasad bprasad@honcim1.honeywell.com