Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!wuarchive!uwm.edu!src.honeywell.com!cim-vax.honeywell.com!tdoyle
From: tdoyle@cim-vax.honeywell.com
Newsgroups: comp.databases
Subject: Re: SQL Duplicate Row Deletion ???
Message-ID: <1991Apr3.160345.58@cim-vax.honeywell.com>
Date: 3 Apr 91 22:03:45 GMT
References: <91091.141528SYSPMZT@GECRDVM1.BITNET><1991Apr3.010838.2063@eng.ufl.edu> <DRACK.91Apr3083024@diablo.titan.tsd.arlut.utexas.edu>
Organization: Honeywell CIS
Lines: 21

In article <DRACK.91Apr3083024@diablo.titan.tsd.arlut.utexas.edu>, drack@titan.tsd.arlut.utexas.edu (Dave Rackley) writes:
> 
> I cannot justify the need for them, but in real-time data collection duplicates
> are often created by multiple sensors at a single source.  It then becomes a
> data reduction/analysis issue to remove duplicates, while recording the fact
> that duplicate data captures occurred.
> 
> Why let this happen?  Our data is always dirty--we have to clean it up before
> the customer gets his hands on it!

This may be a little esoteric, but I would suggest that the table is not
in third normal form. If the table represents data measurements from various
sensors, this would indicate that one attribute is missing from the table:
i.e. sensor-id. If the table represents simply the sequential measurement
then a date-time or a counter attribute may be required.
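
As a rough sketch of what I mean, here is the sensor case with the missing
attributes put back. The table name, column names and sample values are
invented for illustration, and SQLite driven from Python merely stands in
for whatever DBMS is actually in use. With sensor_id and a timestamp as the
key, two sensors reporting the same value at the same instant are distinct
tuples, and only a true repeat from the same sensor is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: sensor_id and taken_at together identify a reading.
conn.execute("""
    CREATE TABLE reading (
        sensor_id INTEGER NOT NULL,
        taken_at  TEXT    NOT NULL,
        value     REAL    NOT NULL,
        PRIMARY KEY (sensor_id, taken_at)
    )
""")

# Two sensors at one source capture the same value at the same time.
conn.executemany(
    "INSERT INTO reading VALUES (?, ?, ?)",
    [
        (1, "1991-04-03 16:00:00", 20.5),
        (2, "1991-04-03 16:00:00", 20.5),
        (1, "1991-04-03 16:00:01", 20.7),
    ],
)

# A second row for the same sensor and instant violates the key.
try:
    conn.execute("INSERT INTO reading VALUES (1, '1991-04-03 16:00:00', 20.5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT COUNT(*) FROM reading").fetchone()[0]
print(rejected, stored)
```

If the table only records sequential measurements, a counter column could
play the same role as the (sensor_id, taken_at) pair does here.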

Once the database "correctly" models the world, it is quite possible
(when the user view of the object does not include one of the primary
identifiers) that a tuple presented in the view corresponds to more than
one tuple in the original relation. But this does not excuse the original
deficiency of the relational DBMSs.
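
To illustrate the view case, here is a small sketch under the same
assumptions as before (invented table, column and view names; SQLite from
Python as a stand-in DBMS). The view drops sensor_id, so two base tuples
show up as identical rows. DISTINCT collapses them, and a GROUP BY with
COUNT(*) keeps a record of how many captures each row stands for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical base relation, keyed on (sensor_id, taken_at).
conn.execute("""
    CREATE TABLE reading (
        sensor_id INTEGER NOT NULL,
        taken_at  TEXT    NOT NULL,
        value     REAL    NOT NULL,
        PRIMARY KEY (sensor_id, taken_at)
    )
""")
conn.executemany(
    "INSERT INTO reading VALUES (?, ?, ?)",
    [
        (1, "1991-04-03 16:00:00", 20.5),
        (2, "1991-04-03 16:00:00", 20.5),
        (1, "1991-04-03 16:00:01", 20.7),
    ],
)

# User view that leaves out one of the primary identifiers.
conn.execute("CREATE VIEW reading_view AS SELECT taken_at, value FROM reading")

# The plain projection shows the same row twice.
view_rows = conn.execute(
    "SELECT taken_at, value FROM reading_view ORDER BY taken_at"
).fetchall()

# DISTINCT removes the duplicates from the presented result.
distinct_rows = conn.execute(
    "SELECT DISTINCT taken_at, value FROM reading_view ORDER BY taken_at"
).fetchall()

# GROUP BY keeps one row per value and records how many captures it covers.
counted = conn.execute(
    "SELECT taken_at, value, COUNT(*) FROM reading_view "
    "GROUP BY taken_at, value ORDER BY taken_at"
).fetchall()

print(view_rows)
print(distinct_rows)
print(counted)
```

Nothing in the base relation is a duplicate here; the repeated rows exist
only in what the view presents.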