Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!ucsd!ucbvax!agate!usenet
From: c60b-3ac@web.berkeley.edu (Eric Thompson)
Newsgroups: comp.unix.questions
Subject: Need help ** removing duplicate rows **
Message-ID: <1990Oct30.234654.23547@agate.berkeley.edu>
Date: 30 Oct 90 23:46:54 GMT
Sender: usenet@agate.berkeley.edu (USENET Administrator)
Organization: University of California, Berkeley
Lines: 23

I have a few very long files that contain rows of ASCII data.  Each row
looks something like this (not the actual data here):

a:A:b:c:d:e:f:g:h:i:j:k:l:m
a:B:b:c:d:e:f:g:h:i:j:k:l:m
a:C:b:c:d:e:f:g:h:i:j:k:l:m
a:D:b:c:d:e:f:g:h:i:j:k:l:m
b:A:n:o:p:q:s:t:u:v:w:x:y:z
c:A:x:a:x:b:x:c:d:a:m:l:v:x
d:A:m:l:k:j:i:h:g:f:e:d:c:b
d:B:m:l:k:j:i:h:g:f:e:d:c:b
d:C:m:l:k:j:i:h:g:f:e:d:c:b

It's the second column that's important.  If there are multiple rows that
are exactly the same except for the second column, I want to GET RID of them.
If the row is unique (for example, the ones starting with "b" and "c" above)
then it should stay.  Sounds like what I need is a way to filter out rows
that are duplicates of one another except in the second column.
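
To make the request concrete, here is a rough sketch in Python of the kind
of filter described above.  It is only an illustration, not a tested
solution for the real files.  "GET RID of them" can be read two ways, so
both readings are shown: dropping every row that belongs to a duplicate
group, or keeping the first row of each group.  The function names, the
":" separator default, and the zero-based column index are assumptions
made for this sketch.

```python
from collections import Counter


def row_key(line, sep=":", ignore=1):
    """Return the row's fields as a tuple, with the ignored column blanked.

    `ignore` is a zero-based index, so 1 means the second column.
    """
    fields = line.rstrip("\n").split(sep)
    if ignore < len(fields):
        fields[ignore] = None  # this column does not take part in the comparison
    return tuple(fields)


def drop_duplicate_groups(lines, sep=":", ignore=1):
    """Reading 1: keep only rows whose key occurs exactly once."""
    lines = list(lines)  # two passes are needed: count first, then filter
    counts = Counter(row_key(line, sep, ignore) for line in lines)
    return [line for line in lines if counts[row_key(line, sep, ignore)] == 1]


def keep_first_of_each_group(lines, sep=":", ignore=1):
    """Reading 2: keep the first row seen for each key, drop later ones."""
    seen = set()
    kept = []
    for line in lines:
        key = row_key(line, sep, ignore)
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept
```

On the sample rows above, the first reading would leave only the rows
starting with "b" and "c"; the second would leave one row each for "a",
"b", "c" and "d".  Both functions accept any iterable of lines, such as
an open file, and preserve the original order.  They hold keys in memory,
which may matter for very long files.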

Any hints?  I'll take anything, really.  Please MAIL your replies, since I
doubt this is of general interest.  Thanks again.

Eric Thompson    c60b-3ac@web.berkeley.edu  ...!ucbvax!web!c60b-3ac