Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!newstop!texsun!convex!convex.COM
From: tchrist@convex.COM (Tom Christiansen)
Newsgroups: comp.lang.perl
Subject: Re: uniq'ing arrays
Message-ID: <109104@convex.convex.com>
Date: 21 Nov 90 15:59:29 GMT
References: <1990Nov21.021344.16038@fxgrp.fx.com>
Sender: news@convex.com
Reply-To: tchrist@convex.COM (Tom Christiansen)
Organization: CONVEX Software Development, Richardson, TX
Lines: 49

In article <1990Nov21.021344.16038@fxgrp.fx.com> grady@postgres.berkeley.edu writes:
=grep is a useful unix utility that has a parallel perl operator.
=What about uniq?  I'd like to see uniq (or something) that strips
=out the duplicates in an array.  It would be cool if it were
=built into perl.
=
=Meanwhile, has anyone written an efficient routine to do this?
=Linear time would be nice.  I'd like one that doesn't require the
=array to be sorted.  I could write one myself, but I thought
=I'd avoid duplicating code..

This is turning into a FAQ, isn't it?

In <9952@jpl-devvax.JPL.NASA.GOV> on 12 Oct 90, Larry wrote a good
article on this.  He covers these cases:

1) If @in is sorted:

    $prev = 'nonesuch';
    @out = grep($_ ne $prev && (($prev) = $_), @in);

2) If we don't know whether @in is sorted:

    undef %saw;
    @out = grep(!$saw{$_}++, @in);

3) If we don't know whether @in is sorted, nor care whether @out is:

    undef %ary;
    @ary{@in} = ();
    @out = keys(%ary);

(I usually use #3.)

And then he points out that if you know that @in contains only
small positive integers, you can use:

    @out = grep(!$saw[$_]++, @in);

for case 2, and for case 3:

    @ary[@in] = @in;
    @out = sort @ary;

I guess I'll add it to the FAQ.

--tom
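
For anyone who wants to try the cases above side by side, here is a minimal sketch that wraps each one in a subroutine. It assumes a Perl 5 interpreter with strict and warnings, which is not what the post was written against, and the sub names (uniq_sorted, uniq_ordered, uniq_any_order, uniq_small_ints) are made up for illustration. The small-integer variant filters with defined rather than sort, because array slots whose index never occurs in the input stay undefined; that substitution is mine, not something from Larry's article.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Case 1: input is already sorted, so duplicates are adjacent.
# Keep an element only when it differs from the one before it.
sub uniq_sorted {
    my $prev;
    return grep {
        my $keep = !defined($prev) || $_ ne $prev;
        $prev = $_;
        $keep;
    } @_;
}

# Case 2: input order unknown; keep the first occurrence of each
# element and preserve the original order. One pass, one hash.
sub uniq_ordered {
    my %saw;
    return grep { !$saw{$_}++ } @_;
}

# Case 3: input order unknown and output order irrelevant.
# A hash slice creates one key per distinct element.
sub uniq_any_order {
    my %seen;
    @seen{@_} = ();
    return keys %seen;
}

# Small non-negative integers only: use the value as an array index.
# Indexes that never occur stay undef, so filter those out.
sub uniq_small_ints {
    my @slot;
    @slot[@_] = @_;
    return grep { defined $_ } @slot;
}

print join(' ', uniq_sorted(qw(a a b c c))), "\n";      # a b c
print join(' ', uniq_ordered(qw(b a b c a))), "\n";     # b a c
print join(' ', sort(uniq_any_order(qw(b a b c a)))), "\n";  # a b c (after sorting)
print join(' ', uniq_small_ints(3, 1, 3, 2)), "\n";     # 1 2 3
```

All four run in time linear in the size of the input (hash operations taken as constant time); uniq_small_ints also uses space proportional to the largest value, which is why it only makes sense for small integers. Call them in list context, since grep and keys return a count in scalar context.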