Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!cs.utexas.edu!samsung!usc!henry.jpl.nasa.gov!elroy.jpl.nasa.gov!jpl-devvax!lwall
From: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Newsgroups: comp.lang.perl
Subject: Re: need perl help
Message-ID: <6723@jpl-devvax.JPL.NASA.GOV>
Date: 4 Jan 90 02:12:13 GMT
References: <229@carssdf.UUCP>
Reply-To: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Organization: Jet Propulsion Laboratory, Pasadena, CA
Lines: 40

In article <229@carssdf.UUCP> usenet@carssdf.UUCP (UseNet Id.) writes:
: I would like to remove pairs of a letter from a string.  After I remove
: spaces & vowels, something like this:
:     $a =~ tr/AEIOU/ /;
:     $a =~ s/ //og;

(The o is unnecessary.)

: I then would like to remove double letters something like
:     wizzard --> wizard
: This all goes toward building a key to compare names, addresses, etc... to
: eliminate duplicates.
:
: Does anyone have any ideas?  There's probably a more elegant way to remove
: the vowels and spaces too, for that matter.

Yes, use the [] construct and say

    s/[AEIOU ]//g;

or some such.

There are several ways to remove duplicate characters, but the most concise
(and probably the fastest) is to say

    $a =~ s/(.)\1/$1/g;

This does have the problem that it doesn't reduce three in a row, but

    while ($a =~ s/(.)\1/$1/g) {}

will fix that.  You ought to be able to say

    $a =~ s/(.)\1+/$1/g;

but you'll get a complaint about "regexp *+ operand could be empty".  Now that
I think on it, you can say

    $a =~ s/(.)\1\1?\1?/$1/g;

which will translate up to 4 duplicate chars.  How thorough do you want to
get?

Larry
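
The substitutions discussed in the post can be put side by side in a small script. This is only a sketch: it assumes a Perl 5 interpreter, where a quantified backreference such as \1+ is accepted, and the subroutine names (strip_vowels_and_spaces, squeeze_once, squeeze_loop, squeeze_plus, make_key) are hypothetical and do not come from the post. Upcasing the input in make_key is also an assumption, made so that the uppercase-only vowel class applies.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# All subroutine names below are hypothetical, chosen for illustration.

# Strip uppercase vowels and spaces in one pass with a character class.
sub strip_vowels_and_spaces {
    my ($s) = @_;
    $s =~ s/[AEIOU ]//g;
    return $s;
}

# Single pass: each adjacent pair collapses to one character,
# so a run of three ("zzz") is left as two ("zz").
sub squeeze_once {
    my ($s) = @_;
    $s =~ s/(.)\1/$1/g;
    return $s;
}

# Repeat the single pass until no substitution is made.
sub squeeze_loop {
    my ($s) = @_;
    1 while $s =~ s/(.)\1/$1/g;
    return $s;
}

# One pass with a quantified backreference (assumes Perl 5).
sub squeeze_plus {
    my ($s) = @_;
    $s =~ s/(.)\1+/$1/g;
    return $s;
}

# Hypothetical key builder: upcase (an assumption), strip, then squeeze.
sub make_key {
    my ($name) = @_;
    return squeeze_loop(strip_vowels_and_spaces(uc $name));
}

print make_key("wizzard"), "\n";
```

Stripping before squeezing matters for a comparison key, since removing a vowel can bring two equal letters together that then need collapsing.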