Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!uwm.edu!linac!att!att!cbnewsk!noraa
From: noraa@cbnewsk.att.com (aaron.l.hoffmeyer)
Newsgroups: comp.editors
Subject: Re: repeated character editing
Keywords: vi sed
Message-ID: <1991Apr1.095425.26413@cbnewsk.att.com>
Date: 1 Apr 91 09:54:25 GMT
References: <1991Mar28.181335.7813@cbnewsm.att.com> <1991Mar30.024851.24414@jarvis.csri.toronto.edu>
Distribution: usa
Organization: AT&T Bell Laboratories
Lines: 43

In article <1991Mar30.024851.24414@jarvis.csri.toronto.edu> ruhtra@turing.toronto.edu (Arthur Tateishi) writes:
>In article <1991Mar28.181335.7813@cbnewsm.att.com> cadman@cbnewsm.att.com (jerome.schwartz) writes:
>>
>>How would I do a simple edit command, vi map or sed script, to
>>replace all occurrences of repeated characters in a file with one
>>of each of the characters.
>>
>>ie: This: AAAAPPPPLLLLEEEE ppppiiiieeee
>>    to this: APPLE pie
>
>If your example was a typo and should have been: APLE pie
>then the following reg-exp will work.
>:s/\(.\)\1*/\1/g
>
>--
>Red Alert.
>     -- Q, "Deja Q", stardate 43539.1
>Arthur Tateishi   g9ruhtra@zero.cdf.utoronto.edu

I've seen the original question asked several times in this newsgroup
in just the last six months. Yes, there are many solutions to this
problem, using tr, sed, regular expressions, awk, perl and so on. But
some of the responses, and even the people asking the question, ignore
the situation that creates the problem.

The ONLY time I have ever seen 4 characters in place of one character
is when someone directs nroff output to a file, then searches for the
literal backspace characters (and underscores, if present) and replaces
them with nothing. nroff emboldens a word by overstriking: it prints
each character, backspaces over it, and prints it again. It underlines
a word the same way, printing an underscore, a backspace, and then the
character. Delete only the backspaces and every repeated strike is left
behind in the text.

So, the simplest solution to this problem is to not create it in the
first place.
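
To make the overstriking point concrete, here is a small Python sketch
of how the quadrupled characters can arise. It is an illustration only,
not nroff's actual code: the function names are hypothetical, and the
count of four strikes per bold character is an assumption taken from
the "4 characters" observed above (other formatters may strike fewer
times).

```python
BS = "\b"  # the literal backspace character (ASCII 8)

def overstrike_bold(word, strikes=4):
    """Emulate bolding by overstrike: each character is struck `strikes`
    times with a backspace between strikes (4 is an assumption)."""
    return "".join(BS.join(ch * strikes) for ch in word)

def overstrike_underline(word):
    """Emulate underlining by overstrike: underscore, backspace, character."""
    return "".join("_" + BS + ch for ch in word)

def naive_strip(text):
    """Delete only the backspaces, leaving every strike in the text."""
    return text.replace(BS, "")

def apply_backspaces(text):
    """Honor each backspace by dropping the character it backs over, so
    only the last character struck in each column survives."""
    out = []
    for ch in text:
        if ch == BS:
            # Never back up past the start of the current line.
            if out and out[-1] != "\n":
                out.pop()
        else:
            out.append(ch)
    return "".join(out)

print(naive_strip(overstrike_bold("pie")))         # ppppiiiieeee
print(naive_strip(overstrike_underline("pie")))    # _p_i_e
print(apply_backspaces(overstrike_bold("APPLE")))  # APPLE
```

Note that honoring the backspaces recovers APPLE with both of its P's,
whereas squeezing runs of identical characters after the fact can only
give APLE, because by then the eight P's are indistinguishable.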

If you filter the nroff output through col with the -b option (col is
a standard command in UNIX, I think; maybe it is just System V), then
you get plain ASCII output that has no backspaces (or underscores and
backspaces) and no multiple occurrences of characters.

I can't recall if there is an "Often Asked Questions" posting in this
newsgroup, but if there is, I am sure this solution is in there. If
there isn't such a posting for this group, maybe there should be, and
this question should be included.

Aaron L. Hoffmeyer
TR@CBNEA.ATT.COM