Xref: utzoo comp.unix.questions:25847 comp.unix.shell:347
Path: utzoo!attcan!uunet!aplcen!uakari.primate.wisc.edu!zaphod.mps.ohio-state.edu!mips!wyse!bob
From: bob@wyse.wyse.com (Bob McGowen x4312 dept208)
Newsgroups: comp.unix.questions,comp.unix.shell
Subject: Re: Counting characters with unix utilities
Message-ID: <2986@wyse.wyse.com>
Date: 29 Sep 90 01:05:50 GMT
References: <4002@umbc3.UMBC.EDU> <939@hls0.hls.oz> <1990Sep28.173033.292@msi.umn.edu>
Sender: news@wyse.wyse.com
Reply-To: bob@wyse.UUCP (Bob McGowen x4312 dept208)
Followup-To: comp.unix.questions
Organization: Wyse Technology
Lines: 70

In article <1990Sep28.173033.292@msi.umn.edu> haberman@msi.umn.edu (Joe Habermann) writes:
>george@hls0.hls.oz (George Turczynski) writes:
>
 ... Deleted examples of awk scripts. ...

Original postings on this topic used tr and wc.  Following that line I
decided to try my hand at a script for counting characters.  In the
meantime things seem to have moved away from the "simple" solutions
into more esoteric (still interesting) ways to solve the problem.
Nevertheless, I will present my script for comment and feedback.

The basic design is to take advantage of the tr command's character-set
notation (tr takes sets and ranges of characters, not true regular
expressions) and provide a tool that will allow the user to count the
set of characters named or their inverse.  So:

	chrcnt abc file
	chrcnt -n abc file

will count all occurrences of the letters a, b and c, followed by a
count of all characters that are not a, b or c.  This will work with
white space as well and handles cases where there are no matches.  The
use of cat allows you to specify one or more files on the command line
or have the script read its standard input.

One final note is that if you should want to look for dashes and n's,
use n- as the pattern (or --n, if you want).
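
To make the design concrete, here is a minimal sketch of the two
pipelines the tool is built on.  The function names count_in and
count_not_in are made up for illustration and are not part of the
script; the sample strings are likewise just assumptions chosen so the
counts are easy to check by hand.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# count_in SET: count the stdin characters that are members of SET.
# tr -cd deletes the complement of SET, so only members reach wc.
count_in() {
    tr -cd "$1" | wc -c
}

# count_not_in SET: count the stdin characters that are NOT in SET.
# tr -d deletes the members themselves, so only non-members reach wc.
count_not_in() {
    tr -d "$1" | wc -c
}

printf 'abracadabra\n' | count_in abc       # a, b and c together: 8
printf 'abracadabra\n' | count_not_in abc   # r, d, r and the newline: 4
printf 'xyz\n'         | count_in abc       # no matches: 0
printf 'a-n-b\n'       | count_in n-        # dashes and n's: 3
```

Note that the inverse count includes the trailing newline, and that some
versions of wc pad the number with leading blanks.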
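------------script follows----------
#!/bin/sh
# chrcnt: count the characters in the named set or, with -n, all others

files=		# start empty so an inherited $files cannot leak in

case $# in
0)	# the following is because cmd aliasing can produce absolute paths
	CMD=`basename "$0"`
	# \n and \t rely on an echo that interprets escapes (System V style)
	echo "$CMD: usage: $CMD [-n] reg_expression [files...]\n"\
	     "\twhere -n means not the following pattern characters." >&2
	exit 1
	;;
1)	# if only one arg it must be the pattern; input comes from stdin
	TR_ARGS=-cd
	pattern="$1"
	;;
*)	# all other cases may or may not have -n as the first arg
	case $1 in
	-n)	# -n: delete the named characters and count what is left
		TR_ARGS=-d
		pattern="$2"
		shift;shift
		files="$*"	# if only two args, files is null
		;;
	*)	# default: delete everything except the named characters
		TR_ARGS=-cd
		pattern="$1"
		shift
		files="$*"
		;;
	esac
	;;
esac

# $files is left unquoted on purpose so it splits back into separate
# names; file names containing white space are therefore not supported
cat $files | tr $TR_ARGS "$pattern" | wc -c

Bob McGowan  (standard disclaimer, these are my own ...)
Product Support, Wyse Technology, San Jose, CA
..!uunet!wyse!bob
bob@wyse.com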