Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site ut-dillo.UUCP
Path: utzoo!decvax!decwrl!pyramid!ut-sally!ut-ngp!ut-dillo!darin
From: darin@ut-dillo.UUCP (Darin Adler)
Newsgroups: net.unix
Subject: Re: Unique Word Counter Needed
Message-ID: <249@ut-dillo.UUCP>
Date: Fri, 13-Dec-85 03:26:05 EST
Article-I.D.: ut-dillo.249
Posted: Fri Dec 13 03:26:05 1985
Date-Received: Sat, 14-Dec-85 01:14:40 EST
References: <232@ihlpf.UUCP>
Distribution: na
Organization: UTexas Computation Center, Austin, Texas
Lines: 14

<>
Here is the method I normally use to count unique words:

	tr A-Z a-z | tr -cs -a-z0-9\'\" '\012' | sort -u | wc -l

The first "tr" command takes care of capitalization by folding upper
case to lower case. The second "tr" command splits the input into one
word per line (where a word is a sequence of characters
[-A-Za-z0-9'"]). The "sort" command eliminates duplicates, and "wc"
gives us the number of lines in the result.
-- 
Darin Adler	{gatech,harvard,ihnp4,seismo}!ut-sally!ut-dillo!darin
"Such a mass of motion -- do not know where it goes"	P. Gabriel
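
For anyone who wants to try the pipeline above, here is a minimal sketch
that wraps it in a shell function. The name count_unique_words is
hypothetical and not part of the original post. One assumption: the
hyphen is moved to the end of the character set, because some current
"tr" implementations (GNU tr, for one) appear to treat an argument that
starts with "-a" as an option rather than as a set.

```shell
# count_unique_words: hypothetical wrapper around the posted pipeline.
# Reads text on standard input and prints the number of distinct words.
count_unique_words() {
    tr 'A-Z' 'a-z' |                # fold upper case to lower case
    tr -cs "a-z0-9'\"-" '\012' |    # each run of non-word characters becomes one newline
    sort -u |                       # sort and drop duplicate lines
    wc -l                           # count the lines that remain
}

# Example: "the" and "cat" each appear twice, in different capitalization.
printf 'The cat saw the Cat.\n' | count_unique_words
```

Note that input beginning with a non-word character (a leading space,
say) produces one empty line at the top, which "sort -u" keeps and "wc"
counts as a word, so the total can be one too high in that case.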