Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/17/84; site bdaemon.UUCP
Path: utzoo!decvax!bellcore!petrus!scherzo!allegra!mit-eddie!think!harvard!seismo!hao!nbires!bdaemon!carl
From: carl@bdaemon.UUCP (carl)
Newsgroups: net.unix
Subject: Re: Unique Word Counter Needed
Message-ID: <340@bdaemon.UUCP>
Date: Fri, 13-Dec-85 11:21:36 EST
Article-I.D.: bdaemon.340
Posted: Fri Dec 13 11:21:36 1985
Date-Received: Wed, 18-Dec-85 00:43:32 EST
References: <232@ihlpf.UUCP>
Distribution: na
Organization: Daemon Assoc., Boulder, CO
Lines: 28

>
> I need a way to count unique words in a document.
> Does any one have suggestions on a simple way to do this?

The following is a fancy version of what you want.  NOTE: The precise
syntax of 'tr' varies among versions, so some diddling may be needed.
Good Luck!

------------------------------------------------------------
cat $* |                        # tr reads the standard input
tr "[A-Z]" "[a-z]" |            # Convert all upper case to lower case
tr -cs "[a-z]\'" "\012" |       # Replace each run of characters other than
                                # a-z and ' with a newline, i.e. one word
                                # per line
sort |                          # uniq expects sorted input
uniq -c |                       # Count the number of times each word appears
sort +0nr +1d |                 # Sort first from most to least frequent,
                                # then alphabetically.
pr -w80 -4 -h "Concordance for $*"      # Print in four columns
------------------------------------------------------------

Carl Brandauer
daemon associates, Inc.
1760 Sunset Boulevard
Boulder, CO 80302
303-442-1731
{allegra|amd|attunix|cbosgd|ucbvax|ut-sally}!nbires!bdaemon!carl
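
Since the syntax of 'tr' and 'sort' differs between versions, here is a
rough sketch of the same pipeline using POSIX-style options. It is an
adaptation, not the posted script. The function name wordfreq is made up
for illustration, 'sort -k' stands in for the older '+0nr +1d' keys, an
extra 'sed' step (not in the original) drops the empty line that a
leading separator would otherwise leave, and the final 'pr' formatting
step is left out. LC_ALL=C is an assumption made so that the a-z ranges
and the sort order behave the same everywhere.

```shell
# wordfreq [file ...]: hypothetical helper, reads standard input when no
# files are named.  Prints "count word" lines, most frequent first.
wordfreq() {
    cat -- "$@" |
    LC_ALL=C tr 'A-Z' 'a-z' |        # fold upper case to lower case
    LC_ALL=C tr -cs "a-z'" '\n' |    # each run of other characters becomes one newline
    sed '/^$/d' |                    # drop an empty first line, if any
    LC_ALL=C sort |                  # uniq expects sorted input
    uniq -c |                        # prefix each distinct word with its count
    LC_ALL=C sort -k1,1nr -k2,2      # by count descending, then by word
}
```

If only the number of unique words is wanted, as in the original
question, piping the result through 'wc -l' (wordfreq file | wc -l)
gives that count, since each distinct word occupies one line.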