Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/17/84; site mhuxd.UUCP
Path: utzoo!decvax!bellcore!petrus!sabre!zeta!epsilon!gamma!ulysses!mhuxr!mhuxd!wolit
From: wolit@mhuxd.UUCP (Jan Wolitzky)
Newsgroups: net.unix
Subject: Re: Unique Word Counter Needed
Message-ID: <3699@mhuxd.UUCP>
Date: Tue, 10-Dec-85 22:40:17 EST
Article-I.D.: mhuxd.3699
Posted: Tue Dec 10 22:40:17 1985
Date-Received: Thu, 12-Dec-85 00:34:42 EST
References: <232@ihlpf.UUCP>
Distribution: na
Organization: AT&T Bell Laboratories, Murray Hill
Lines: 19

> I need a way to count unique words in a document.
> Does any one have suggestions on a simple way to do this?

Try:

deroff -w filename | dd conv=lcase 2>/dev/null | sort -u | wc -l

"deroff -w" breaks the file up into single words, one per line.
"dd conv=lcase" converts everything to lower case (so "word" and "Word"
    count as the same thing). ("dd" prints its record counts on stderr,
    so I send that to /dev/null.)
"sort -u" keeps just one copy of each line.
"wc -l" counts the lines.
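
If your system has no deroff, roughly the same four stages can be built
from tr. This is only a sketch under that assumption, not a drop-in
replacement: the function name uniqwords_tr is made up for illustration,
it does not strip troff requests the way deroff does, and it splits words
at apostrophes and digits, so the count can differ from the pipeline above.

```shell
# uniqwords_tr is a hypothetical name; it reads text on standard input.
uniqwords_tr() {
    tr -cs 'A-Za-z' '\n' |   # every run of non-letters becomes one newline
    grep -v '^$' |           # drop the empty line left by leading punctuation
    tr 'A-Z' 'a-z' |         # fold to lower case
    sort -u |                # keep one copy of each word
    wc -l                    # count what is left
}

# Sample text with five distinct words: the, cat, saw, dog, too.
printf 'The cat saw the Cat.\nThe dog saw the cat too.\n' | uniqwords_tr
```

The stages line up one for one with the original: tr -cs stands in for
"deroff -w", and the second tr stands in for "dd conv=lcase".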

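If you're going to run this frequently, stick it in a file, make it
executable, and replace "filename" with $* so you can pass it file names
as arguments. (Leave the quotes off $*, or write "$@" with them; a quoted
"$*" glues all the names into one argument.) Then you're off.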
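
Spelled out, that might look like the following. The script name
uniqwords is my own choice for illustration, and the sketch assumes
deroff is installed. It uses "$@" in double quotes, which hands each
file name to deroff as its own argument even when a name contains
spaces; for ordinary names an unquoted $* does the same job.

```shell
# Write the one-line script to a file called uniqwords (hypothetical name).
cat > uniqwords <<'EOF'
#!/bin/sh
# Count the distinct words, ignoring case, in the files named as arguments.
deroff -w "$@" | dd conv=lcase 2>/dev/null | sort -u | wc -l
EOF

# Make it executable so it can be run like any other command.
chmod +x uniqwords
```

After that, something like "./uniqwords chapter1 chapter2" (file names
invented here) prints one count for the two files taken together.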
-- 
Jan Wolitzky, AT&T Bell Labs, Murray Hill, NJ; 201 582-2998; mhuxd!wolit
(Affiliation given for identification purposes only)