Path: utzoo!attcan!uunet!husc6!bloom-beacon!gatech!purdue!tut.cis.ohio-state.edu!ucbvax!hplabs!hp-pcd!hplsla!davidr
From: davidr@hplsla.HP.COM (David M. Reed)
Newsgroups: comp.sys.ibm.pc
Subject: Re: Scanning, Optical Character Recognition
Message-ID: <5190013@hplsla.HP.COM>
Date: 18 May 89 22:27:08 GMT
References: <5012@pt.cs.cmu.edu>
Organization: HP Lake Stevens, WA
Lines: 14

I have not yet had a chance to work with an expensive system (such as the kind
that comes with a dedicated card and costs $3000, like TrueScan), but I have been
able to use some inexpensive (<$500) software-based OCR programs.  From that
I would say that my number 1 criterion is accuracy greater than 99%, and my
second is the ability to read kerned print (preferably with the program
prompting for a translation when it comes across an unrecognized
character such as a double-f ligature).  Most of the inexpensive programs seem to be
limited to typewritten and dot-matrix fixed-space print, thus eliminating
what seems to be 98% of what I want to copy (books, magazines, newspapers,
LaserJet proportional font output, legal documents, etc.).  And a 99% accuracy
rate will still require you to carefully read what has been translated from
image to character, for that means that on average 1 letter out of 100 is
incorrect.  I frequently type 70+ characters per line, so that means
I will probably have about 2 incorrect characters in every three lines of text!
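
To make that arithmetic concrete, here is a rough sketch in Python.  It
assumes every character is misread independently with the same probability,
which is a simplification of mine and not a property of any particular OCR
program; the function names are made up for illustration.

```python
# Rough estimate of OCR errors at a given per-character accuracy.
# Assumption: each character is misread independently with the same
# probability (1 - accuracy). Real OCR errors tend to cluster, so treat
# these numbers as a ballpark only.

def expected_errors(accuracy, chars_per_line, lines):
    """Average number of misread characters in the given amount of text."""
    total_chars = chars_per_line * lines
    return (1.0 - accuracy) * total_chars

def prob_at_least_one_error(accuracy, chars_per_line, lines):
    """Chance that the text contains one or more misread characters."""
    total_chars = chars_per_line * lines
    return 1.0 - accuracy ** total_chars

if __name__ == "__main__":
    # 99% accuracy, 70 characters per line, three lines of text
    print(expected_errors(0.99, 70, 3))
    print(prob_at_least_one_error(0.99, 70, 3))
```

With 99% accuracy and three 70-character lines (210 characters), the expected
count comes out to about 2.1 misread characters, which matches the estimate
above.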