Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!sundc!pitstop!sun!decwrl!pyramid!voder!apple!korn
From: korn@apple.UUCP (Peter "Arrgh" Korn)
Newsgroups: comp.ai,comp.sys.mac,comp.misc
Subject: Re: Character recognition
Message-ID: <6549@apple.UUCP>
Date: Sun, 25-Oct-87 22:07:58 EST
Article-I.D.: apple.6549
Posted: Sun Oct 25 22:07:58 1987
Date-Received: Wed, 28-Oct-87 00:48:34 EST
References: <641@zen.UUCP> <2984@phri.UUCP> <21433@ucbvax.BERKELEY.EDU>
Reply-To: korn@apple.UUCP (Peter "Arrgh" Korn)
Followup-To: comp.misc
Organization: Apple Computer
Lines: 70
Xref: mnetor comp.ai:986 comp.sys.mac:8739 comp.misc:1529
Disclaimer: I wasn't hired to give Apple's opinions; lawyers were

In <21433@ucbvax.BERKELEY.EDU>, oster@dewey.soe.berkeley.edu.UUCP
(David Phillip Oster) said:
>>In article <641@zen.UUCP> vic@zen.UUCP (Victor Gavin) writes:
>>> from a scanner image reproduce the original text of the paper in a
>>> machine readable form.
>
>...[discussion of the ThunderScan scanner]...
>
>The expensive scanners are flat bed, copier style machines, and do
>their work faster (can't be too much faster, though. It takes
>15 minutes to send an 8"x10" page at 1-bit per pixel 300dpi, over a
>9600 baud line if you do not use a compressing transfer protocol.)

That holds only if you assume that 9600 baud is the fastest rate at
which they transmit data. The Macintosh can accept data over its
serial port at a rate quite a bit faster than that (56K baud easily,
and AppleTalk is another 8 times faster than that). Also, most of the
newer 'professional' scanners use the SCSI port, which can get you a
full page scanned and transmitted to the Mac's RAM, displayed on the
screen and eagerly awaiting the deftest commands of the user, in as
little as 14 seconds (and perhaps even a second or two faster than
that).

>Olduvai Software makes a line of software that parses scanned pages
>back into text.
>Either the current issue of MacUser has a review, or I
>saw it in a recent copy of MacWeek, but for < $200.00 you get a
>software package to do syntactic pattern recognition of letter
>features, to determine the ASCII for the scanned page.

Unfortunately, their advertisements seemed to be a little ahead of
their ability to deliver when I spoke with them about a month ago. I
recall their saying something about it being at least Christmas before
they would actually be shipping product--don't quote me on this last
one, as the conversation happened fully 30 days ago. Nonetheless,
after at least two months of advertising in MacUser, their product
wasn't anywhere near shipping when I called them.

>It is still cheaper to hire a human typist, but soon the cost balance
>will flip the other way.

I hope this happens soon. However, from my experience with character
recognition, it won't happen for a little while yet. *If* all that you
are scanning is 10 or 12 pitch mono-spaced Courier, Letter Gothic, or
one of a small set of other fonts, then computer character recognition
is a viable option for you that may well save you a lot of $$ vs.
paying a typist to do it. However, to my knowledge, there exists no
scanner anywhere that can properly deal with all types of
proportionally spaced fonts at anything near acceptable accuracy
(remember that 99.5% accuracy works out to 3 errors every typewritten
page), let alone handle typeset text that is kerned (such as you find
in the newspapers and books that you read).

Having spent the better part of 6 months selling these beasties, and
going to school at a university that had one of the more expensive
Kurzweil machines, I've become somewhat jaded by their promise. They
seem to be much like expert systems--very good in a tightly controlled
environment, but not very good beyond that.

>...
>
>(note, I've directed followups to just comp.misc. If people want to continue
>this discussion, they can read it there.)

Normally I would have respected this, and I have redirected all
followups to this posting to comp.misc, but I felt that there's been
enough interest, at least in comp.sys.mac, to correct some of the
statements made about scanning speed and character recognition
software in the forum in which they were made.

Peter
--
Peter "Arrgh" Korn
korn@apple.com  !hplabs!amdahl!apple!korn
"hi mom!"
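
For anyone who wants to check the arithmetic behind the transfer-time
and accuracy figures in this thread, here is a minimal Python sketch.
Several things in it are assumptions rather than figures from the
posts: 10 bits on the wire per byte (8 data bits plus start and stop
bits), "56K baud" read literally as 56,000 bits per second, and 600
characters on a page. The function names are made up for illustration.

```python
def transfer_seconds(width_in, height_in, dpi, bits_per_pixel, baud,
                     wire_bits_per_byte=10):
    """Seconds to send an uncompressed scan over a serial line.

    wire_bits_per_byte=10 is an assumption: 8 data bits plus one start
    and one stop bit, at one bit per baud.
    """
    pixels = (width_in * dpi) * (height_in * dpi)
    payload_bytes = pixels * bits_per_pixel / 8
    return payload_bytes * wire_bits_per_byte / baud


def expected_errors(chars_per_page, accuracy):
    """Expected number of misread characters on one page."""
    return chars_per_page * (1.0 - accuracy)


if __name__ == "__main__":
    # 8"x10" page, 300 dpi, 1 bit per pixel, at three line speeds.
    # 56_000 and 8 * 56_000 are literal readings of "56K baud" and
    # "another 8 times faster"; both are assumptions.
    for label, baud in [("9600 baud", 9600),
                        ("56K baud", 56_000),
                        ("8 x 56K", 8 * 56_000)]:
        secs = transfer_seconds(8, 10, 300, 1, baud)
        print(f"{label:>10}: {secs:7.1f} s ({secs / 60:.1f} min)")

    # 600 characters per page is an assumed page size.
    print(f"{expected_errors(600, 0.995):.1f} errors per 600-character page")
```

Under these assumptions the 9600 baud case comes to 937.5 seconds,
which is close to the 15 minutes quoted. The figure of 3 errors per
page at 99.5% accuracy corresponds to a page of about 600 characters;
a denser page would have proportionally more.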