Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!uunet!snorkelwacker!mit-eddie!uw-beaver!ssc-vax!bcsaic!paula
From: paula@bcsaic.UUCP (Paul Allen)
Newsgroups: comp.sys.ibm.pc
Subject: cheap text scanners
Keywords: 400dpi ocr scanner text scanman
Message-ID: <25780@bcsaic.UUCP>
Date: 5 Jun 90 02:37:49 GMT
Distribution: na
Organization: Boeing Computer Services ATC, Seattle
Lines: 48

I've been looking for an inexpensive way to get stock and mutual fund
data from the newspaper into my computer. A hand-scanner/OCR software
package recently went on sale at our local Egghead Software store, so
I rushed on down to pick one up. The scanner was the Logitech Scanman
Plus. The OCR software was also from Logitech and included "Omni-Font
technology" said to enable it to read any font.

To cut to the chase, the guys at Egghead told me that it wasn't
possible to reliably scan newspaper text with a 400dpi scanner and
that I'd probably be unhappy with the results. I ended up leaving the
store empty-handed. (But let's hear it for the up-front candor of the
store personnel, hey? I was pre-sold, and they talked me out of it!
:-) )

Now, I realize that hand scanners are primarily used for images,
rather than text. But I've been hearing about character recognition
software for a couple years now. Surely it hasn't all been hype?
(The Egghead guy pointed to the 20-point type on the scanner box and
said it would read that just fine!)

A back-of-the-envelope calculation shows that a 7-point numeral
scanned at 400dpi will stand ~39 pixels high. That would seem like
adequate resolution for quite accurate recognition. Lower-case alpha
would be a bit harder at ~20 vertical pixels. Seems within the realm
of possibility to me, but I haven't tried to actually write OCR
software. :-)

Is anybody out there actually scanning newspaper text with a
hand-scanner? How about with a flat-bed scanner? What software are
you using? Which scanner? What sort of error rate do you consider
acceptable?
Do you have trouble with specks of dust being recognized as
characters? If I tried to scan a full printed page out of the
Investor's Daily, how many days of compute time would it take on a
386/25? :-)

If you have some useful intelligence to share, please reply by email
to the address below. While I would love to be able to read
comp.sys.ibm.pc regularly, I simply lack the time. Since this is
probably of interest to more than just myself, I'll summarize
whatever I learn.

Thanks,

Paul Allen

--
------------------------------------------------------------------------
Paul L. Allen                       | pallen@atc.boeing.com
Boeing Advanced Technology Center   | ...!uw-beaver!bcsaic!pallen
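
For anyone who wants to redo the back-of-the-envelope arithmetic in the
post, here is a minimal sketch. It assumes 72 points to the inch, and it
assumes (as the post's figures imply) that a numeral fills the full
7-point body while a lower-case letter is about half that height. Both
fractions are rough guesses for illustration, not measurements of any
newspaper face, and the function name is made up here. Real glyphs are
usually somewhat shorter than the nominal point size, so treat the
results as upper bounds.

```python
# Rough pixel-height estimate for type scanned at a given resolution.
# Assumption: 1 point = 1/72 inch.
POINTS_PER_INCH = 72


def pixels_high(point_size, dpi, fraction=1.0):
    """Return the approximate height in pixels of a glyph.

    fraction is the share of the nominal point size the glyph occupies
    (1.0 = the full body, 0.5 = a guessed lower-case height).
    """
    return point_size / POINTS_PER_INCH * dpi * fraction


if __name__ == "__main__":
    numeral = pixels_high(7, 400)                    # full 7-point body
    lower_case = pixels_high(7, 400, fraction=0.5)   # guessed half height
    print(f"7-point numeral at 400dpi:    about {numeral:.1f} pixels")
    print(f"7-point lower case at 400dpi: about {lower_case:.1f} pixels")
```

Changing the dpi argument shows how quickly the pixel budget shrinks at
lower scanner resolutions.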