Path: utzoo!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!usc!sdd.hp.com!spool.mu.edu!news.cs.indiana.edu!ariel.unm.edu!triton.unm.edu!ee5391aa
From: ee5391aa@triton.unm.edu (Duke McMullan n5gax)
Newsgroups: comp.sys.ibm.pc.hardware
Subject: Re: Cheap Scanners and OCR's
Message-ID: <1991Mar18.165909.15857@ariel.unm.edu>
Date: 18 Mar 91 16:59:09 GMT
References: <40269@cup.portal.com>
Distribution: usa
Organization: University of New Mexico, Albuquerque
Lines: 39

In article <40269@cup.portal.com> David_Dave_Tamashiro@cup.portal.com writes:
>Does anyone know if the cheap scanners and OCR's selling for less than
>$200 is any good? (Ex. Marstek 400 dpi scanner w/ OCR ~$170).
>It would be really neat if it could read program listings straight
>from magazines. Am I hoping for too much?? Do these things really
>work?

Dave, they probably _will_ read source straight from magazines. The
problem is, they make errors, and you...yeah, that means YOU, Dave...have
to correct those errors by hand.

I have a Mars 105 scanner, and I expect the OCR (ORC?) software in next
week, so perhapsably I'll post a review here if I'm either delighted or
pissed off.

A _good_ OCR package will include an integral editor which will let you
settle, on the fly, the calls the software decides are iffy. One of
intermediate quality will flag such points with some sort of mark, so you
can search for them with YFeditor.

I don't know how well the stuff I've ordered will work; it's not likely
to be leading-edge stuff. Hopefully, it's not trailing-edge, either. ;^)

There's no question that the scanner reads the text well, especially at
400 dpi. It remains to be seen how well the software decodes it.

Are you hoping for too much? Probably, but then, so am I.

The real nuisance is the things of which the OCR is certain, but wrong.
They don't get flagged. `Course, a compiler has a good chance of catching
(gagging on) a lot of them...but not all.

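For what it's worth, here is a minimal sketch of what "search for the marks" could look like if you'd rather script it than page through an editor. Everything specific in it is an assumption for illustration: the "~" marker, the find_flags name, and the sample listing are made up, and a real OCR package may mark its doubtful spots some other way entirely.

```python
# Sketch only: the "~" marker and the sample listing are assumptions,
# not the behaviour of any particular OCR package.

def find_flags(text, marker="~"):
    """Return (line_number, column, line) for every marker in text, 1-based."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        col = line.find(marker)
        while col != -1:
            hits.append((line_no, col + 1, line))
            col = line.find(marker, col + 1)
    return hits


# Hypothetical OCR output of a magazine listing. Line 20 has the letter O
# where a zero belongs, and carries no marker at all.
sample = "10 PR~NT X\n20 GOTO 1O\n30 E~D"

for line_no, col, line in find_flags(sample):
    print(f"line {line_no}, col {col}: {line}")
```

Note that line 20 of the sample never shows up in the results: the made-up OCR was sure of itself there, so nothing was flagged. That's the case you still have to catch by eye, or hope the compiler catches for you.
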
I suspect, down the line, magazines may (I say _may_) start printing
source in a typeface which is somewhat optimized for OCR. OTOH, they
might decide to use a typeface optimized for OCR _failure_, so you have
to buy their disks. It's a very young market; we'll have to wait to see
what falls out of it.

                                                d