Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!cica!iuvax!bsu-cs!bsu-ucs!00lhramer
From: 00lhramer@bsu-ucs.uucp (Leslie Ramer)
Newsgroups: comp.ai
Subject: Optical Character Recognition, how? (Curious)
Message-ID: <50761@bsu-ucs.uucp>
Date: 2 Dec 90 20:33:45 GMT
Organization: Ball State University, Muncie, In - Univ. Computing Svc's
Lines: 64

I was curious about how OCR scanners translate a bitmapped image into ASCII
codes, and I've put together some thoughts on the methodology that could be
used to accomplish this.

The problem, according to my reasoning, follows a kind of flow:

1. Some form of logical reduction needs to be made to make the problem more
   tractable.

2. Some form of mathematical reduction needs to be done, producing a list of
   statements or data.

3. A set of rules or a functional relationship of some kind must exist to
   make sense of that data.

These are the elements of the "HOW" of my solution. Here's how I imagine an
OCR would apply them:

1 -> A bitmapped image is bound to have hundreds, if not thousands, of dots.
     It is quite obvious that not all of these dots are necessary for the
     recognition of the characters. Would some kind of pattern-reduction
     method lead toward a successful reduction of the problem, and is one
     actually used?

2 -> Once the image has been reduced a little, some mathematics is needed to
     turn it into another data form. I would imagine that a set of vectors
     is generated that best approximates the image. Once that has been done,
     ordinary vector mathematics could be used to recognize the characters
     in virtually any rotation.

3 -> The mathematical data created in 2 needs to be interpreted. Some
     reference vectors (dot products, lengths, approximate locations, etc.)
     would be matched against some sort of "FUNCTION" table, and the
     approximate ASCII characters output, maybe even with a probability that
     each character is the correct one in case a single match can't be made.
     (A rough sketch of what I mean is appended below.)

This method is still very undeveloped in my mind, and I realize the problem
contains further sub-problems: e.g., what is the difference between a
lowercase L and the number 1? Either context or some direct difference in
shape needs to be noted.

Again, this is an interesting problem to me. It's fascinating that a machine
can read as well as, if not faster than, I do. (The blind reader in the
library reads, through a speech synthesizer, at up to 425 wpm; I don't
normally read anywhere near that.) It would seem that a great number of
operations must be performed to recognize the characters no matter what
typeface is used.

I like to see things get done, and get done fast. (I'm very different from
the machines in that sense; I don't always think fast.) :-)

Thanks in advance,

====
"No one runs so fast as he that is chased."
 ._  |  .-. .-.  | \ .-, .-.-. .-. .-.     00LHRAMER@bsu-ucs.bsu.edu
 | +-'  `-, |_/  .-+ | | | +-'  |          00LHRAMER@bsu-ucs.UUCP
 |___   `-' `-'  | \ `.| | | |  `-'  |     00LHRAMER@bsuvax1.bitnet
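
For what it's worth, here is a rough C sketch of the kind of thing I have in
mind for steps 2 and 3: boil each glyph down to a small feature vector (a
crude stand-in for the vector idea above), then look that vector up in a
table of reference vectors by distance. Everything in it is made up for the
illustration; the 4x4 "ink density" features, the two-entry reference table,
and the names extract_features() and match_character() are assumptions, not
how any real OCR works.

/* Reduce a glyph to a feature vector (step 2), then match it against a
 * table of reference vectors by Euclidean distance (step 3).  All of the
 * features, names, and reference values are invented for illustration. */

#include <stdio.h>
#include <math.h>

#define FEATS 16                   /* 4x4 grid of "ink density" features */
#define GLYPH 16                   /* toy glyphs are 16x16 bitmaps       */

struct refchar {
    char   ascii;                  /* character this vector stands for   */
    double feat[FEATS];            /* its reference feature vector       */
};

/* Step 2: reduce a 16x16 bitmap to 16 numbers, the fraction of black
 * pixels in each 4x4 cell.  Most of the dots are thrown away while the
 * overall shape is kept. */
static void extract_features(const unsigned char bitmap[GLYPH][GLYPH],
                             double feat[FEATS])
{
    int cx, cy, x, y;
    for (cy = 0; cy < 4; cy++)
        for (cx = 0; cx < 4; cx++) {
            int ink = 0;
            for (y = 0; y < 4; y++)
                for (x = 0; x < 4; x++)
                    ink += bitmap[cy * 4 + y][cx * 4 + x] ? 1 : 0;
            feat[cy * 4 + cx] = ink / 16.0;
        }
}

/* Step 3: find the nearest reference vector.  Returning the distance as
 * well gives the caller something like a confidence, so context can be
 * used to settle cases such as lowercase L versus the digit 1. */
static char match_character(const double feat[FEATS],
                            const struct refchar *table, int ntable,
                            double *best_dist)
{
    int i, j, best = 0;
    double bd = 1e30;
    for (i = 0; i < ntable; i++) {
        double d = 0.0;
        for (j = 0; j < FEATS; j++) {
            double diff = feat[j] - table[i].feat[j];
            d += diff * diff;
        }
        if (d < bd) { bd = d; best = i; }
    }
    *best_dist = sqrt(bd);
    return table[best].ascii;
}

int main(void)
{
    /* Two invented reference vectors; a real table would be built by
     * running extract_features() over known samples of each typeface. */
    struct refchar table[] = {
        { 'l', { 0,0.5,0.5,0, 0,0.5,0.5,0, 0,0.5,0.5,0, 0,0.5,0.5,0 } },
        { 'o', { 0.4,0.6,0.6,0.4, 0.6,0,0,0.6, 0.6,0,0,0.6, 0.4,0.6,0.6,0.4 } },
    };
    unsigned char glyph[GLYPH][GLYPH] = { {0} };  /* pretend this was scanned */
    double feat[FEATS], dist;
    char guess;
    int x, y;

    for (y = 0; y < GLYPH; y++)        /* draw a crude vertical bar */
        for (x = 6; x < 10; x++)
            glyph[y][x] = 1;

    extract_features(glyph, feat);
    guess = match_character(feat, table, 2, &dist);
    printf("best guess: '%c' (distance %.3f)\n", guess, dist);
    return 0;
}

Counting ink per cell is obviously not rotation-invariant the way dot
products and lengths of stroke vectors would be; it is only there to show
the reduce-then-match flow, and to show where a "probability" per character
could come from (the distances themselves).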