Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!neat.cs.toronto.edu!marina
From: marina@ai.toronto.edu (Marina Haloulos)
Newsgroups: ont.events
Subject: Dr. Yasushi Nishimura, Tuesday 19 September 1989: SYSTEMS SEMINAR
Message-ID: <89Sep15.135113edt.2200@neat.cs.toronto.edu>
Date: 15 Sep 89 17:52:06 GMT
Lines: 46

                        FLASH ANNOUNCEMENT

        (GB = Galbraith Building, 35 St. George Street)

-------------------------------------------------------------

                         SYSTEMS SEMINAR

          GB244, at 2:00 p.m., Tuesday 19 September 1989

                      Dr. Yasushi Nishimura
               Artificial Intelligence Department,
          ATR Communication Systems Research Lab., Japan

                     Document Image Analysis

We propose a document image analysis method that can extract the
logical structure of scanned paper documents to obtain indices such as
titles, author names, etc. The aim of this research is to develop a
building block for the next-generation page readers, which will be able
to capture not only the character codes, but also the layout and
logical structure of documents.

A major problem in document image analysis is segmenting document
images to extract the components (indices). We implement segmentation
as a top-down, model-driven process that matches the model against the
input image. For this purpose, we introduce a tree structure model to
represent the layout of each document type (such as title pages of
IEEE Trans. papers). The model also describes elements of the page
such as the body and running heads. The body is further divided into
the text and other components.

Using the model in a top-down fashion, first the running heads and
running foots are extracted. Next, the components other than the text
are extracted. The model describes these components in the way they
are segmented, so the segmentation result of the input image matches
that of the model. The text, the part where uniformity is frequently
degraded, is extracted as the remaining part of the page.
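
As a rough illustration of the top-down order described above (running
heads and foots first, then the non-text components, with the text
taken as whatever remains), here is a minimal Python sketch. It is not
the speaker's implementation: the abstract does not say how the model
or the image is represented, so the Node class, the ink_runs and
segment helpers, the PAGE_MODEL tree, and the reduction of the page to
a one-dimensional row profile (1 = row contains ink, 0 = blank row) are
all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a hypothetical layout-model tree (assumed structure)."""
    name: str
    anchor: str = "top"                       # cut from the "top" or "bottom" of the parent
    cut: List["Node"] = field(default_factory=list)  # components cut off, in model order
    rest: Optional["Node"] = None             # model for the region left after the cuts


def ink_runs(profile, lo, hi):
    """Return (start, end) ranges of consecutive inked rows within [lo, hi)."""
    runs, start = [], None
    for y in range(lo, hi):
        if profile[y] and start is None:
            start = y
        elif not profile[y] and start is not None:
            runs.append((start, y))
            start = None
    if start is not None:
        runs.append((start, hi))
    return runs


def segment(node, profile, lo, hi, out):
    """Match the model against rows [lo, hi), working from the root down."""
    runs = ink_runs(profile, lo, hi)
    if not runs:
        return out
    lo, hi = runs[0][0], runs[-1][1]          # trim blank margins
    out[node.name] = (lo, hi)
    for child in node.cut:                    # cut components in the order the model lists them
        runs = ink_runs(profile, lo, hi)
        if not runs:
            return out
        if child.anchor == "top":
            s, e = runs[0]
            lo = e
        else:
            s, e = runs[-1]
            hi = s
        out[child.name] = (s, e)
    if node.rest is not None:                 # whatever remains is handled by the rest model
        segment(node.rest, profile, lo, hi, out)
    return out


# Assumed model of a title page: head and foot are cut from the page,
# title and authors from the body, and the text is the remainder.
PAGE_MODEL = Node(
    "page",
    cut=[Node("running_head", "top"), Node("running_foot", "bottom")],
    rest=Node(
        "body",
        cut=[Node("title", "top"), Node("authors", "top")],
        rest=Node("text"),
    ),
)

# Toy row profile; the text block (rows 8 to 11) has an internal blank row.
profile = [0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0]
regions = segment(PAGE_MODEL, profile, 0, len(profile), {})
print(regions)
```

In this toy run the text region comes out as rows 8 to 12 even though
it contains a blank row, because it is never matched directly: it is
simply what is left once the other components have been cut away. That
is the point the abstract makes about treating the least uniform part
of the page as the remainder.
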
We also introduce a model building process which can be operated by a
novice user. In an experiment using 115 input documents from 38 types
of scientific paper title pages, every index was correctly extracted
in 85.2% of the input documents. In a comparison experiment using a
bottom-up segmentation process, the correct extraction rate was only
9.1%. Applications of the method include document entry without
re-keying for electronic publishing systems, and document retrieval
and automatic indexing of scanned documents for document database
systems.