Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!neat.cs.toronto.edu!marina
From: marina@ai.toronto.edu (Marina Haloulos)
Newsgroups: ont.events
Subject: Dr. Yasushi Nishimura, Tuesday 19 September 1989: SYSTEMS SEMINAR
Message-ID: <89Sep15.135113edt.2200@neat.cs.toronto.edu>
Date: 15 Sep 89 17:52:06 GMT
Lines: 46


                            FLASH ANNOUNCEMENT
             (GB = Galbraith Building, 35 St. George Street)

       -------------------------------------------------------------

                              SYSTEMS SEMINAR
              GB244, at 2:00 p.m., Tuesday 19 September 1989

                           Dr. Yasushi Nishimura
Artificial Intelligence Department, ATR Communication Systems Research Lab., Japan

                          Document Image Analysis

   We propose a document image analysis method that can extract the
logical structure of scanned paper documents to obtain indices such as
titles, author names, etc.  The aim of this research is to develop a
building block for next-generation page readers, which will be able to
capture not only the character codes, but also the layout and
logical structure of documents.

A major problem in document image analysis is segmenting document
images to extract the components (indices).  We implement the
segmentation process as a top-down, model-driven process that matches
a model against the input image.  For this purpose, we introduce a tree
structure model to represent the layout of each document type (such as
title pages of IEEE Trans. papers).  The model also describes elements
of the page such as the body and running heads.  The body is further
divided into the text and other components.

Using the model in a top-down fashion, first the running heads and
running feet are extracted.  Next, the components other than the text are
extracted.  The model describes these components in the way they are
segmented, so that the segmentation result of the input image matches
that of the model.  The text, the part where uniformity is frequently
degraded, is extracted as the remaining part of the page.
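
As a rough illustration of the top-down order described above, here is a
minimal sketch in Python.  It is not the speaker's implementation: the
names (Node, segment), the fixed height fractions, and the page size are
all hypothetical, and the sketch carves regions by fixed proportions
instead of matching the model against image content.  It only shows the
control flow: named components are taken first, and the text is whatever
remains.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), hypothetical pixel units


@dataclass
class Node:
    """One node of a hypothetical layout tree for a document type."""
    name: str
    # Assumed convention: a positive share is cut from the top of the parent,
    # a negative share from the bottom, and None means "the remaining part".
    share: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


def segment(node: Node, box: Box, out: Optional[Dict[str, Box]] = None) -> Dict[str, Box]:
    """Walk the tree top-down, assigning a box to every node."""
    if out is None:
        out = {}
    out[node.name] = box
    top, left, bottom, right = box
    height = bottom - top
    remainder = None
    for child in node.children:
        if child.share is None:
            remainder = child          # handled last, after the named parts
            continue
        cut = round(abs(child.share) * height)
        if child.share > 0:            # carve from the top edge
            segment(child, (top, left, top + cut, right), out)
            top += cut
        else:                          # carve from the bottom edge
            segment(child, (bottom - cut, left, bottom, right), out)
            bottom -= cut
    if remainder is not None:          # the text is whatever is left over
        segment(remainder, (top, left, bottom, right), out)
    return out


# Hypothetical model of a title page: heads and feet first, then the body,
# whose non-text components are taken before the text.
page = Node("page", children=[
    Node("running_head", 0.05),
    Node("running_foot", -0.05),
    Node("body", None, [
        Node("title", 0.2),
        Node("authors", 0.1),
        Node("text", None),
    ]),
])

regions = segment(page, (0, 0, 1000, 800))
for name, region in regions.items():
    print(name, region)
```

In a real system each cut would presumably be found by examining the
image (for example, white-space gaps) under the constraints of the
model; the fixed fractions here merely stand in for that step.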

We also introduce a model building process which can be operated by a
novice user.  In an experiment using 115 input documents from 38 types of
scientific paper title pages, every index in 85.2% of the input
documents was correctly extracted.  In a comparison experiment using a
bottom-up segmentation process, the correct extraction rate was only 9.1%.

Applications of the method include document entry without re-keying
for electronic publishing systems, document retrieval, and automatic
indexing of scanned documents for document database systems.