Xref: utzoo news.groups:22366 comp.text:7002
Path: utzoo!mnetor!geac!torsqnt!news-server.csri.toronto.edu!math.lsa.umich.edu!math.lsa.umich.edu!emv
From: emv@math.lsa.umich.edu (Edward Vielmetti)
Newsgroups: news.groups,comp.text
Subject: Re: call for discussion: comp.text.sgml
Message-ID:
Date: 11 Jul 90 06:57:25 GMT
References: <2498@loria.crin.fr>
Sender: usenet@math.lsa.umich.edu
Followup-To: news.groups,comp.text
Organization: University of Michigan Math Dept., Ann Arbor MI.
Lines: 45
In-Reply-To: chenevoy@loria.crin.fr's message of 7 Jul 90 07:20:01 GMT

In article <2498@loria.crin.fr> chenevoy@loria.crin.fr (CHENEVOY) writes:

   I am also interested in document structures. We are developing
   here a knowledge-based system for structured document recognition.
   The system should deal with models of documents represented with
   standard norms.

Can you give a concrete example? The picture that I am envisioning is
the problem of recapturing the information in, say, a railway timetable,
or marks on a printed form, or distinguishing amongst a set of similar
printed forms.

   The problems of document recognition are not quite the same as
   those of document composition: the physical structure is more
   important, because we have the layout as an input. Therefore, SGML
   is not necessarily the best standard for us. We are also interested
   in ODA, ODL and all aspects of document structure.

I would hope that whatever group came about would have a wide enough
charter that no one would be obligated to assert that SGML was the best
standard and should be applied to every problem. For whatever reason,
it appears to be the underpinning, or at least the style, in which
several packages I am familiar with are organized.

Are ODA and ODL available from standards organizations? What is special
about them that makes them more suited for your task?

   > comp.text.sgml. seems reasonable.

   Why not comp.text.struct, or better, comp.doc.struct?

I suppose I do show a bias here -- the class of problems that interest
me, or that I see myself facing, include texts which on the surface do
not appear to have much structure at all. Part of the challenge will be
marking them so that there is some information recovered, or so that a
human going over them later with a browser can make sense of them.

SGML is a convenient enough tag-word to latch on to a proper set of
interested people; other ways of describing the group, though perhaps
more appropriate for a given task, don't seem to be likely to draw the
proper set of people together.

--Ed

Edward Vielmetti, U of Michigan math dept
comp.archives moderator