Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!thunder.mcrcim.mcgill.edu!snorkelwacker.mit.edu!usc!zaphod.mps.ohio-state.edu!mips!spool.mu.edu!uunet!mcsun!news.funet.fi!fuug!sics.se!ifi.uio.no!enag
From: enag@ifi.uio.no (Erik Naggum)
Newsgroups: comp.text.sgml
Subject: Re: Short Ref's Reparsed?
Message-ID: <82-7640-006-X/0004@naggum.no>
Date: 19 Jun 91 15:15:36 GMT
References: <448@salt.bellcore.com> <9106172304.AA05241@ucbvax.Berkeley.EDU>
Sender: enag@ifi.uio.no (Erik Naggum)
Followup-To: comp.text.sgml
Organization: Naggum Software, Oslo, Norway
Lines: 36
Nntp-Posting-Host: gyda.ifi.uio.no
In-Reply-To: gtoal@tardis.computer-science.edinburgh.ac.uk's message of 17 Jun 91 22:04:51 GMT
Originator: enag@gyda.ifi.uio.no

gtoal@tardis.computer-science.edinburgh.ac.uk writes:
|
| I've hacked my lex parser to push back the expansion of an entity
| ref onto the incoming text stream, so it is reparsed.  A cheap technique
| but works OK.  Means you can have infinite recursion though if you're
| not careful...

Remember that the only way to get "<" into a document, when followed
by a valid name start character (a-zA-Z), is to define an entity which
expands to "<", e.g. .  If you place an "entity end" signal at the end
of the expansion, it would appear to be right, except in the cases
where the entity references an external entity.

lex doesn't quite cut it.  Better to write an entity manager, and
provide an interface with the following three primitives:

 -- define entity
 -- invoke entity       &name; %name;
 -- read character

After syntax checking, the ENTITY markup declaration could be pushed
to the entity manager's define entity function.  Whenever a
syntactically legal entity reference is parsed or named, the requestor
(parser or application) calls the entity manager's invoke entity
function and reads characters until Entity end occurs.
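
The three primitives described above could be sketched roughly as follows. This is a minimal illustration in Python, not code from the post: the names EntityManager, define_entity, invoke_entity, read_char, ENTITY_END and read_entity are all hypothetical, entities are assumed to be plain in-memory strings (external entities are not handled), and the rule that the first definition of a name wins is an assumption of the sketch.

```python
class _EntityEnd:
    """Sentinel type for the "entity end" signal."""

    def __repr__(self):
        return "ENTITY_END"


ENTITY_END = _EntityEnd()


class EntityManager:
    """Hypothetical entity manager, kept separate from any parser."""

    def __init__(self):
        self._entities = {}  # entity name -> replacement text
        self._open = []      # stack of open entities: [name, text, position]

    def define_entity(self, name, text):
        # Assumption: the first definition of a name wins; later ones are ignored.
        self._entities.setdefault(name, text)

    def invoke_entity(self, name):
        if name not in self._entities:
            raise KeyError("undefined entity: " + name)
        # Refuse to open an entity that is already open, so a careless
        # definition cannot recurse forever.
        if any(frame[0] == name for frame in self._open):
            raise RecursionError("entity references itself: " + name)
        self._open.append([name, self._entities[name], 0])

    def read_char(self):
        # Returns the next character of the innermost open entity,
        # ENTITY_END when that entity is exhausted, or None if nothing is open.
        if not self._open:
            return None
        frame = self._open[-1]
        text, pos = frame[1], frame[2]
        if pos < len(text):
            frame[2] = pos + 1
            return text[pos]
        self._open.pop()
        return ENTITY_END


def read_entity(manager, name):
    """What a requestor does: invoke, then read until entity end."""
    manager.invoke_entity(name)
    chars = []
    while True:
        ch = manager.read_char()
        if ch is ENTITY_END:
            return "".join(chars)
        chars.append(ch)
```

Because the requestor only sees characters and an explicit entity end signal, a parser and an application can share one manager without either knowing how the text is stored. Recognising references inside the returned text is left to the requestor in this sketch.
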

Putting both entity manager and parser in the same control path is
probably a bad idea due to their conceptual independence, and the fact
that even the application may need to reference entities.

--
Erik Naggum             Professional Programmer       +47-2-836-863
Naggum Software         Electronic Text
0118 OSLO, NORWAY       Computer Communications