Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!snorkelwacker.mit.edu!bloom-beacon!eru!hagbard!sunic!sics.se!ifi.uio.no!enag
From: erik@naggum.no (Erik Naggum)
Newsgroups: comp.text.sgml
Subject: Re: Short Ref's Reparsed?
Message-ID: <82-7640-006-X/0003@naggum.no>
Date: 19 Jun 91 14:36:47 GMT
References: <448@salt.bellcore.com>
Sender: enag@ifi.uio.no (Erik Naggum)
Reply-To: Erik Naggum
Organization: Naggum Software, Oslo, Norway
Lines: 76
Nntp-Posting-Host: gyda.ifi.uio.no
In-Reply-To: jxr@thumper.bellcore.com's message of 17 Jun 91 02:14:40 GMT
Originator: enag@gyda.ifi.uio.no

Jonathan Rosenberg writes:
|
|   If you specify a short reference map, e.g.,
|
|
|
|
|   what happens to the modified text after the substitution is made?

Please note a conceptual difference between "substitution" and entity
references. When the parser sees an entity reference, it invokes an
alternate input source from the entity manager, which then delivers
characters from this source until it ends and then sends an Entity end
signal. After that, the entity manager continues to deliver characters
from the entity which contained the entity reference.

Short references are mapped to entity references, but not by way of
rescanning the input stream. If, however, a document with short
references is sent to a system which cannot handle short references,
they are replaced by the corresponding general entity references.

Your example does not say what you intended. The short reference
mapping declaration refers to an entity "a", but you have defined
entity "aa".

If you meant and aaa this would map the first, longest occurrence of a
short reference delimiter ("aa") to a general entity reference to
entity "aa", and the letter "a" and then the Ee signal would be parsed,
as read from the entity manager. The parser then sees the following
"a", but the Ee has intervened, so no recursive mapping can take place.

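The entity-manager model described above can be sketched in a few lines
of Python. This is only an illustration, not how any particular SGML
parser is written: the names EntityManager, parse, EE and MAX_DEPTH are
made up for the example, and the entity and map contents are passed in
as plain dictionaries. The sketch also assumes that short references
are recognized inside entity text, which is exactly the point left open
in the discussion of the potentially recursive mapping, so a depth
guard stands in for whatever the standard actually prescribes.

```python
EE = object()      # stands in for the "Entity end" (Ee) signal
MAX_DEPTH = 16     # hypothetical guard against runaway recursive mappings


class EntityManager:
    """Delivers characters from a stack of input sources."""

    def __init__(self, document_text, entities):
        self.entities = entities
        self.stack = [[document_text, 0]]   # each entry: [text, read position]

    def open_entity(self, name):
        """Switch to an alternate input source; nothing is rescanned."""
        if len(self.stack) >= MAX_DEPTH:
            raise RuntimeError("entity nesting too deep")
        self.stack.append([self.entities[name], 0])

    def peek(self, n):
        """Look ahead up to n characters, in the current source only."""
        text, pos = self.stack[-1]
        return text[pos:pos + n]

    def read(self):
        """Next character, EE when a referenced entity ends, None at the end."""
        text, pos = self.stack[-1]
        if pos < len(text):
            self.stack[-1][1] = pos + 1
            return text[pos]
        if len(self.stack) > 1:
            self.stack.pop()                # resume the containing entity
            return EE
        return None


def parse(document_text, entities, shortref_map):
    """Return the sequence of events the parser sees."""
    manager = EntityManager(document_text, entities)
    # Try longer delimiters first, so the longest match wins.
    delimiters = sorted(shortref_map, key=len, reverse=True)
    events = []
    while True:
        for delim in delimiters:
            if manager.peek(len(delim)) == delim:
                for _ in delim:
                    manager.read()          # consume the delimiter
                name = shortref_map[delim]
                manager.open_entity(name)
                events.append(("entity", name))
                break
        else:
            ch = manager.read()
            if ch is None:
                break
            events.append("Ee" if ch is EE else ch)
    return events


# Data "aaa", delimiter "aa" mapped to an entity named "aa" containing "a".
print(parse("aaa", {"aa": "a"}, {"aa": "aa"}))
```

Because peek() never looks past the end of the current source, the "a"
delivered from the entity and the "a" that follows in the data are
never seen side by side: the Ee always comes between them.
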
If you meant and aaa the first "aa" would map to the entity a, "aa"
would be parsed, and at this point there could be (1) an infinite
recursion, or (2) the entity could be read as "aa" by itself. Then,
there would be an Entity end, which intervenes between the second "a"
of the entity and the third "a" in the data, so no mapping would take
place.

I don't know for certain whether case (1) or (2) is the right one. If
the standard provides that a short reference map applies only within
the entity containing the element for which the map is declared to be
used, or invoked, this would indicate case (2). If there is no such
provision, case (1) applies, unless there are other provisions which
preclude recursive mappings. Once again, I will have to spend a few
hours with the Handbook to answer this question.

I would tend to think that defining such a potentially recursive
mapping is slightly silly, unless it is done for the sole purpose of
showing that a vendor's parser is broken. :-)

The difference between "substitution" and "entity reference" is very
important. I spent a lot of time fighting this one, until Charles
Goldfarb showed me the right direction. An entity reference is really
a request to the entity manager for input from an alternate source of
input (his terms, very accurate), not a substitution. The parser only
reads a stream of characters from the entity manager, and does not
rescan previously read input. The entity manager does not have such a
capability in the first place.

Hope this helps.

--
Erik Naggum             Professional Programmer            +47-2-836-863
Naggum Software             Electronic Text
0118 OSLO, NORWAY       Computer Communications