Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!mcsun!news.funet.fi!fuug!sics.se!ifi.uio.no!enag
From: erik@naggum.no (Erik Naggum)
Newsgroups: comp.text.sgml
Subject: Re: Short Ref's Reparsed?
Message-ID: <82-7640-006-X/0008@naggum.no>
Date: 22 Jun 91 13:40:50 GMT
References: <464@salt.bellcore.com>
Sender: enag@ifi.uio.no (Erik Naggum)
Reply-To: Erik Naggum
Organization: Naggum Software, Oslo, Norway
Lines: 102
Nntp-Posting-Host: gyda.ifi.uio.no
In-Reply-To: jxr@thumper.bellcore.com's message of 19 Jun 91 20:01:30 GMT
Originator: enag@gyda.ifi.uio.no

I have consulted The SGML Handbook, and received a note from Goldfarb
himself, both of which clearly conclude that in your example case, you
would get infinite recursion.  There is no specific provision in the
standard to preclude recursion.  If, however, you really wish the
replacement text to contain the short reference delimiter(s), you can
specify the entity to be character data, as in

Jonathan Rosenberg writes:
|
| Ok. but, what happens in the following case:
|
|
|
|
|
| and
|     aaa
| ???

1.  "aaa" maps to "&aaa;"
2.  "&aaa;" produces "aa"
3.  "aa" maps to "&aa;"
4.  "&aa;" produces "a"

so this becomes "a", only.

| I think that I found a clause in the standard that outlaws
| recursive applications of short reference maps in any case.
| Section 9.4.6.1 (page 354 of the Handbook) says (in part):
|
|     "A short reference can be removed from a document by replacing
|     it with an equivalent reference string that contains a named
|     entity reference.  The entity name must be that to which the
|     short reference is mapped in the current map."
|
| This says clearly to me that given the above
|     aaa
| is equivalent to
|     &aaa;Ee
           ^^-- not here
| which will (eventually) become
|     aa
      ^-- but here
| and not
|     a

I agree that this looks possible, but the entity referenced (aaa) also
needs to map occurrences of short reference delimiters ("aa") to
entity references (aa) inside it, since it would have been parsed like
this had the short references been used.

I haven't found any way to inhibit short reference delimiter
recognition inside an entity which was referenced by a short reference
instead of a general reference, but it seems to be quite necessary.  A
corollary is that translation from short references to entity
references may have to modify the contents of entities referred to in
the replacement text of the entity referenced by the short reference,
depending on whether that replacement text contains short reference
delimiters.  That demands multiple versions of the same entity,
according to context.  This seems to defeat the purpose of the simple
translation, and I can readily foresee problems in this regard.

I think benefits may be reaped from conservative use of short reference
delimiters in entities thus referenced, or from making them data
entities with CDATA or SDATA.  This is true not only for your
particular questions, but in general, since it may not be obvious when
declaring an entity what its short reference delimiters will map to
at the point where the entity is referenced.  E.g.,

    ...
    ...
    ...
    &issue; ...     => Vol 5 [libra]6

From which I conclude that it's probably best to use CDATA for entities
which contain short reference delimiters.  In fact, I think CDATA
should be used whenever there is no specific need to have the entity
parsed, but this may be overly restrictive.  Is this a feasible
compromise, Jonathan?

There is another thing with the USEMAP declaration which I discovered
while reading up on this.  One can say and , but once that is done,
there is no way to restore the map except by knowing which map was
specified.
The #RESTORE found with USELINK would have been handy.

--
Erik Naggum             Professional Programmer         +47-2-836-863
Naggum Software         Electronic Text
0118 OSLO, NORWAY       Computer Communications