Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!zaphod.mps.ohio-state.edu!mips!news.cs.indiana.edu!att!bellcore!salt!jxr
From: jxr@thumper.bellcore.com (Jonathan Rosenberg)
Newsgroups: comp.text.sgml
Subject: Re: Short Ref's Reparsed?
Message-ID: <489@salt.bellcore.com>
Date: 24 Jun 91 16:47:29 GMT
Sender: news@salt.bellcore.com
Lines: 179

> I have consulted The SGML Handbook, and received a note from Goldfarb
> himself, both of which clearly conclude that in your example case, you
> would achieve infinite recursion.  There is no specific provision in
> the standard to preclude recursion.

> If, however, you really wish the replacement text to contain the short
> reference delimiter(s), you can specify the entity to be character
> data, as in
>
> Jonathan Rosenberg writes:
>|
>| Ok. but, what happens in the following case:
>|
>|
>|
>|
>|
>| and
>|    aaa
>| ???

> 1. "aaa" maps to "&aaa;"
> 2. "&aaa;" produces "aa"
> 3. "aa" maps to "&aa;"
> 3. "&aa;" produces "a"

[Oops, I guess that should be 4.]

> so this becomes "a", only.

That's what I was afraid of.

>| I think that I found a clause in the standard that outlaws
>| recursive applications of short reference maps in any case.
>| Section 9.4.6.1 (page 354 of the Handbook) says (in part):

>|    "A short reference can be removed from a document by replacing
>|    it with an equivalent reference string that contains a named
>|    entity reference.  The entity name must be that to which the
>|    short reference is mapped in the current map."

>| This says clearly to me that given the above
>|    aaa
>| is equivalent to
>|    &aaa;Ee
>          ^^-- not here
>| which will (eventually) become
>|    aa
>       ^-- but here
>| and not
>|    a

> I agree that this looks possible, but the entity referenced (aaa) also
> needs to map occurrences of short reference delimiters ("aa") to
> entity references (aa) inside it, since it would have been parsed like
> this had the short references been used.

Yeah, that makes sense (unfortunately).
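
The four quoted steps above ("aaa" maps to "&aaa;", which produces "aa", which maps to "&aa;", which produces "a") can be traced with a small sketch. This is only a toy model in Python, not an SGML parser: the function name, the dictionaries, and the depth limit are all hypothetical, and it assumes that the longest matching delimiter is recognized first and that replacement text is rescanned under the same map.

```python
# Hypothetical model only: the names and data structures below are
# illustrative assumptions, not part of SGML or of any real parser.

def expand_shortrefs(text, shortref_map, entities, depth=0, max_depth=16):
    """Replace short reference delimiters in text, rescanning replacements."""
    if depth > max_depth:
        # Stand-in for the "infinite recursion" case discussed in the thread.
        raise RecursionError("short reference expansion does not terminate")
    # Assumption: longer delimiters are tried first, so "aaa" wins over "aa".
    delimiters = sorted(shortref_map, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for delim in delimiters:
            if text.startswith(delim, i):
                replacement = entities[shortref_map[delim]]
                # The replacement text is itself scanned for delimiters.
                out.append(expand_shortrefs(replacement, shortref_map,
                                            entities, depth + 1, max_depth))
                i += len(delim)
                break
        else:
            # No delimiter starts here: copy one character through as data.
            out.append(text[i])
            i += 1
    return "".join(out)


# The case from the quoted steps.
SHORTREF_MAP = {"aaa": "aaa", "aa": "aa"}   # delimiter -> entity name
ENTITIES = {"aaa": "aa", "aa": "a"}         # entity name -> replacement text

print(expand_shortrefs("aaa", SHORTREF_MAP, ENTITIES))  # prints: a
```

In the same model, a map whose replacement text contains its own delimiter (for example delimiter "x" mapped to an entity whose text is "x") never settles and hits the depth limit, which is the recursion the first quoted paragraph warns about.
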
> I haven't found any way to inhibit short reference delimiter recognition inside an
> entity which was referenced by a short reference instead of a general reference;
> but it seems to be quite necessary:

It certainly does to me, too.

> a corollary is that translation from short references to entity
> references may modify the contents of entities referred to in the
> replacement text of the entity referenced by the short reference,
> according as the replacement text contains short reference
> delimiters, demanding multiple versions of entities according to
> context.

> This seems to defeat the purpose of the simple translation, and I can
> readily foresee problems in this regard.

Ugh.  This appears to be impossibly complicated.  Can this really happen?

> I think benefits may be reaped from conservative use of short reference delimiters in
> entities thus referenced, or making them data entities with CDATA or SDATA.
> This is not only true for your particular questions, but in general,
> since it may not be intuitive when declaring an entity to what short
> reference delimiters may map where the entity is referenced.

> . . .

> To which I conclude that it's probably best with CDATA for entities
> which contain short reference delimiters.  In fact, I think CDATA
> should be used whenever there is no specific need to have the entity
> parsed, but this may be overly restrictive.

> Is this a feasible compromise, Jonathan?

It is if I understand it correctly.  Are you saying that the following
"works correctly":

"Correctly" in the sense that the string
   aaa
would become simply
   aa
??  And, that the reason for this is that CDATA indicates that the
replacement string is unparseable character data?

> There is another thing with the USEMAP declaration which I discovered
> while reading up on this.  One can say and <!USEMAP
> #EMPTY>, but once that is done, there is no way to restore the map
> except by knowing which map was specified.  The #RESTORE found with
> USELINK would have been handy.

I remember reading about this in the handbook.

Thanks for the help.

JR

P.S.  Now I see why you use "|" instead of ">" in replying.

P.P.S.  The line eater wants these lines.

Here they are ...

I hate this software.
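
The CDATA question raised in the post (would "aaa" become simply "aa"?) can be illustrated with a toy model. The Python below is a hedged sketch, not an SGML parser: `Entity`, `expand_with_cdata`, and the two entity tables are hypothetical names, and the model simply assumes that a CDATA entity's replacement text is emitted as data without being rescanned for short reference delimiters, which is the reading the post asks about.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    """Hypothetical stand-in for an entity declaration."""
    text: str
    cdata: bool = False   # True models an entity declared as CDATA


def expand_with_cdata(text, shortref_map, entities, depth=0, max_depth=16):
    """Expand short references; CDATA replacement text is not rescanned."""
    if depth > max_depth:
        raise RecursionError("short reference expansion does not terminate")
    # Assumption: the longest matching delimiter is recognized first.
    delimiters = sorted(shortref_map, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for delim in delimiters:
            if text.startswith(delim, i):
                entity = entities[shortref_map[delim]]
                if entity.cdata:
                    # Character data: copied through, delimiters ignored.
                    out.append(entity.text)
                else:
                    # Parsed entity: its text is scanned again.
                    out.append(expand_with_cdata(entity.text, shortref_map,
                                                 entities, depth + 1,
                                                 max_depth))
                i += len(delim)
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


SHORTREF_MAP = {"aaa": "aaa", "aa": "aa"}   # delimiter -> entity name
PARSED = {"aaa": Entity("aa"), "aa": Entity("a")}
WITH_CDATA = {"aaa": Entity("aa", cdata=True), "aa": Entity("a")}

print(expand_with_cdata("aaa", SHORTREF_MAP, PARSED))      # prints: a
print(expand_with_cdata("aaa", SHORTREF_MAP, WITH_CDATA))  # prints: aa
```

Under these assumptions the only difference between the two runs is the `cdata` flag on the entity reached from the "aaa" delimiter, which matches the suggestion in the quoted text to use CDATA for entities that contain short reference delimiters.
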