Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!mcsun!news.funet.fi!fuug!sics.se!ifi.uio.no!enag
From: erik@naggum.no (Erik Naggum)
Newsgroups: comp.text.sgml
Subject: Re: Short Ref's Reparsed?
Message-ID: <82-7640-006-X/0008@naggum.no>
Date: 22 Jun 91 13:40:50 GMT
References: <464@salt.bellcore.com>
Sender: enag@ifi.uio.no (Erik Naggum)
Reply-To: Erik Naggum
Organization: Naggum Software, Oslo, Norway
Lines: 102
Nntp-Posting-Host: gyda.ifi.uio.no
In-Reply-To: jxr@thumper.bellcore.com's message of 19 Jun 91 20:01:30 GMT
Originator: enag@gyda.ifi.uio.no
I have consulted The SGML Handbook, and received a note from Goldfarb
himself, both of which clearly conclude that in your example case, you
would achieve infinite recursion. There is no specific provision in
the standard to preclude recursion.
If, however, you really wish the replacement text to contain the short
reference delimiter(s), you can specify the entity to be character
data, as in
Jonathan Rosenberg writes:
|
| Ok. but, what happens in the following case:
|
|
|
|
|
| and
| aaa
| ???
1. "aaa" maps to "&aaa;"
2. "&aaa;" produces "aa"
3. "aa" maps to "&aa;"
4. "&aa;" produces "a"
so this becomes "a", only.
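The four steps can be traced with a toy model. This is only a sketch,
not an SGML parser: the function name expand, the two dictionaries and
the depth limit are hypothetical, and the map and entity contents are
assumed from the steps listed above. It recognizes the longest
delimiter first and parses each replacement text again under the same
map.

```python
def expand(text, shortref_map, entities, depth=0, max_depth=20):
    """Toy model of short reference recognition with reparsing.

    shortref_map: delimiter string -> entity name (assumed map)
    entities:     entity name -> replacement text (assumed declarations)
    """
    if depth > max_depth:
        # The standard has no such limit; this guard only stops the demo.
        raise RecursionError("expansion does not terminate")
    # Try longer delimiters first so "aaa" wins over "aa".
    delimiters = sorted(shortref_map, key=len, reverse=True)
    out, i = [], 0
    while i < len(text):
        for delim in delimiters:
            if text.startswith(delim, i):
                replacement = entities[shortref_map[delim]]
                # The replacement text is parsed again under the same map.
                out.append(expand(replacement, shortref_map, entities,
                                  depth + 1, max_depth))
                i += len(delim)
                break
        else:
            # No delimiter starts here: copy the character as data.
            out.append(text[i])
            i += 1
    return "".join(out)


SHORTREF_MAP = {"aaa": "aaa", "aa": "aa"}   # delimiter -> entity name
ENTITIES = {"aaa": "aa", "aa": "a"}         # entity name -> replacement text

print(expand("aaa", SHORTREF_MAP, ENTITIES))  # prints "a"
```

In the same model, a map whose entity reproduces its own delimiter,
such as expand("aaa", {"aaa": "loop"}, {"loop": "aaa"}), never settles
and ends in the RecursionError guard.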
| I think that I found a clause in the standard that outlaws
| recursive applications of short reference maps in any case.
| Section 9.4.6.1 (page 354 of the Handbook) says (in part):
|
| "A short reference can be removed from a document by replacing
| it with an equivalent reference string that contains a named
| entity reference. The entity name must be that to which the
| short reference is mapped in the current map."
|
| This says clearly to me that given the above
| aaa
| is equivalent to
| &aaa;Ee
^^-- not here
| which will (eventually) become
| aa
^-- but here
| and not
| a
I agree that this looks possible, but the entity referenced (aaa) also
needs to map occurrences of short reference delimiters ("aa") to
entity references (aa) inside it, since it would have been parsed like
this had the short references been used. I haven't found any way to
inhibit short reference delimiter recognition inside an entity which
was referenced by a short reference instead of a general reference;
but it seems to be quite necessary:
a corollary is that translation from short references to entity
references may modify the contents of entities referred to in the
replacement text of the entity referenced by the short reference,
according as the replacement text contains short reference
delimiters, demanding multiple versions of entities according to
context.
This seems to defeat the purpose of the simple translation, and I can
readily foresee problems in this regard. I think benefits may be
reaped from conservative use of short reference delimiters in entities
thus referenced, or making them data entities with CDATA or SDATA.
This is true not only for your particular question, but in general,
since it may not be obvious when declaring an entity which short
reference delimiters will be mapped where the entity is referenced.
E.g.,
...
...
... &issue; ... => Vol 5 [libra]6
From which I conclude that it's probably best with CDATA for entities
which contain short reference delimiters. In fact, I think CDATA
should be used whenever there is no specific need to have the entity
parsed, but this may be overly restrictive.
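The difference CDATA makes can be shown with a small sketch. It is a
toy model under stated assumptions, not a parser: the function expand,
the (text, is_cdata) tuples and the sample map are hypothetical, with
the "aaa"/"aa" entities borrowed from the example discussed earlier.
An entity flagged as CDATA is inserted as character data and is not
scanned for short reference delimiters; a parsed entity is.

```python
def expand(text, shortref_map, entities):
    """Toy model: CDATA entities are inserted without reparsing."""
    # Try longer delimiters first so "aaa" wins over "aa".
    delimiters = sorted(shortref_map, key=len, reverse=True)
    out, i = [], 0
    while i < len(text):
        for delim in delimiters:
            if text.startswith(delim, i):
                replacement, is_cdata = entities[shortref_map[delim]]
                if is_cdata:
                    # Character data: no delimiter recognition inside.
                    out.append(replacement)
                else:
                    # Parsed entity: short references apply again.
                    out.append(expand(replacement, shortref_map, entities))
                i += len(delim)
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


SHORTREF_MAP = {"aaa": "aaa", "aa": "aa"}            # delimiter -> entity
PARSED = {"aaa": ("aa", False), "aa": ("a", False)}
WITH_CDATA = {"aaa": ("aa", True), "aa": ("a", False)}

print(expand("aaa", SHORTREF_MAP, PARSED))      # prints "a"
print(expand("aaa", SHORTREF_MAP, WITH_CDATA))  # prints "aa"
```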
Is this a feasible compromise, Jonathan?
There is another thing with the USEMAP declaration which I discovered
while reading up on this. One can say and , but once that is done, there is no way to restore the map
except by knowing which map was specified. The #RESTORE found with
USELINK would have been handy.
--
Erik Naggum Professional Programmer +47-2-836-863
Naggum Software Electronic Text
0118 OSLO, NORWAY Computer Communications