Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!snorkelwacker.mit.edu!bloom-beacon!eru!hagbard!sunic!sics.se!ifi.uio.no!enag
From: erik@naggum.no (Erik Naggum)
Newsgroups: comp.text.sgml
Subject: Re: Short Ref's Reparsed?
Message-ID: <82-7640-006-X/0003@naggum.no>
Date: 19 Jun 91 14:36:47 GMT
References: <448@salt.bellcore.com>
Sender: enag@ifi.uio.no (Erik Naggum)
Reply-To: Erik Naggum
Organization: Naggum Software, Oslo, Norway
Lines: 76
Nntp-Posting-Host: gyda.ifi.uio.no
In-Reply-To: jxr@thumper.bellcore.com's message of 17 Jun 91 02:14:40 GMT
Originator: enag@gyda.ifi.uio.no
Jonathan Rosenberg writes:
|
| If you specify a short reference map, e.g.,
|
|
|
|
| what happens to the modified text after the substitution is made?
Please note a conceptual difference between "substitution" and entity
references. When the parser sees an entity reference, it asks the
entity manager for an alternate input source. The entity manager then
delivers characters from this source until it ends, and sends an
Entity end (Ee) signal. After that, the entity manager continues to
deliver characters from the entity which contained the entity
reference.
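To make the model concrete, here is a minimal sketch in Python. It is
not taken from the standard or from any real parser: the names
EntityManager, open_entity, read and parse are hypothetical, the
"&name;" reference syntax is simplified, and the end of the document
entity is reported as None rather than as a final Ee.

```python
EE = object()  # sentinel standing in for the Entity end (Ee) signal


class EntityManager:
    """Delivers characters from a stack of open input sources."""

    def __init__(self, document, entities):
        self.entities = entities        # entity name -> replacement text
        self.stack = [iter(document)]   # innermost open entity is last

    def open_entity(self, name):
        # An entity reference is a request for an alternate input source.
        self.stack.append(iter(self.entities[name]))

    def read(self):
        """Next character, EE when a referenced entity ends, None at the end."""
        if not self.stack:
            return None
        ch = next(self.stack[-1], None)
        if ch is not None:
            return ch
        self.stack.pop()
        # Resume the containing entity, if there is one.
        return EE if self.stack else None


def parse(document, entities):
    """Collect what the parser sees; it only reads forward, never rescans."""
    mgr = EntityManager(document, entities)
    events = []
    while (ch := mgr.read()) is not None:
        if ch is EE:
            events.append("Ee")
        elif ch == "&":
            name = ""
            while (c := mgr.read()) not in (";", None, EE):
                name += c
            mgr.open_entity(name)
        else:
            events.append(ch)
    return events


print(parse("x&e;y", {"e": "ab"}))
```

The parser never sees a modified copy of the document. It sees "x",
then the characters of entity "e", then Ee, then "y" from the
containing entity.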
Short references are mapped to entity references, but not by way of
rescanning the input stream. If, however, a document with short
references is sent to a system which cannot handle short references,
each one is replaced by the corresponding general entity reference.
Your example does not say what you intended. The short reference
mapping declaration refers to an entity "a", but you have defined
entity "aa". If you meant
and
aaa
this would map the first, longest occurrence of a short reference
delimiter ("aa") to a general entity reference to entity "aa". The
letter "a" and then the Ee signal would be parsed, as read from the
entity manager. Parsing then sees the following "a", but the Ee
intervenes between the two, so no recursive mapping can take place.
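This reading can be traced with a small sketch in Python. The
declarations are assumed, not quoted: a short reference delimiter "aa"
mapped to an entity named "aa" whose text is "a". The function expand
and its limit parameter are hypothetical, and the sketch further
assumes that the map stays active inside the referenced entity, which
is an assumption of the sketch and not a statement about the standard.

```python
def expand(document, shortrefs, entities, limit=50):
    """Return the characters and Ee signals the parser would see.

    shortrefs: delimiter string -> entity name (hypothetical map)
    entities:  entity name -> replacement text
    """
    stack = [[document, 0]]  # [text, position] for each open entity
    events = []
    while stack:
        text, pos = stack[-1]
        if pos >= len(text):
            stack.pop()
            if stack:
                events.append("Ee")  # a referenced entity has ended
            continue
        # Longest delimiter first. Matching looks only at the text of
        # the current entity, so a delimiter never spans an Ee.
        for delim in sorted(shortrefs, key=len, reverse=True):
            if text.startswith(delim, pos):
                if len(stack) >= limit:
                    raise RecursionError("short reference map recurses")
                stack[-1][1] = pos + len(delim)
                stack.append([entities[shortrefs[delim]], 0])
                break
        else:
            events.append(text[pos])
            stack[-1][1] = pos + 1
    return events


print(expand("aaa", {"aa": "aa"}, {"aa": "a"}))
```

For the data "aaa" the result is "a", Ee, "a": the two letters are
never adjacent in one entity, so they are not recognized as "aa". The
limit guard is there because a map whose entity text itself contains
the delimiter would never terminate under this sketch's assumption.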
If you meant
and
aaa
the first "aa" would map to the entity "a", whose text "aa" would then
be parsed. At this point there could be (1) an infinite recursion, or
(2) the entity text could be read as plain "aa" by itself. In case
(2) there would then be an Entity end, which intervenes between the
second "a" of the entity and the third "a" in the data, so no mapping
would take place.
I don't know for certain whether case (1) or (2) is the right one. If
the standard provides that a short reference map applies only within
the entity that contains the element for which the map is declared to
be used, or in which it is invoked, this would indicate case (2). If
there is no such provision, case (1) applies, unless there are other
provisions which preclude recursive mappings. Once again, I will have
to spend a few hours with the Handbook to answer this question. I
would tend to think that defining such a potentially recursive mapping
is slightly silly, except for the sole purpose of showing that a
vendor's parser is broken. :-)
The difference between "substitution" and "entity reference" is very
important. I spent a lot of time fighting this one, until Charles
Goldfarb showed me the right direction. An entity reference is really
a request to the entity manager for input from an alternate source of
input (his terms, very accurate), not a substitution. The parser only
reads a stream of characters from the entity manager, and does not
rescan previously read input. The entity manager does not have such a
capability in the first place.
Hope this helps.
--
Erik Naggum Professional Programmer +47-2-836-863
Naggum Software Electronic Text
0118 OSLO, NORWAY Computer Communications