Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!usc!apple!snorkelwacker.mit.edu!bloom-beacon!eru!hagbard!sunic!nuug!ifi!enag
From: enag@ifi.uio.no (Erik Naggum)
Newsgroups: comp.text.sgml
Subject: Re: Record boundaries in SGML
Message-ID:
Date: 29 Nov 90 00:20:11 GMT
References: <1990Nov21.210152.2631@maytag.waterloo.edu>
Sender: enag@ifi.uio.no (Erik Naggum)
Organization: Naggum Software, Oslo, Norway
Lines: 72
Nntp-Posting-Host: hild.ifi.uio.no
In-Reply-To: garyp@csg.waterloo.edu's message of 21 Nov 90 21:01:52 GMT
Originator: enag@hild

In article <1990Nov21.210152.2631@maytag.waterloo.edu>, Gary Pianosi writes:

   I am able to import hand-edited files into Author/Editor without
   error, but when I try to validate the document, I get the error
   message: "Validation Error: Text not allowed here" wherever there
   is a new line between an end tag and a start tag.  (I can delete
   these 'characters' to get rid of the error message, but there are
   too many of them.)  Strangely, no error is reported for new lines
   between adjacent start tags or end tags, even though text is not
   valid between the tags.

   I should mention that the hand-edited files can be validated using
   MARK-IT and the canonical output can be read into Author/Editor
   without any problems.

I've replied to Gary directly (to , the !@#$%^&* mailer didn't
recognize itself as "csg.waterloo.edu"), quoting section 7.6.1 from
SGML (with the amendment applied), but won't post that for petty
copyright violation reasons.

The problem can be reduced, I think, to the treatment of Record End in
this contrived example:

	1 `'
	2 `'
	3 `caninus'
	4 `'
	5 `'
	6 `felinus'
	7 `'
	8 `'

(where ` signifies Record Start and ' Record End, for clarity; line
numbers are for reference only)

According to section 7.6.1, this will be interpreted at the outer
(foo) level as:

	1 `'
	2 `...'
	5 `...'
	8 `'

Now, the RE in line 1 is clearly the first RE in the content of foo,
and the RE in line 5 is clearly the last RE in the content of foo.
According to said section, these are to be ignored.

The problem is the RE in line 2, and the question boils down to this:
Is this RE recognized as /content/ or as /markup/?

I believe it is markup, and thus that it should be ignored.  It seems
that Gary's problems stem from some decision amounting to viewing it
as content.  In that case the RE would either imply the start of a bar
element, where a new bar element is illegal (see amended note to
section 11.2.4), or be data content where data content is not valid.

What am I missing here?  (I'm sure it's something.)  I've read the
spec several times, but won't claim that I understand and remember
every single thing, due to the high number of references and other
spaghetti-coding style writing.

--
[Erik Naggum]	Snail: Naggum Software / BOX 1570 VIKA / 0118 OSLO / NORWAY
		Mail: , My opinions.	Wail: +47-2-836-863
--
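
To make the question concrete, here is a minimal sketch in Python of
the reduction described above.  It is a simplification, not the full
rule of section 7.6.1: it only drops the first and the last RE in the
content of foo, as the post says they are to be ignored, and shows
that exactly one RE remains whose status (content or markup) is in
doubt.  The names RE, drop_first_and_last_re and "bar-element" are
hypothetical stand-ins, not part of any SGML tool.

```python
# Hypothetical stand-in for a Record End in an element's content.
RE = "RE"


def drop_first_and_last_re(content):
    """Drop a leading RE and a trailing RE from an element's content.

    Simplified reading of the rule discussed in the post: the first RE
    and the last RE in the content of an element are ignored.
    """
    items = list(content)
    if items and items[0] == RE:
        items = items[1:]
    if items and items[-1] == RE:
        items = items[:-1]
    return items


# Content of foo at the outer level: an RE, a bar element, an RE,
# a second bar element, and a final RE.
foo_content = [RE, "bar-element", RE, "bar-element", RE]

remaining = drop_first_and_last_re(foo_content)
print(remaining)
```

The one RE left between the two bar elements is the one the question
is about: a parser that treats it as markup ignores it, while one that
treats it as content reports text where none is allowed.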