Path: utzoo!utgpu!water!watmath!watdragon!watsol!tbray
From: tbray@watsol.waterloo.edu (Tim Bray)
Newsgroups: comp.text
Subject: SGML defended (Long)
Message-ID: <7986@watdragon.waterloo.edu>
Date: 25 Jul 88 17:50:53 GMT
References: <61024@sun.uucp>
Sender: daemon@watdragon.waterloo.edu
Reply-To: tbray@watsol.waterloo.edu (Tim Bray)
Organization: New Oxford English Dictionary Project, U. of Waterloo, Ontario
Lines: 111
In article <61024@sun.uucp> tut%cairo@Sun.COM (Bill "Bill" Tuthill) writes:
> I'm moving a discussion of SGML started in comp.text.desktop into this
> newsgroup, because I think the issues are larger than a desktop.
and so on.
I have to disagree with nearly every line of Bill Tuthill's contribution.
There are real problems with SGML, but they are not the ones he
identifies. I think the problem is that he considers SGML strictly as a
typesetting system, which is really beside the point. Detailed
discussion follows, but the important points are:
1. If any on-line use for a document other than printing it out
(Hypertext, information retrieval, on-line documentation) is
contemplated, structural rather than typographical markup is a
necessity. The arguments for this are many and are overpowering in
their force. Rather than run through them, I refer everyone to the
excellent article `Markup Systems and the Future of Scholarly Text
Processing', by Coombs, Renear, and DeRose in the Nov. '87 CACM.
2. The SGML standard is a crock. I have not read it, but this is the
unanimous consensus of everyone I know who has tried to work with it.
The basic SGML syntax and concepts, however, are sound. I think the
logical conclusion should be: let's not let the failure of the standards
drafters deter us from using this basically good idea.
Now, to address Mr. Tuthill's points:
>Instead, SGML should be compared to decent
>procedural languages such as troff and TeX. There are good reasons why
>troff and TeX macro packages were invented: well-designed macros provide
>writers with a descriptive layer ...
No, SGML shouldn't be compared to these things. SGML and the
typesetting packages exist to solve different problems. When you want
to typeset your SGML document, you should translate it into troff or TeX
or PostScript or something that's good at that job. SGML exists to
prevent typographical nits from getting in the way of structural
document design decisions. See the CACM article.
>SGML is no panacea for portability. Being a metalanguage, SGML does not
>provide one syntax, but only a method for describing different syntaxes.
>On p. 68 Goldfarb states, "SGML allows variant concrete syntaxes." This
>is tantamount to saying it isn't really standard. It would probably be
>as difficult to translate between variant syntaxes as to translate between
>troff and Interleaf or Frame.
The great virtue of SGML is that it is very easy for computers to parse
and is probably the most flexible form in which it is possible to store
text. Our practical experience on the New OED project is that the first
thing to do with input text is to do away with all the typesetting
gibberish and get some approximation of SGML tags in there. You don't
have to worry too much about getting them right at first; with the basic
structure in place, it's remarkably easy to transform the text into the
right setup once you figure out what that should be.
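To give a feel for how cheap that later transformation can be, here is a
minimal sketch in Python. It assumes the first pass left simple paired
tags with no attributes; the function name retag and the tag names in the
usage note (hw, x, headword, pos) are made up for illustration and are
not the actual New OED tag set.

```python
import re

def retag(text, mapping):
    """Rename provisional tags to their final names.

    Assumes attribute-free tags such as <hw> and </hw>. Tags that do
    not appear in `mapping` are left unchanged.
    """
    def swap(match):
        slash, name = match.group(1), match.group(2)
        return "<" + slash + mapping.get(name, name) + ">"

    # Matches both start tags (<name>) and end tags (</name>).
    return re.sub(r"<(/?)([A-Za-z][A-Za-z0-9]*)>", swap, text)
```

For example, retag("<hw>set</hw> <x>v.</x>", {"hw": "headword", "x": "pos"})
renames both provisional tags in one pass and leaves the text between them
alone.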
>SGML was born obsolete. Graphics are missing from the specification, as
>are provisions for tables and equations.
It is certainly possible in SGML to make a reference to an
externally-stored graphic. Then at typesetting time, you copy in the
appropriate PostScript/pic/rasterfile or whatever. SGML does indeed
allow the specification of tables and equations, in a
typography-independent way that lends itself to a variety of
information-retrieval applications. Try to make automatic sense out of
tbl or eqn source! On the other hand, it's easy to translate SGML
structures *into* tbl or eqn or whatever.
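As a rough illustration of that last point, the following Python sketch
turns a structurally tagged table into tbl source. The tag names (row,
cell) and the function name are hypothetical, chosen only for this
example; a real document type would define its own, and a real converter
would use a proper SGML parser rather than regular expressions.

```python
import re

def sgml_table_to_tbl(sgml):
    """Convert <row>/<cell> markup into tbl input.

    Assumes attribute-free, non-nested tags, and that no cell text
    contains the '|' character used here as the tbl column separator.
    """
    rows = re.findall(r"<row>(.*?)</row>", sgml, flags=re.S)
    cells = [
        [c.strip() for c in re.findall(r"<cell>(.*?)</cell>", row, flags=re.S)]
        for row in rows
    ]
    width = max((len(r) for r in cells), default=0)

    lines = [".TS", "tab(|);"]
    # One left-aligned column per cell; tbl ends the format section with '.'
    lines.append(" ".join(["l"] * width) + ".")
    for r in cells:
        # Pad short rows so every data line has the same number of columns.
        lines.append("|".join(r + [""] * (width - len(r))))
    lines.append(".TE")
    return "\n".join(lines)
```

The typographic choices (left alignment, the separator character) are made
entirely inside the converter, so the tagged source carries none of them.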
>SGML:
> This added information, called markup, serves two purposes:
>