Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!wuarchive!mit-eddie!bloom-beacon!eru!hagbard!sunic!liuida!prosys!ath
From: ath@prosys.se (Anders Thulin)
Newsgroups: comp.text.sgml
Subject: Re: Is there a DTD standard?
Message-ID: <592@helios.prosys.se>
Date: 13 Sep 90 06:49:54 GMT
References: <141829@sun.Eng.Sun.COM> > <EMV.90Sep6191207@urania.math.lsa.umich.edu> <8027@mcshh.hanse.de> <1990Sep11.163347.7593@zorch.SF-Bay.ORG> <1990Sep11.193327.19935@terminator.cc.umich.edu> <1990Sep12.020242.2916@cs.rochester.edu> <BZS.90Sep1122501
Organization: Telesoft AB, Teknikringen 2A, S-583 30 Linkoping, Sweden
Lines: 44

In article <BZS.90Sep11225016@world.std.com> bzs@world.std.com (Barry Shein) writes:

>If no one can come forward with an authoritative DTD why don't those
>of us who understand a little about this stuff come up with a DTD
>right here? If nothing else the discussions surrounding that should be
>informative to those trying to learn more. We can call it "The USENET
>SGML DTD" and put it into the public domain. If it seems reasonable
>I'll use it for "The Online Book Initiative", we've been using some of
>our own conventions but it wouldn't be hard to conform.

The 'USENET SGML DTD' is a rather vague description: what types of
texts should it be used for? RFCs, email, digests, ...? Or more
traditional types like novels, collections of short stories, dramas,
etc.?

My own suggestion would be for something like novels. Most people have
read at least one, so they wouldn't be entirely unfamiliar :-) And it
would also fit rather nicely with OBI ...

Choosing a rather restrictive text type could also simplify some of
the keyboard conventions: paragraph breaks could probably be indicated
by empty lines, dashes could be '---', quotes could use `` and ''. Of
course, this assumes that the system used for parsing handles short
references (SHORTREF) and whatever else is required.
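
To make the conventions concrete, here is a rough sketch in Python of
what they amount to if done as a preprocessing step rather than with
short references in the DTD itself. The element name `p`, the function
name `mark_up`, and the choice of the entity names `mdash`, `ldquo`
and `rdquo` are assumptions made for illustration only; nothing in
this thread has settled on them.

```python
import re

# Hypothetical mapping from keyboard conventions to entity references.
# The entity names are assumed; a real DTD would have to declare them.
CONVENTIONS = [
    ("---", "&mdash;"),
    ("``", "&ldquo;"),
    ("''", "&rdquo;"),
]

def mark_up(text):
    """Turn plain text using the keyboard conventions into tagged paragraphs."""
    # One or more empty (or whitespace-only) lines separate paragraphs.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    out = []
    for para in paragraphs:
        # Join the lines of one paragraph into a single line.
        body = " ".join(line.strip() for line in para.splitlines())
        if not body:
            continue  # skip empty input
        for typed, entity in CONVENTIONS:
            body = body.replace(typed, entity)
        out.append("<p>" + body + "</p>")
    return "\n".join(out)

if __name__ == "__main__":
    sample = "``Hello,'' she said---quietly.\n\nNext paragraph\ncontinues."
    print(mark_up(sample))
```

An SGML parser with short reference support could recognize the same
strings directly, so a script like this is only a stand-in for systems
that lack that feature.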

>And my first question:
>
>	Are we happy with the convention &char-name to encode non-ascii
>	characters (e.g. &oumlaut), how far along is the Text Encoding
>	Initiative with this? Can we use their conventions yet?

I am happy with it. It seems to be largely based on the entity sets
(for Latin-1 and Latin-2) published in one of the appendices of the
ISO SGML document - which probably means they would be available for
most SGML implementations.  Or is there any reason to avoid them?
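
As an illustration of the convention, the following Python sketch
replaces non-ASCII characters with entity references. The names in the
table (ouml, auml, aring, and so on) are taken from the ISO Latin 1
entity set; the table is deliberately incomplete, and the function
name `encode_entities` is just a hypothetical helper, not part of any
SGML tool.

```python
# A few character-to-name pairs in the style of the ISO Latin 1 entity
# set. Deliberately incomplete; extend as needed.
ENTITY_NAMES = {
    "ö": "ouml",
    "Ö": "Ouml",
    "ä": "auml",
    "Ä": "Auml",
    "å": "aring",
    "Å": "Aring",
    "ü": "uuml",
    "é": "eacute",
}

def encode_entities(text):
    """Replace known non-ASCII characters with &name; references."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            # ASCII passes through unchanged. Note that a literal '&'
            # is NOT escaped here; a real tool would have to handle it.
            out.append(ch)
        elif ch in ENTITY_NAMES:
            out.append("&" + ENTITY_NAMES[ch] + ";")
        else:
            # Refuse to guess a name for characters not in the table.
            raise ValueError("no entity name known for %r" % ch)
    return "".join(out)

if __name__ == "__main__":
    print(encode_entities("Linköping"))
```

Raising an error for unknown characters, rather than passing them
through, keeps the output within ASCII, which is the point of the
convention.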

I imagine that an SGML translator would be capable of converting a
document using a local concrete syntax to either of the reference
syntaxes defined by SGML. So choosing other conventions shouldn't be
much of a problem. Or am I mistaken?

-- 
Anders Thulin       ath@prosys.se   {uunet,mcsun}!sunic!prosys!ath
Telesoft Europe AB, Teknikringen 2B, S-583 30 Linkoping, Sweden