Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!asuvax!noao!rstevens
From: rstevens@noao.edu (Rich Stevens)
Newsgroups: comp.text
Subject: Re: Quality of computer typesetting?
Summary: my experiences (long)
Message-ID: <1991Mar15.134229.23183@noao.edu>
Date: 15 Mar 91 13:42:29 GMT
References: <1991Mar14.215651.13961@dartvax.dartmouth.edu>
Organization: National Optical Astronomy Observatories, Tucson AZ
Lines: 107

Well, I'm not a typesetter or a book designer, just a picky author.
You might want to take a look at my recent book, "UNIX Network
Programming" (Prentice Hall, 1990).  (It's been out for 14 months
now, so most technical libraries should have it.)  I used troff,
DWB 2.0 to be exact, and had a lot of problems that I had to work
around.  Here's the saga.

First, troff's hyphenation algorithm isn't perfect.  I ended up
having to make a hyphenation exception list with about 150 words
in it.  But, vanilla troff only allows an exception list of
128 bytes, so I had to take the sources and increase the list
size.  One advantage of having the sources.  Finding all the
hyphenation errors was a pain.  I had to write some
shell scripts that ran troff with the -a flag, found all the
hyphenated words, and looked them up in my own list to see if I'd
ever encountered that word hyphenated before.  If not, I printed
the word, then looked it up by hand in the dictionary and entered
the word into my list with its hyphenation points.  I looked around
for machine-readable dictionaries that included hyphenation points
but couldn't find any.  (I'm still not aware of any that could help
me with this specific problem.)
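
A minimal sketch of that check, in Python rather than the original
shell scripts (which aren't shown here).  It assumes the formatter's
ASCII output has already been read into a list of lines and that a
line-end hyphenation shows up as a trailing "-".  The function names
and the "hyphen-ation" storage format are made up for illustration.

```python
import re

TAIL = re.compile(r"([A-Za-z]+)-$")     # word fragment before a line-end hyphen
HEAD = re.compile(r"^\s*([A-Za-z]+)")   # fragment that continues on the next line

def hyphenated_words(lines):
    """Return each line-end break as 'front-back', e.g. 'hyphen-ation'."""
    found = []
    for cur, nxt in zip(lines, lines[1:]):
        tail = TAIL.search(cur.rstrip())
        head = HEAD.match(nxt)
        if tail and head:
            found.append(tail.group(1).lower() + "-" + head.group(1).lower())
    return found

def unseen(found, known):
    """Breaks that are not yet in the hand-checked list."""
    return sorted(set(found) - set(known))
```

Anything returned by unseen() still has to be checked against a
dictionary by hand.  Note that this sketch can't tell an inserted
hyphen from a real one in a compound word that happens to break at
the end of a line.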

Also, troff doesn't handle words in the exception list correctly
if the word contains a ligature.  So words like "specification"
that it doesn't hyphenate correctly can't be put in the exception
list.  You have to find them and insert the hyphenation character
(\%) at all the right points.  Another script.
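
Something along these lines would do the insertion, again as a
Python sketch rather than the script actually used.  The table
format (word mapped to a pattern such as "spec-i-fi-ca-tion") and
the function name are assumptions for illustration; only the \%
character comes from troff itself.

```python
import re

def mark_hyphenation(text, exceptions):
    """Insert \\% at the break points listed in `exceptions`.

    `exceptions` maps a lower-case word to its hyphenated form,
    e.g. {"specification": "spec-i-fi-ca-tion"}.
    """
    def repl(match):
        word = match.group(0)
        pattern = exceptions.get(word.lower())
        if pattern is None:
            return word
        out = []
        i = 0
        for ch in pattern:
            if ch == "-":
                out.append("\\%")       # troff hyphenation character
            else:
                out.append(word[i])     # keep the original capitalization
                i += 1
        return "".join(out)

    return re.sub(r"[A-Za-z]+", repl, text)
```

The sketch treats every run of letters as a word, so a word glued
to a font escape such as \fI is simply left alone; a real script
would need to know more about troff input.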

Next problem is that troff hyphenates too many lines in a row, so I
had to write another script that looked for more than 2 lines in a row
that were hyphenated.  I often went in by hand and rewrote a few words
to get around these.
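
The run check is simple enough to sketch in a few lines of Python.
It assumes, as before, that the formatted output is available as a
list of text lines and that a hyphenated line ends in "-".  The
function name and the default limit of 2 are illustrative.

```python
def hyphen_runs(lines, limit=2):
    """Return (first_line_number, run_length) for every run of more
    than `limit` consecutive lines that end in a hyphen."""
    runs = []
    run_start = 0
    run_len = 0
    for n, line in enumerate(lines, start=1):
        if line.rstrip().endswith("-"):
            if run_len == 0:
                run_start = n
            run_len += 1
        else:
            if run_len > limit:
                runs.append((run_start, run_len))
            run_len = 0
    if run_len > limit:                 # run that reaches the last line
        runs.append((run_start, run_len))
    return runs
```

Each reported run is a candidate for rewording by hand.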

Next problem was actual page layout.  My book had *lots* of figures
and listings, some of which had to stay entirely on one page.  Just
as I was finishing the book the article by Kernighan and Van Wyk
appeared on their troff post-processor that did this.  Unfortunately
the software wasn't generally available.  (It's still not really
available, in my mind, since it takes $20,000 to get the source
and no one is selling binaries.  Also, I'm not sure whether the
page maker software is in DWB 3.1.)  What I ended up doing was
going through the entire text (700+ pages) and forcing page breaks
where I wanted them.  You do page 1, get it right, then page 2,
then page 3, ...  Fortunately I had a nice terminal to do this on
(an AT&T 630) but it was a long process.  I swore I wouldn't do that
again.  A few times I ended up rewording something just to get the
page break where I wanted it.

The next problem is the placement of dashes with current fonts.
I don't like the placement of hyphens, en-dashes, or em-dashes
with most fonts.  Hyphens are set at the x-height for lower case
characters, which is fine for hyphenation at the end of the line,
but looks awful (IMHO) in words such as MS-DOS.  (Fortunately only
a few MS-DOSs in a Unix book!)  Also, with the font I used (PostScript
Times Roman) the en-dash and em-dash touched most letters on either
side of the dash.  I didn't like that, or the fact that an en-dash
between two digits is really an en-dash between two upper case
characters (so it should be a little higher).  I ended up writing
scripts that "fixed" all this by moving certain characters around
before troff got a hold of them.  Parentheses are another similar
problem, as most fonts have parens set for lower case characters.

My final gripe about troff based systems is the lack of any
"grammar software".  I have diction (the old, old, BSD-distributed
version) but have to wade through so many false hits with it
that I don't use it much.  I have some bad writing habits that
I want to find automatically.  My solution to this is yet-another
script that looks for certain phrases that I've accumulated over
time, and prints the offending line.  Things such as starting a
sentence with "However", using words like "essentially", etc.
A good grammar-checking package for Unix would be nice to have.
(My gut feeling is that such packages might give me more false hits
than diction, but it would be nice to at least have the option.)
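
A phrase checker of this sort might look like the Python sketch
below.  The two patterns are just the examples mentioned above; the
labels, the table name HABITS, and the function name are
illustrative, and the real list would be much longer.

```python
import re

# (label, pattern) pairs for habits to flag
HABITS = [
    ("sentence starts with However",
     re.compile(r"(^|[.!?]\s+)However\b")),
    ("essentially",
     re.compile(r"\bessentially\b", re.IGNORECASE)),
]

def find_habits(lines, habits=HABITS):
    """Return (line_number, label, line) for every offending line."""
    hits = []
    for n, line in enumerate(lines, start=1):
        for label, rx in habits:
            if rx.search(line):
                hits.append((n, label, line.rstrip("\n")))
    return hits
```

Because it works a line at a time, a capitalized "However" that
merely starts an input line is flagged too, so some false hits are
still to be expected.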

Would I use troff again?  Yes, but I'm now using groff.  It has
TeX's hyphenation algorithm which saves me a lot of problems, and
has an option to limit the number of lines in a row that it
hyphenates.  The font problems I still have to deal with, but
I've done it before.  Page layout is still a problem.  I may end
up writing a simple version of the Kernighan & Van Wyk package
(I don't need multi-column formatting, for example) if I have the
time before the next book is finished.

I'm not a TeX user mainly because I've been using troff for almost
14 years and am familiar with it.  Also, I don't think TeX by
itself generates good books.  Knuth's books look great, but I've
seen two books by another author that were made using TeX that
look pretty bad.  I think 90% of the final appearance depends on
the author, not the system that was used.  I end up doing so much
pre-processing and post-processing of troff's input and output,
that I don't think using a package like Frame is my answer either.
(It's funny, when I picked up the first Unix book I'd seen that
specifically said it was made with Frame and not troff, the second
page I looked at had a blatant hyphenation error.  Made my day.)

The best looking troff-generated books that I've seen are Van Wyk's
"Data Structures and C Programs" and another one by Ravi Sethi
(that I can't remember the name of).  I think both were made using
the K & VW page maker software.  The 10th Edition Unix manuals from
Murray Hill also look very nice.  Some of the worst troff books I've
seen have actually been printed by a laser printer at 300 dpi.
Yuck.  It only costs about $5/page to typeset a book that's already
in PostScript, so there's really no excuse for laser printed books
today.

	Rich Stevens (rstevens@noao.edu)