Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!iuvax!ndcheg!uceng!dmocsny
From: dmocsny@uceng.UC.EDU (daniel mocsny)
Newsgroups: comp.ai
Subject: Writing style analyzers (Was: Re: Elementary AI Philosophy)
Summary: Readability.
Keywords: Understanding and Comprehension, Reality and Modeling, Sentience
Message-ID: <667@uceng.UC.EDU>
Date: 9 Feb 89 15:58:56 GMT
References: <18464@santra.UUCP> <1241@arctic.nprdc.arpa> <904@ubu.warwick.UUCP> <44587@linus.UUCP>
Organization: Univ. of Cincinnati, College of Engg.
Lines: 61

In article <44587@linus.UUCP>, bwk@mbunix.mitre.org (Barry W. Kort) writes:
> Well, Dan, I've been using WWB (Writer's WorkBench) ever since the first
> version came out of Murray Hill, and at least some of your vision is
> already a reality.

I have used Rightwriter quite a bit in the PC environment. I don't know
how it compares to WWB (not available on any UN*X boxes that I can access
here), but it is useful. (It does tend to gag on LaTeX markup commands,
though, throwing off the readability index it computes at the end...)
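
One workaround for the markup problem is to strip the LaTeX commands before scoring. The sketch below is only an illustration: it uses the Flesch Reading Ease formula as a stand-in (the index Rightwriter actually computes is not stated here), a crude vowel-group syllable counter, and hypothetical helper names `strip_latex` and `flesch_reading_ease`.

```python
import re

def strip_latex(text):
    """Remove common LaTeX markup, keeping the words inside braces."""
    text = re.sub(r"(?<!\\)%.*", "", text)                      # drop comments
    text = re.sub(r"\\(?:begin|end)\{[^}]*\}", " ", text)       # drop environment markers
    text = re.sub(r"\\[A-Za-z]+\*?(?:\[[^\]]*\])?", " ", text)  # drop command names
    text = re.sub(r"\\.", " ", text)                            # drop escaped symbols such as \%
    text = re.sub(r"[{}$]", "", text)                           # drop grouping and math delimiters
    text = text.replace("~", " ")                               # a tie is just a space
    return re.sub(r"\s+", " ", text).strip()

def count_syllables(word):
    """Rough estimate (assumption): one syllable per group of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text contains no words or sentences")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

raw = r"The \emph{thermometer} measures the \textbf{temperature}. % draft"
plain = strip_latex(raw)
print(plain)
print(flesch_reading_ease(raw), flesch_reading_ease(plain))
```

Scoring `raw` directly counts `emph`, `textbf`, and the comment as words, so the two printed scores differ; only the second reflects the prose itself.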

However, Rightwriter doesn't go nearly so far as I would like. It does
spot a few things well, such as passive voice and possibly useless
phrases ("in order to," instead of "to"). But it misses several
of the leading causes of needless sentence complexity. For example,
Rightwriter will accept without complaint either of the two equivalent
sentences:

1. The thermometer measures the temperature.

2. It is the thermometer which is that which serves to accomplish the
measurement of the temperature.

Both humans and computers take longer to "understand" the second
sentence.  Unfortunately, sentences like that are the rule rather
than the exception in the technical literature. Rightwriter does not
detect (1) pronouns that precede their referents, (2) noun phrases
that are equivalent to (simpler) action verbs, or (3) unnecessary
helping verbs. The second sentence displays all three, and yet escapes
with a clean bill of health. (For more examples and advice, see
John Brogan, "Clear Technical Writing," McGraw-Hill, 197(3?).)
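
As a rough illustration of how a checker might flag the three patterns above, here is a minimal sketch built on regular expressions. The patterns, labels, and the function name `flag_wordiness` are all hypothetical; they are not Rightwriter's rules or Brogan's, and surface patterns like these will miss many cases and raise false alarms on others.

```python
import re

# Hypothetical surface patterns, one per problem named in the paragraph above.
PATTERNS = {
    # "It is the thermometer which ...": the pronoun arrives before its referent.
    "pronoun before referent": re.compile(
        r"\bit\s+is\s+(?:(?:the|a|an)\s+)?\w+\s+(?:which|that|who)\b",
        re.IGNORECASE),
    # "accomplish the measurement of": weak verb plus a noun that hides an action verb.
    "noun phrase for action verb": re.compile(
        r"\b(?:accomplish|perform|conduct|effect|carry\s+out)\w*\s+"
        r"(?:the|a|an)\s+\w+(?:ment|tion|sion|ance|ence)\b",
        re.IGNORECASE),
    # "serves to accomplish": a helping verb that adds nothing.
    "unnecessary helping verb": re.compile(
        r"\bserv(?:e|es|ed|ing)\s+to\s+\w+",
        re.IGNORECASE),
}

def flag_wordiness(sentence):
    """Return the labels of every pattern found in the sentence."""
    # Collapse line breaks so a pattern can match across wrapped text.
    flat = " ".join(sentence.split())
    return [label for label, rx in PATTERNS.items() if rx.search(flat)]

simple = "The thermometer measures the temperature."
wordy = ("It is the thermometer which is that which serves to accomplish the\n"
         "measurement of the temperature.")

print(flag_wordiness(simple))
print(flag_wordiness(wordy))
```

The first sentence comes back with no flags and the second with all three. Detecting the patterns this way says nothing about how to rewrite the sentence, which is the harder problem.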

When Rightwriter does flag a "complex sentence," it does not attempt
to simplify it, or even give any hints. This is probably because it
does not do any semantic analysis. Modern grammar and style checkers
are useful tools (as useful as spelling checkers, I believe).
However, their full utility won't be evident until they (1) "know" more
about what makes writing unnecessarily complex, and (2) attempt to
"understand" the text they analyze. I suppose the second goal would
require "extracting" the underlying "facts" in a writing sample and
compiling them in some sort of a formal knowledge structure. This
might allow the program to render the stored facts into text with
simpler sentence structure. 

Since such a "logical" approach promises to be difficult, perhaps we
should explore alternatives. Do any subscribers to this newsgroup know
of any attempts to train a neural network to simplify sentences?  John
Brogan's book has many examples of sentence pairs similar to the one I
gave above. After I worked through all his examples, I became
uncannily aware of the sentence structures he advises against. Now my
colleagues give me papers to proofread, and I sail through them with a
red pen.

I have wondered whether a neural network could feasibly learn to
recognize and correct unnecessary sentence complexity after exposure
to such a training set. Over a restricted grammar, at least, training
such a network should not be too difficult.

Cheers,

Dan Mocsny
dmocsny@uceng.uc.edu