Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!iuvax!ndcheg!uceng!dmocsny
From: dmocsny@uceng.UC.EDU (daniel mocsny)
Newsgroups: comp.ai
Subject: Writing style analyzers (Was: Re: Elementary AI Philosophy)
Summary: Readability.
Keywords: Understanding and Comprehension, Reality and Modeling, Sentience
Message-ID: <667@uceng.UC.EDU>
Date: 9 Feb 89 15:58:56 GMT
References: <18464@santra.UUCP> <1241@arctic.nprdc.arpa> <904@ubu.warwick.UUCP> <44587@linus.UUCP>
Organization: Univ. of Cincinnati, College of Engg.
Lines: 61

In article <44587@linus.UUCP>, bwk@mbunix.mitre.org (Barry W. Kort) writes:
> Well, Dan, I've been using WWB (Writer's WorkBench) ever since the first
> version came out of Murray Hill, and at least some of your vision is
> already a reality.

I have used Rightwriter quite a bit in the PC environment. I don't
know how it compares to WWB (not available on any UN*X boxes that I
can access here), but it is useful. (It does tend to gag on LaTeX
markup commands, though, throwing off the readability index it
computes at the end...)

However, Rightwriter doesn't go nearly so far as I would like. It does
spot a few things well, such as passive voice and possibly useless
phrases ("in order to," instead of "to"). It doesn't identify several
of the leading causes of useless sentence complexity. For example,
Rightwriter will accept without complaint either of the two equivalent
sentences:

1. The thermometer measures the temperature.

2. It is the thermometer which is that which serves to accomplish the
   measurement of the temperature.

Both humans and computers take longer to "understand" the second
sentence. Unfortunately, sentences like that are more like the rule
than the exception in the technical literature. Rightwriter does not
detect (1) pronouns that precede their referents, (2) noun phrases
that are equivalent to (simpler) action verbs, or (3) unnecessary
helping verbs. The second sentence displays all three, and yet escapes
with a clean bill of health.
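
To make the three missed patterns concrete, here is a rough sketch of
how a checker might flag them with surface pattern matching alone. It
is only an illustration under stated assumptions: the pattern table,
the labels, and the function name flag_sentence are all hypothetical,
and nothing here describes how Rightwriter or WWB actually work. The
regular expressions are crude and will produce false alarms (for
example, "section of" looks like a noun phrase standing in for a
verb).

```python
import re

# Hypothetical pattern table: (label, compiled regex).
# These are crude surface heuristics chosen for illustration; they are
# assumptions, not a description of any existing style checker.
PATTERNS = [
    # Possibly useless phrase: "in order to" where "to" would do.
    ("wordy phrase",
     re.compile(r"\bin order to\b", re.IGNORECASE)),
    # Sentence opens with "It is ... which/that": the pronoun comes
    # before the thing it refers to.
    ("pronoun before referent",
     re.compile(r"^\s*it is\b.*?\b(?:which|that)\b", re.IGNORECASE)),
    # Noun ending in -ment/-tion/-sion followed by "of", which often
    # hides a simpler action verb ("measurement of" vs. "measures").
    ("noun phrase for verb",
     re.compile(r"\b\w+(?:ment|tion|sion)\s+of\b", re.IGNORECASE)),
    # Helping-verb padding such as "serves to".
    ("padding verb",
     re.compile(r"\b(?:serves?|served) to\b", re.IGNORECASE)),
]


def flag_sentence(sentence):
    """Return the labels of every pattern that matches the sentence."""
    return [label for label, pattern in PATTERNS if pattern.search(sentence)]


if __name__ == "__main__":
    for text in (
        "The thermometer measures the temperature.",
        "It is the thermometer which is that which serves to accomplish "
        "the measurement of the temperature.",
    ):
        print(flag_sentence(text), "<-", text)
```

Run on the two sentences above, the first comes back with no flags and
the second trips the last three patterns. Note that this only spots
the trouble; it makes no attempt to rewrite the sentence.
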
(For more examples and advice, see John Brogan, "Clear Technical
Writing," McGraw-Hill, 197(3?).)

When Rightwriter does flag a "complex sentence," it does not attempt
to simplify it, or even give any hints. This is because it probably
does not do any semantic analysis.

Modern grammar and style checkers are useful tools (as useful as
spelling checkers, I believe). However, their full utility won't be
evident until they (1) "know" more about what makes writing
unnecessarily complex, and (2) attempt to "understand" the text they
analyze. I suppose the second goal would require "extracting" the
underlying "facts" in a writing sample and compiling them in some sort
of a formal knowledge structure. This might allow the program to
render the stored facts into text with simpler sentence structure.

Since such a "logical" approach promises to be difficult, perhaps we
should explore alternatives. Do any subscribers to this newsgroup know
of any attempts to train a neural network to simplify sentences? John
Brogan's book has many examples of sentence pairs similar to the one I
gave above. After I worked through all his examples, I became
uncannily aware of the sentence structures he advises against. Now my
colleagues give me papers to proofread, and I sail through them with a
red pen. I have wondered whether a neural network could feasibly learn
to recognize and correct unnecessary sentence complexity after
exposure to such a training set. This should not be difficult to do
over a restricted grammar.

Cheers,

Dan Mocsny
dmocsny@uceng.uc.edu