Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!rutgers!ames!ucbcad!ucbvax!aimmi.UUCP!gilbert
From: gilbert@aimmi.UUCP.UUCP
Newsgroups: comp.ai.digest
Subject: Grammar Checkers
Message-ID: <14@aimmi.UUCP>
Date: Sat, 9-May-87 14:17:44 EDT
Article-I.D.: aimmi.14
Posted: Sat May  9 14:17:44 1987
Date-Received: Thu, 14-May-87 05:56:05 EDT
References: <AIList-REQUEST.at.STRIPE.SRI.COM> <MINSKY.12299573623.BABYL@MIT-OZ>
Sender: daemon@ucbvax.BERKELEY.EDU
Reply-To: aimmi!gilbert (Gilbert Cockton)
Distribution: world
Organization: Heriot-Watt/Strathclyde Alvey MMI Unit, Scotland
Lines: 97
Approved: ailist@stripe.sri.com

In article <MINSKY.12299573623.BABYL@MIT-OZ> MINSKY@OZ.AI.MIT.EDU writes:
>I agree with Todd, Ogasawara: one should not criticise to extremes.

What does this mean? I thought accuracy was the only goal in
criticism, not avoiding the ends of some quaint invented continuum.
Can we have a style checker which rates our extremity with marks out
of 10 (0 for credulous and 10 for rampant scepticism perhaps :-))?

> I also used it to establish a "gradient".  The early
>chapters are written at a "grade level" of about 8.6 and the book ends
>up with grade levels more like 13.2 - using RightWriter's quaint
>scale.

How about MIT turning some of its resources towards VALIDATING this
quaint gradient? Do you seriously think there is any real computable ordering,
partial or otherwise, which can be applied to your chapters and
actually square up with any of our everyday evaluations of text
complexity? If so, where's the beef? How would US data square up with
European data? English teachers in the UK, for example, do not apply
unimaginative inflexible rules to students' writing, so it could be
that many educated English students will be turned off by an 8.6
introduction. Luckily we have not yet been carried away with the
belief that all complex ideas can have banal presentations without
bowdlerisation creeping in. Doubtless your style checker would ask me
to drop 'bowdlerise'? What should I have used instead, given that I
want an EXACT synonym with all its connotations? When I taught,
I would have advised my students to find a dictionary (many of them carried
them anyway - and I taught children from a wide range of cultural and
economic backgrounds). God knows what the French would say to a
mechanical style checker (a Franglais remover would go down well
though).
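
To make concrete how little such a "grade level" looks at, here is a
minimal sketch of one widely published readability formula, the
Flesch-Kincaid grade level. Whether RightWriter uses this formula is
not stated anywhere in this thread, so treat the choice as an
assumption. The function names and the vowel-run syllable counter are
hypothetical illustrations, and the syllable counter is a deliberately
crude heuristic.

```python
import re


def count_syllables(word):
    """Crude heuristic (an assumption, not a linguistic rule):
    count runs of vowels, with a minimum of one per word."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Sentences are split on runs of terminal punctuation; empty pieces dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)
```

The only inputs are sentence length and syllable counts. Nothing in the
formula consults meaning, connotation, or the reader, which is the
substance of the objection above.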

Finally, how on earth do these style checkers know which words will be
commonly understood? Surely they don't use word frequency in newspapers
or something like that? Does the overuse of a word in the media imply
universal understanding of/consensus on its meaning - e.g. 'moral',
'freedom', 'extreme', 'quaint', 'seriously', 'inflexible' etc?
Does the limited use of a word in the media imply universal ignorance
- e.g. 'ok', 'alright', 'balls', 'claptrap', 'space cadet', 'avid',
'stroppy', 'automaton'?
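
For illustration only, this is roughly what a frequency-based
"commonly understood" test would look like if a checker did work that
way. It is a sketch of the guess made above, not a description of any
real product: the function names, the reference corpus, and the
min_count threshold are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its runs of letters."""
    return re.findall(r"[a-z]+", text.lower())


def rare_words(text, corpus, min_count=2):
    """Return words in `text` seen fewer than `min_count` times in `corpus`.

    `corpus` stands in for a newspaper sample (an assumption);
    words are reported once each, in order of first appearance.
    """
    freq = Counter(tokenize(corpus))
    flagged = []
    for word in tokenize(text):
        if freq[word] < min_count and word not in flagged:
            flagged.append(word)
    return flagged
```

A test like this passes 'freedom' and 'moral' simply because they are
printed often, and flags 'claptrap' simply because it is not. It
measures exposure in the chosen corpus, not understanding.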

I would not regard any of the criticisms of style checkers I have read
as 'extreme' at all. The difference seems to be one of gross credulity
versus informed criticism. People who know nothing about good style
will believe all the things which the style checker hackers have MADE
UP - I defy any style checker implementor to point to a sound
experimental/statistical basis for the style rules they have palmed
off onto their gullible customers. Perhaps they did at least read some 
books by self-proclaimed authorities, but this would only shift the charge 
from invention to uncritical acceptance. I'd still be unimpressed.

This may sound extreme - that however is irrelevant. The point is,
am I accurate? Note that my substantial assertions are few:

	i) Style don't compute. Verify by Chinese characters test
	   between a style checker and the editors of the New Yorker
	   (US) or the Listener (UK). Other quality magazine editors
	   will do. Can you spot the editors' critiques? 

	ii) The current 'reading age' metrics have no validity.
	    They are bogus psychometric tools. Operationally I am
	    saying that there will be no strong correlation (say r >
	    0.9, p < 0.001) between the reading age of text and a
	    reader's performance on a comprehension test. Allow the
	    author to add a glossary and the correlation will weaken.
	    People can learn new words you know.

	iii) Current measures of popular understanding of words are
	     equally bogus and there is NO decent research to back them
	     up. There has been some good work on correlating
	     vocabulary with educational achievement, but this tells
	     us nothing about the typical adult's vocabulary.
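
The operational test in (ii) can be sketched as follows. This is a
minimal illustration, assuming one reading-age figure per text and one
mean comprehension score per text. The function names are hypothetical.
The 0.9 threshold is the one quoted in (ii), and it is applied to the
magnitude of r on the assumption that only the strength of the
relationship matters. The significance test (p < 0.001) needs a
t-distribution and is left out.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


def metric_survives(reading_ages, comprehension_scores, threshold=0.9):
    """True if |r| exceeds the threshold, i.e. assertion (ii) is contradicted."""
    return abs(pearson_r(reading_ages, comprehension_scores)) > threshold
```

Running the same comparison with and without a glossary attached to
each text would test the second half of (ii).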

Every assertion above is falsifiable, so let's all forget about emotive 
subjective concepts like extremity (= I disagree a lot and wish you hadn't 
said that) and get back to an objective, informed debate. The motion
is:

	"All computer based style checkers can stunt the literary
	 growth of their users"

A second-order effect is that, although 1,000 chimpanzees could
between them type out the works of Shakespeare given enough time, they
would fail miserably if their output had to be passed by a computer
style checker.

	To be, or not to be, that is the question.
	>> Sentence starts with infinitive
	   Sentence has no subject.
	Whether it is ....
	>> "Whether" may not be understood by people who just read
	    comics. (? spelling mistake = weather ?).

-- 
   Gilbert Cockton, Scottish HCI Centre, Ben Line Building, Edinburgh, EH1 1TN
   JANET:  gilbert@uk.ac.hw.aimmi    ARPA:   gilbert%aimmi.hw.ac.uk@cs.ucl.ac.uk
		UUCP:	..!{backbone}!aimmi.hw.ac.uk!gilbert