Path: utzoo!mnetor!uunet!husc6!mailrus!ames!pasteur!ucbvax!ADS.COM!stuart%warhol
From: stuart%warhol@ADS.COM (Stuart Crawford)
Newsgroups: comp.ai.digest
Subject: Re: Exciting work in AI
Message-ID: <3671@zodiac.UUCP>
Date: 2 May 88 20:42:46 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Reply-To: stuart@ads.com (Stuart Crawford)
Organization: Advanced Decision Systems, Mt. View, CA (415) 941-3912
Lines: 52
Approved: ailist@kl.sri.com


Wray Buntine (wray@nswitgould.oz)  writes:
> Ross's original ID3 work (and the stuff usually reported in Machine Learning
> overviews) and much subsequent work by him and others (e.g. pruning)
> actually fails the "real AI" test.  It was independently developed by
> a group of applied statisticians in the 70's and is well known
>       Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J. (1984)
>       "Classification and Regression Trees", Wadsworth
> Ross's more recent work does significantly improve on Breiman et al.'s stuff.
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

How?  If you mean his stuff on generating production rules from decision trees,
I think you're missing the point of CART.  It seems to me that simply
transforming decision trees into production rules is a rather uninteresting
exercise.  Quinlan tries to motivate the idea by suggesting that the generated
rules are an "improvement" over the induced tree because they are both easier
to interpret and more parsimonious.  I disagree that they are easier to
interpret, and they are more parsimonious only if your original induction
algorithm has not already pruned the tree.  Using production rule generation as
an alternative to tree pruning strikes me as the wrong approach.  I still feel
that CART is the induction procedure of choice because of the following:

1. generates parsimonious trees
2. handles noisy, incomplete data
3. has strong, well-understood asymptotic properties
4. allows user-defined priors and cost-functions
5. delivers attribute-importance diagnostics
6. can induce rules incrementally
7. delivers low-bias, low-variance estimates of misclassification rate
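
To make item 4 concrete, here is a minimal sketch of how user-defined priors
and misclassification costs can enter the Gini splitting criterion.  It is an
illustration only, not code from CART itself: the function names, the
dictionary layout, and the convention that cost[(i, j)] is the cost of
predicting class i when the true class is j are all assumptions of mine.  Note
that the criterion weights each pair by p_i * p_j, so only the symmetrized
costs matter.

```python
from collections import Counter


def class_probs(node_labels, all_labels, priors=None):
    """Estimate p(j | t) for a non-empty node t, optionally with user priors."""
    node_counts = Counter(node_labels)
    total_counts = Counter(all_labels)
    classes = sorted(total_counts)
    if priors is None:
        # Default: take the priors from the class proportions in the data.
        n = len(all_labels)
        priors = {c: total_counts[c] / n for c in classes}
    # p(j, t) is proportional to prior_j * N_j(t) / N_j; normalise over classes.
    joint = {c: priors[c] * node_counts[c] / total_counts[c] for c in classes}
    z = sum(joint.values())
    return {c: joint[c] / z for c in classes}


def gini(probs, cost=None):
    """Gini impurity: sum over i != j of cost[(i, j)] * p_i * p_j."""
    total = 0.0
    for i in probs:
        for j in probs:
            if i != j:
                c = 1.0 if cost is None else cost[(i, j)]
                total += c * probs[i] * probs[j]
    return total
```

With no priors and no costs this reduces to the ordinary Gini index computed
from the node's class proportions.  Supplying equal priors on an unbalanced
sample, or a cost table that penalizes one kind of error more heavily, changes
which splits look attractive.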

For references on 1-5, see Breiman et al. (1984), and for 6 and 7 see
Crawford, S., "Extensions to the CART Algorithm", Proceedings of the Knowledge
Acquisition for Knowledge-Based Systems Workshop (1987).
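
Returning to the production-rule point: the sketch below shows why I call the
bare transformation uninteresting.  It walks a tree and emits one rule per
leaf, so the rule set is exactly as large as the tree it came from, and any
gain in parsimony has to come from a separate simplification step.  The
nested-tuple tree encoding and the function names are hypothetical, chosen
only for illustration; this is not Quinlan's procedure.

```python
# Hypothetical encoding: ("leaf", label) or
# ("split", attribute, threshold, left_subtree, right_subtree).


def tree_to_rules(tree, conditions=()):
    """Return one (conditions, label) rule for every leaf of the tree."""
    if tree[0] == "leaf":
        return [(list(conditions), tree[1])]
    _, attr, threshold, left, right = tree
    # Left branch: attribute <= threshold; right branch: attribute > threshold.
    rules = tree_to_rules(left, conditions + ((attr, "<=", threshold),))
    rules += tree_to_rules(right, conditions + ((attr, ">", threshold),))
    return rules


def count_leaves(tree):
    """Count the leaves of the tree."""
    if tree[0] == "leaf":
        return 1
    return count_leaves(tree[3]) + count_leaves(tree[4])


example = ("split", "x", 2.0,
           ("leaf", "A"),
           ("split", "y", 1.0, ("leaf", "B"), ("leaf", "A")))

for conds, label in tree_to_rules(example):
    body = " and ".join(f"{a} {op} {t}" for a, op, t in conds)
    print(f"if {body} then {label}")
```

For any tree in this encoding, len(tree_to_rules(tree)) equals
count_leaves(tree).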

I also find somewhat curious Buntine's suggestion that Quinlan's most recent
work, 

> is closer to real AI (e.g. concern for comprehensibility),
> though it still has an applied statistics flavour.

I would suggest that the work has an applied statistics flavor because it is
attempting to solve an applied statistics problem.

--------------------------------------
Stuart Crawford
stuart@ads.com
Advanced Decision Systems
1500 Plymouth Street
Mountain View, CA 94043

