Xref: utzoo comp.ai:2389 comp.lang.prolog:1332 Path: utzoo!utgpu!water!watmath!clyde!att!rutgers!mailrus!cornell!uw-beaver!teknowledge-vaxc!sri-unix!quintus!ok From: ok@quintus.uucp (Richard A. O'Keefe) Newsgroups: comp.ai,comp.lang.prolog Subject: Re: Concept Learning & ID3 (Quinlan) - in prolog Message-ID: <543@quintus.UUCP> Date: 16 Oct 88 01:39:59 GMT References: <395@uiag.UUCP> <39500@aero.ARPA> Sender: news@quintus.UUCP Reply-To: ok@quintus.UUCP (Richard A. O'Keefe) Organization: Quintus Computer Systems, Inc. Lines: 58 In article <39500@aero.ARPA> mcguire@aero.aerospace.org (Rod McGuire) writes: >In article <395@uiag.UUCP> gerrit@uiag.UUCP (Cap Gerrit) writes: >>Does anybody out there in the world has an implementation of >>the ID3 algorithm of Quinlan >Prolog is one of the worst possible languages in which to write the >ID3 algorithm. I don't know if it can be written to be efficient. >The problem is that one needs to count up statistics in parallel and >then somewhat randomly access these statistics. While in prolog it >is possible to kludge up arrays with mutable elements, There is not the slightest need for mutable arrays in an implementation of ID3 or similar algorithms, and Prolog is not at all a poor choice. The Iterative Dichotomiser involves two kinds of steps: (1) making a sweep through the training set collecting a random sample of *incorrectly predicted* examples to add to the "window" (this is the "Iterative" part) (2) doing a kind of back-to-front radix sort on the contents of the "window" to turn it into a decision tree (this is the "Dichotomiser" part) >Since the ID3 algorithm is usually applied to a moderate amount of >data (let's say, to construct a tree discriminating 10 classes from >analysis of 5000 attribute vectors each with 20 attributes that can If you are working with tiny training sets like that, you don't need ID3. Quinlan's innovation was the "windowing" technique -- forming decision trees is nothing new, you will even find a tiny Fortran implementation in Algorithm AS 165, JRSS -- which permitted him to work with training sets having millions of examples. The "windowing" idea can be applied to many induction schemes: {Initialise} set the window to a random sample of N1 examples. {Induce} induce a "rule" from the current window. {Evaluate} make a pass through the complete training set, adding a random sample of N2 examples which are incorrectly classified (if fewer than N2 misclassifications, take all). If performance was adequate, stop. {Iterate} Go back to {Induce}. Two references: "Discovering rules by induction from large collections of examples", Quinlan, J.R. (in) Expert Systems in the micro-electronic age, (Ed. D. Michie) Edinburgh University Press, 1979 "Learning efficient classification procedures", Quinlan, J.R. (in) Machine learning: an artificial intelligence approach Eds Michalski, Carbonell, & Mitchell Tioga press, 1983 The algorithm descriptions in these articles are quite clear.