Path: utzoo!utgpu!watserv1!watmath!att!att!dptg!ulysses!andante!mit-eddie!media-lab!minsky
From: minsky@media-lab.MEDIA.MIT.EDU (Marvin Minsky)
Newsgroups: comp.ai
Subject: Re: What Has Traditional AI Accomplished?
Message-ID: <3740@media-lab.MEDIA.MIT.EDU>
Date: 19 Oct 90 14:51:13 GMT
References: <69609@lll-winken.LLNL.GOV> <1990Oct15.143325.26044@unislc.uucp>
 <1990Oct16.135631.6444@cbnewsj.att.com> <69929@lll-winken.LLNL.GOV>
Reply-To: minsky@media-lab.media.mit.edu (Marvin Minsky)
Organization: MIT Media Lab, Cambridge MA
Lines: 133

In article <69929@lll-winken.LLNL.GOV> loren@tristan.llnl.gov
(Loren Petrich) writes:

>	The simplicity of the basic algorithms keep making me wonder
>why NN's did not take off earlier -- the basic code for one takes up
>only a couple pages of Fortran or C. Try writing one yourself. I guess
>that (in)famous book by Minsky and Papert, _Perceptrons_, with its
>seemingly airtight theoretical arguments, is what had squelched the
>field for so long.

DAMMIT. Try reading the book. What happened was that the field had
already flattened out because, although Perceptrons could learn to
recognize certain patterns, they seemed unable to learn some other
kinds of patterns.

The book explicitly analyzes "three layer nets" -- input layer /
coefficients / hidden layer / coefficients / and single neuron output.
But, in fact, most theorems apply to unrestricted multilayer, loop-free
nets. This does not seem to be well-known. I assumed it was obvious.

Since no one has found any errors in those "seemingly airtight
theoretical arguments", you should try to understand what point you're
missing!

It seems strange that I should have to explain this in comp.ai, at this
late date. "Perceptrons" explained that it will be hard for such nets
to perform, for example, certain kinds of group-invariant recognition,
without duplicating hardware for every element of the group.

EXAMPLE: in a simple 100 x 100 square retina, recognize all the images
that could be reasonably described as depicting "A SQUARE INSIDE A
CIRCLE".

Loren and others are absolutely right, in that the 80's showed that ML
(multilayer) nets could be made to learn many useful patterns.
"Perceptrons" was concerned with patterns that MLs couldn't learn, not
ones they could!!!!!!!!!! So no collection of exciting stories of MLs
learning things counters the problems with what they can't learn --
like those distance invariant relationships between parts of images.

In many cases, "successful" applications of MLs depend on
pre-processing a picture image, by first normalizing it in size, and
then centering it, before presenting it to the ML. Fine - but don't
tell people that this refutes the Minsky-Papert theorems. Instead, now
try to do that "circle-inside-square" problem! And then realize that
many real-world problems require multiple normalizations, which cannot
be pre-computed until you have picked out the sub-patterns.

In that connection, there is wisdom in Thomas G Edwards' remarks in
<6664@jhunix.HCF.JHU.EDU>:

   ... Cascade-Correlation is a NN algorithm which is able to solve
   many problems which were difficult for homogenous NNs to solve.
   ... I see a future where inductive learning by small homogeneous
   NNs is used in combination with more traditional AI type goal
   building. Cascade-Correlation is a step in that direction.
   Divide-and-conquer of traditional AI is combined with the easy
   inductive learning of traditional NNs. Of course, the trick is to
   couch this in a connectionist framework to continue to allow for
   fast parallel computation.

Divide-and-conquer is surely needed for circle-inside-square. Note
that we still don't know how the brain does it.

Get with it, guys! Of course there are many exciting things that can
be done with ML networks. A good deal of the brain is made of them.
And there is a lot that requires non-ML networks, and a lot of the
brain is non-ML.

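[Editorial illustration, not part of the original post.] The
"normalize in size, then center" pre-processing mentioned above can be
sketched roughly as follows. This is only a minimal sketch under
stated assumptions: the image is a rectangular list of lists of 0/1
pixels, and the names normalize and square, the 8 x 8 canvas, and the
nearest-neighbor resampling are all hypothetical choices made for the
illustration, not anything taken from the post or from any particular
system.

```python
def normalize(image, size=8):
    """Crop a binary image to its bounding box, scale it so the longer
    side fills `size` pixels, and center it on a size x size canvas.
    (Hypothetical helper; nearest-neighbor resampling is an assumption.)"""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows:  # blank image: nothing to normalize
        return [[0] * size for _ in range(size)]
    top, left = rows[0], cols[0]
    h = rows[-1] - top + 1
    w = cols[-1] - left + 1
    scale = size / max(h, w)              # uniform scale factor
    out_h = max(1, round(h * scale))
    out_w = max(1, round(w * scale))
    off_r = (size - out_h) // 2           # offsets that center the figure
    off_c = (size - out_w) // 2
    canvas = [[0] * size for _ in range(size)]
    for r in range(out_h):
        for c in range(out_w):
            src_r = top + min(h - 1, int(r / scale))
            src_c = left + min(w - 1, int(c / scale))
            canvas[off_r + r][off_c + c] = image[src_r][src_c]
    return canvas


def square(n, top, left, side):
    """Build an n x n blank image containing one filled square (test data)."""
    img = [[0] * n for _ in range(n)]
    for r in range(top, top + side):
        for c in range(left, left + side):
            img[r][c] = 1
    return img


# Two squares of different size and position map to the same canvas.
a = normalize(square(10, 1, 1, 2))
b = normalize(square(10, 5, 3, 4))
print(a == b)
```

Note that this normalizes the picture as a whole. It does nothing for
a scene in which the circle and the square each need their own
normalization, which is the difficulty the post raises: those
normalizations cannot be computed until the sub-patterns have been
picked out.
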
Instead of bashing "Perceptrons", you should use it as a model, and try
to find more general statements about what ML and other networks can
do, and what their limitations are.

What we don't need are intemperate remarks like those in , who seems
to deliberately misinterpret everything I have said in this group and
other places. I don't know why he's so angry at me. For example, in
one message to this group I said:

"... Where is the "traditional, symbolic, AI in the brain"? The answer
seems to have escaped almost everyone on both sides of this great and
spurious controversy! The 'traditional AI' lies in the genetic
specifications of those functional interconnections: the bus layout of
the rel A large, perhaps messy software is there before your eyes,
hiding in the gross anatomy. Some 3000 "rules" about which sub-NN's
should do what, and under which conditions, as dictated by the results
of computations done in other NNs...."

Pollack replied with this weird objection:

"I have to admit this is definitely a novel version of the homunculus
fallacy: If we can't find him in the brain, he must be in the DNA! Of
all the data and theories on cellular division and specialization and
on the wiring of neural pathways I have come across, none have
indicated that DNA is using means-ends analysis."

And then he proceeded to make the same points that I have been making,
as though they were different from what I was saying:

"Certainly, connectionist models are very easy to decimate when offered
up as STRONG models of children learning language, of real brains, of
spin glasses, quantum calculators, or whatever. That is why I view
them as working systems which just illuminate the representation and
search processes (and computational theories) which COULD arise in
natural systems.

There is plenty of evidence of convergence between representations
found in the brain and backprop or similar procedures despite the lack
of any strong hardware equivalence (Anderson, Linsker); constrain the
mapping task correctly, and local optimization techniques will find
quite similar solutions."

It is the same thing again. Yes, you can find things nets do, but it's
like bad statistics in which you don't describe what you're testing for
until after the experiment is done. Let's see an ML solve
circle-in-square. Let's see one of Pollack's massively parallel
parsers solve circle-in-square. Without any "strong hardware"
pre-figuring of the network.

In fact, Pollack's next paragraph begins with

"Furthermore, the representations and processes discovered by
connectionist models may have interesting scaling properties and can be
given plausible adaptive accounts."

Is he angry at me because the required scaling properties for human
visual perception are not among those possessed by the NN models he
advocates? I don't know, but there must be some reason for his rage.

He finishes with,

"On the other hand, I take it as a weakness of a theory of
intelligence, mind or language if, when pressed to reveal its origin,
shows me a homunculus, unbounded nativism, or some evolutionary
accident with the same probability of occurrence as God."

Is this a paraphrase of the beginning of "Society of Mind", or does
Pollack think it is opposing it?

Come on, Jordan. We're on the same side. Yet you have been writing the
most hostile and savage reviews of my work. What's the deal here?