Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!mit-eddie!media-lab!minsky
From: minsky@media-lab.MEDIA.MIT.EDU (Marvin Minsky)
Newsgroups: comp.ai.philosophy
Subject: Re: Reasoning Paradigms
Message-ID: <3593@media-lab.MEDIA.MIT.EDU>
Date: 6 Oct 90 04:55:54 GMT
References: <9963@ccncsu.ColoState.EDU> <3586@media-lab.MEDIA.MIT.EDU> <69347@lll-winken.LLNL.GOV>
Reply-To: minsky@media-lab.media.mit.edu (Marvin Minsky)
Organization: MIT Media Lab, Cambridge MA
Lines: 128

I agree with most of what loren@tristan.llnl.gov (Loren Petrich) said in article 62. The only problem I have is with his assertion

> I feel that there is much more promise in NN's than in traditional
> AI, which has been dependent on working out decision rules explicitly.

It is not an either-or thing, in my view. NN's are strong in learning to recognize (some) patterns in which something depends on many other things through relatively weak dependencies. NN's can represent such relationships when they have good linear approximations -- but, probably, only in those domains. We don't know a lot about how to characterize those domains. But lots of human pattern recognition machinery probably uses this.

On the other side, the PROCEDURES that can be represented in NN's are very limited, certainly in the non-cyclic nets that dominate the work of the 80s. This means that, without a lot of external script-like control, it will be hard for them to reason about what they have recognized. A careful re-reading of "Perceptrons" will show that virtually all the negative results therein still hold for multi-layer noncyclic networks -- especially theorems like the AND-OR theorem, which show why an NN that recognizes parts may not be able to (learn to) recognize when those parts have particular relationships, etc.

I could go on about this, but the point is this:

1. Yes: systems with compact rules with very few input terms are not good at recognizing patterns which need many inputs.
So AI systems restricted to compact rules must be supplemented by NN-like structures.

2. No: the NN-like structures cannot replace the "reasoning systems" of "traditional AI", unless we supply architectures that embody those goal-oriented processes. For example, "annealing" does not replace all other kinds of intelligent heuristic search.

A tricky fallacy is to think, "Golly, I have now seen NN's solve a hundred problems in the last five years that 'old AI' couldn't solve." What's wrong with that is (i) you can look at it the other way: let's see NNs learn to solve formal integration problems, or similar problems that involve dissection of descriptions, and (ii) many of those problems NNs can solve can also be solved by other kinds of analysis -- and, sometimes, in ways that lend themselves to being usable in OTHER situations.

NN solutions, in contrast, tend to be dead ends in this sense, simply because what you end up with, after your 100,000 steps of hill-climbing, is an opaque vector of coefficients. You have solved the problem, all right. You have even _learned_ the solution! But you don't end up with anything you can THINK about!

Is that bad? Your locomotion system "learns" to walk, all right. (It begins with an architecture of NN's that wonderfully work to adjust your reflexes.) But "you" don't know anything of how it's done. Even Professors of Locomotion Science are still working out theories about such things.

So maybe you can make a pretty good dog with NNs. And note that I put NNs in the plural! A dog, or a human, learns by using a brain that consists of (I estimate) some 400 clearly distinct NN architectures and perhaps 3000 distinct busses or bundles of specialized interconnections.

What does that mean? Answer: some of the job is done by NNs. And some of the job is done by compactly-describable procedural specifications. Where is the "traditional, symbolic, AI in the brain"?
The answer seems to have escaped almost everyone on both sides of this great and spurious controversy! The 'traditional AI' lies in the genetic specifications of those functional interconnections: the bus layout of the relations between the low-level networks. A large, perhaps messy, piece of software is there before your eyes, hiding in the gross anatomy. Some 3000 "rules" about which sub-NN's should do what, and under which conditions, as dictated by the results of computations done in other NNs (see the idea of "B-brain" in my book).

Someone might object that this may be an accident. In a few years, perhaps, someone will find a new learning algorithm through which a single, homogeneous NN (highly cyclic, of course) can start from nothing and learn to become very smart, without any of that higher-level stuff encoded into its anatomy -- and all in some reasonable amount of time. That is the question, and I see no reason to think that present-day results are very encouraging.

-----

Here is a simple, if abstract, example of what I mean. Consider one of the most powerful ideas in traditional AI -- the concept of achieving a goal by detecting differences between the present situation ("what you have") and a target situation ("what you want"). The Newell and Simon 'GPS' system did such things (and worked in many cases, but not all) by trying various experiments and comparing the results, and then applying strategies designed (or learned) for 'reducing' those differences.

In order to do this, common sense would suggest, you need resources for storing away the various recent results, and then pulling them out for comparisons. This is easily done with the equivalent of registers, or short-term memories -- and it seems, from a behavioral viewpoint, that human brains are equipped with modest numbers of such structures. Now, in fact, no one knows the physiology of this.
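
A toy sketch of the difference-reducing loop described above may make the role of those short-term stores concrete. This is not the Newell and Simon GPS program; it is a minimal Python illustration under assumptions made here for the example: situations are sets of fact strings, and the Operator class, the achieve and achieve_all functions, and the monkey-and-bananas operator table are all hypothetical names. The goal_stack tuple stands in for the small set of registers that hold pending subgoals while the current situation is compared against them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A hypothetical action: needs `preconditions`, then adds and deletes facts."""
    name: str
    preconditions: frozenset
    adds: frozenset
    deletes: frozenset = frozenset()


def achieve(state, goal, operators, goal_stack=()):
    """Remove one difference (`goal` missing from `state`).

    Returns (new_state, plan) or None if no operator sequence is found.
    """
    if goal in state:          # no difference left for this goal
        return state, []
    if goal in goal_stack:     # already pending in "short-term memory": avoid looping
        return None
    for op in operators:
        if goal not in op.adds:
            continue           # this operator does not reduce the difference
        # Subgoal: first make the operator applicable.
        result = achieve_all(state, op.preconditions, operators,
                             goal_stack + (goal,))
        if result is None:
            continue
        ready_state, plan = result
        new_state = (ready_state - op.deletes) | op.adds
        return new_state, plan + [op.name]
    return None


def achieve_all(state, goals, operators, goal_stack=()):
    """Achieve every goal in turn, then re-check that none was undone."""
    plan = []
    for goal in sorted(goals):  # fixed order keeps the sketch deterministic
        result = achieve(state, goal, operators, goal_stack)
        if result is None:
            return None
        state, steps = result
        plan += steps
    if not goals <= state:      # a later step clobbered an earlier goal: give up
        return None
    return state, plan


# Assumed toy domain (not from the post): a monkey reaching bananas.
OPERATORS = [
    Operator("walk-to-box", frozenset({"on-floor"}),
             frozenset({"at-box"}), frozenset({"at-door"})),
    Operator("push-box-under-bananas", frozenset({"at-box"}),
             frozenset({"box-under-bananas"})),
    Operator("climb-box", frozenset({"at-box", "box-under-bananas"}),
             frozenset({"on-box"}), frozenset({"on-floor"})),
    Operator("grasp-bananas", frozenset({"on-box"}),
             frozenset({"has-bananas"})),
]
START = frozenset({"at-door", "on-floor"})

final_state, plan = achieve_all(START, frozenset({"has-bananas"}), OPERATORS)
print(plan)
# ['walk-to-box', 'push-box-under-bananas', 'climb-box', 'grasp-bananas']
```

Note that the sketch simply gives up when achieving one goal undoes another, and it never backtracks over the order of goals, so, like the system it imitates, it works in many cases but not all.
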
In "Society of Mind" I conjecture that many of our brain NN's are especially equipped with what I call "temporary K-lines" or "pronomes" that are used for such purposes. (Their activities are controlled by other NN's that somehow learn new control-scripts for managing those short-term memories.)

Well, if you design NNs with such facilities, then it will not be very hard to get them to solve symbolic, analytic problems. If you don't provide them with that sort of hardware, everything will get too muddled, and (I predict) they'll "never" get very far. It will be like trying to teach your dog to do calculus.

An alternative will be to design a fiendishly clever pre-training scheme which "teaches" your NN, first, to build inside itself some registers. This might indeed be feasible, with a homogeneous NN, under certain conditions. But it wouldn't be exactly a refutation of what I said before, because it would involve, not the NN itself "discovering" an adequate architecture, but an external teacher's deliberately imposing that architecture on the NN's future development. (Even this is not all-or-none, because there is clearly some such trade-off in human development which, according to all accounts, will fail in the absence of any attentive adult caretaker.) Oh well.

----------

In any case, I want to thank Loren for endless thoughtful observations about many other topics. I intend to think more about what he said here.