Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!uunet!aplcen!uakari.primate.wisc.edu!unmvax!ariel.unm.edu!hooey.unm.edu!bill
From: bill@hooey.unm.edu (william horne)
Newsgroups: comp.ai.neural-nets
Subject: Re: Observations on the State of NN theory
Keywords: Genetic Neural Training Pepsi
Message-ID: <1990Aug3.175023.28210@ariel.unm.edu>
Date: 3 Aug 90 17:50:23 GMT
References:
Sender: usenet@ariel.unm.edu (USENET News System)
Organization: University of New Mexico, Albuquerque
Lines: 47

In article spoffojj@hq.af.mil (Jason Spofford) writes:
>I would like to hear some reactions to the following generalizations:
>
> .......
>2. Several training algorithms have been developed. Each algorithm
>works on a small subset of NN architectures, usually with a particular
>neuron model. The training algorithms, even when all combined, only
>use a small percentage of the possible NN architectures.
>
> .......
>5. Each training algorithm can only solve a narrow class of problems.
>
>As you may of gathered from my previous post, I am working on applying
>the genetic algorithm to solving NN problems. My hope, and that is
>what it is at this point, is to develop a GA that makes no assumptions
>or restrictions on NN architectures and that can solve a wide class of
>problems. I'd like to think I'm attacking the metaproblem of NN's,
>artificially developing NN's in a way not too unlike biological
>systems.
>

Here's my $0.02.....

I don't think GAs have much to offer as learning techniques for
networks that already have a good gradient search technique (e.g.
MLPs, recurrent networks), especially when these networks use
floating point weight representations. The learning algorithm for
these networks can always be cast in terms of minimizing some
criterion function, and as a result can be viewed as a search of an
error surface in the weight space.
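To make the "search of an error surface in the weight space" view concrete, here is a minimal sketch of a plain gradient search. Everything specific in it is an assumption chosen for illustration and not something from the original discussion: the toy task (logical OR), the single sigmoid unit, the sum-of-squares criterion, and the learning rate.

```python
import math

# Toy training set (an assumption for illustration): logical OR.
# Each input carries a constant 1.0 so the last weight acts as a bias.
DATA = [((0.0, 0.0, 1.0), 0.0),
        ((0.0, 1.0, 1.0), 1.0),
        ((1.0, 0.0, 1.0), 1.0),
        ((1.0, 1.0, 1.0), 1.0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def output(w, x):
    """Output of a single sigmoid unit with weight vector w."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def criterion(w):
    """Sum-of-squares error E(w): the height of the error surface at w."""
    return 0.5 * sum((output(w, x) - t) ** 2 for x, t in DATA)

def gradient(w):
    """Analytic gradient dE/dw for the sigmoid unit."""
    g = [0.0] * len(w)
    for x, t in DATA:
        y = output(w, x)
        delta = (y - t) * y * (1.0 - y)
        for i, xi in enumerate(x):
            g[i] += delta * xi
    return g

def gradient_search(w, rate=0.5, steps=2000):
    """Repeatedly step downhill on the error surface."""
    for _ in range(steps):
        g = gradient(w)
        w = [wi - rate * gi for wi, gi in zip(w, g)]
    return w

w0 = [0.0, 0.0, 0.0]
w = gradient_search(w0)
print("error before:", criterion(w0))
print("error after: ", criterion(w))
```

Each step uses the local slope of the criterion to decide where to look next, which is exactly the information a bit-string search does not have.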
My experience with GAs has been that they are terrible at searching
the bizarre error surfaces associated with something like an MLP; in
fact, they are no better than a completely random search. This seems
to be because the bits in a floating point representation are highly
correlated with each other. There are things you can do to mitigate
this, like Gray coding and not allowing crossovers in the middle of a
32-bit word. These modifications seem to improve the performance of
the GA, but not to the point where it is competitive with a simple
gradient search.

I always thought GAs were fine if your search space consisted of
attributes which are binary and not highly correlated. I don't see
them as particularly appropriate as learning algorithms for these
types of networks. Maybe they are good for other types of networks I
haven't considered closely. In any case, I don't think they are the
global solution to NN learning.

Feel free to flame this...

-Bill
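As a rough illustration of the point about floating point bits, the sketch below flips single bits in the 32-bit encoding of a weight and shows how unevenly the value responds. It also shows the Gray coding idea on a plain integer encoding. Two assumptions here are mine and not from the post: that the 32-bit word is an IEEE 754 single, and that Gray coding is applied to an integer (fixed-point) weight code. All helper names are hypothetical.

```python
import struct

def float_to_bits(x):
    """Bit pattern of x as an IEEE 754 single, as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def bits_to_float(b):
    """Inverse of float_to_bits."""
    return struct.unpack(">f", struct.pack(">I", b))[0]

def flip_bit(x, k):
    """Flip bit k (0 = least significant) in the encoding of x."""
    return bits_to_float(float_to_bits(x) ^ (1 << k))

def gray_encode(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def gray_decode(g):
    """Inverse of gray_encode."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

weight = 0.5
for k in (0, 22, 23, 30, 31):
    # Mantissa, exponent, and sign bits have wildly different effects.
    print("flip bit", k, "->", flip_bit(weight, k))

# Adjacent integers can differ in many bits in plain binary,
# but always differ in exactly one bit after Gray coding.
print(hamming(7, 8), hamming(gray_encode(7), gray_encode(8)))
```

A one-bit mutation can be a negligible nudge or a jump of many orders of magnitude depending on where it lands, so the bits are far from independent, equally weighted attributes. Gray coding smooths the integer case but does nothing about the exponent field of a float.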