Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!uunet!aplcen!uakari.primate.wisc.edu!unmvax!ariel.unm.edu!hooey.unm.edu!bill
From: bill@hooey.unm.edu (william horne)
Newsgroups: comp.ai.neural-nets
Subject: Re: Observations on the State of NN theory
Keywords: Genetic Neural Training Pepsi
Message-ID: <1990Aug3.175023.28210@ariel.unm.edu>
Date: 3 Aug 90 17:50:23 GMT
References: <spoffojj.649607641@lgn>
Sender: usenet@ariel.unm.edu (USENET News System)
Organization: University of New Mexico, Albuquerque
Lines: 47

In article <spoffojj.649607641@lgn> spoffojj@hq.af.mil (Jason Spofford) writes:
>I would like to hear some reactions to the following generalizations:
> .......
>2.  Several training algorithms have been developed. Each algorithm
>works on a small subset of NN architectures, usually with a particular
>neuron model. The training algorithms, even when all combined, only
>use a small percentage of the possible NN architectures.
> .......
>5. Each training algorithm can only solve a narrow class of problems. 
>
>As you may have gathered from my previous post, I am working on applying
>the genetic algorithm to solving NN problems. My hope, and that is
>what it is at this point, is to develop a GA that makes no assumptions
>or restrictions on NN architectures and that can solve a wide class of
>problems. I'd like to think I'm attacking the metaproblem of NN's,
>artificially developing NN's in a way not too unlike biological
>systems.
>

Here's my $0.02.....

I don't think GAs have much to offer as learning techniques for networks
that already have a good gradient search technique (e.g. MLPs, recurrent
networks, etc.), especially when those networks use floating point weight
representations.

The learning algorithm for these networks can always be cast in 
terms of minimizing some criterion function, and as a result can be viewed
as a search of an error surface in the weight space.  My experience with
GAs has been that they are terrible at searching the bizarre error surfaces
associated with something like MLPs; in fact, they are no better than a
completely random search.  This seems to be due to the fact that the bits
in floating point representations are highly correlated with each other.
There are things you can do to reduce this, like Gray coding and not allowing
crossovers in the middle of a 32-bit word, etc.  These tricks seem
to improve the performance of the GA, but not to the point where it is
competitive with a simple gradient search.
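
To make the bit-correlation point concrete, here is a rough Python sketch
(mine, not taken from any particular GA package).  It assumes weights are
stored as IEEE 754 single-precision floats, and all the function names are
made up for illustration.  It shows how much a single bit flip can move a
weight, a Gray code for the case where a weight is encoded as a plain
integer (fixed point) instead, and a crossover that only cuts between
whole weights.

```python
import random
import struct

def float_to_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def bits_to_float(b):
    """Inverse of float_to_bits: rebuild the float from a 32-bit pattern."""
    return struct.unpack(">f", struct.pack(">I", b))[0]

def flip_bit(x, k):
    """Flip bit k (0 = least significant) in the float32 encoding of x."""
    return bits_to_float(float_to_bits(x) ^ (1 << k))

def gray_encode(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def gray_decode(g):
    """Inverse of gray_encode."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def word_aligned_crossover(parent_a, parent_b, rng):
    """One-point crossover on lists of weights.

    The cut falls between weights, never inside one, which is the
    list-level equivalent of forbidding cuts in the middle of a word.
    Both parents must have the same length (at least 2).
    """
    cut = rng.randrange(1, len(parent_a))
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2
```

Starting from a weight of 0.5, flipping bit 0 changes it by about 6e-8,
flipping bit 22 gives 0.75, bit 23 gives 1.0, bit 30 gives 2^127, and
bit 31 gives -0.5.  So the effect of a mutation depends almost entirely
on which bit it happens to hit, and a crossover cut inside the word can
pair one parent's exponent with the other's mantissa.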

I always thought GAs were fine if your search space consisted of attributes
which are binary and not highly correlated.  I don't see them as particularly
appropriate as learning algorithms for these types of networks.  Maybe
they are good for other types of networks I haven't considered closely.
In any case I don't think they are the global solution to NN learning.
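
For contrast, here is the kind of search space I mean by "binary and not
highly correlated".  This is a minimal sketch using a toy fitness function
(count the 1 bits, often called OneMax), which I picked only because every
bit contributes independently.  The population size, mutation rate, and
the use of truncation selection with elitism are arbitrary assumptions,
not a recommended setup.

```python
import random

def one_max(bits):
    """Toy fitness: number of 1 bits; each bit contributes independently."""
    return sum(bits)

def run_ga(n_bits=32, pop_size=40, generations=60, p_mut=0.02, seed=0):
    """Run a very small GA on one_max and return (best, history).

    history holds the best fitness seen at each generation.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=one_max, reverse=True)
        history.append(one_max(pop[0]))
        parents = pop[: pop_size // 2]   # truncation selection: keep top half
        children = [pop[0][:]]           # elitism: best individual survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)          # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation, applied independently to each position.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children
    pop.sort(key=one_max, reverse=True)
    history.append(one_max(pop[0]))
    return pop[0], history

if __name__ == "__main__":
    best, history = run_ga()
    print("best fitness per generation:", history)
```

Here a crossover cut anywhere in the string still hands the child useful
pieces from both parents, which is exactly what does not hold for the
bits of a floating point weight.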

Feel free to flame this...

-Bill