Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!aplcen!jhunix!ins_atge
From: ins_atge@jhunix.HCF.JHU.EDU (Thomas G Edwards)
Newsgroups: comp.ai.neural-nets
Subject: Re: State of the Art Feed-Forward Network Training Algorithms
Summary: CG etc
Message-ID: <8412@jhunix.HCF.JHU.EDU>
Date: 18 May 91 03:24:18 GMT
References: <1991May17.090435.9180@fwi.uva.nl>
Organization: The Johns Hopkins University - HCF
Lines: 36

In article greenba@gambia.crd.ge.com (ben a green) writes:
>In article <1991May17.090435.9180@fwi.uva.nl> smagt@fwi.uva.nl (Patrick van der Smagt) writes
> aj3u@opal.cs.virginia.edu (Asim Jalis) writes:
> >What is the state of the art in training feed-forward networks.
> I myself haven't used error back-propagation for over a year, but CG
> instead. It sizzles.
>My implementation of CG trained to 90% on this problem in 1676 presentations
>of the training set. That's a factor of 89 faster than backprop.

I think it is important to point out that backpropagation refers to a
method of computing the error gradient with respect to the weights.
What you then do with that gradient is a separate choice. One might
use simple gradient descent, steepest descent with a line search, CG
(which really rocks when properly done), or modified Newton methods
(which can go even faster than CG, but not by a heck of a lot). So the
speedup quoted above is really CG versus simple gradient descent; both
use the same backpropagated gradient.

Someone at Oregon Graduate Institute used to have a good CG program
available via anonymous ftp. I have used that implementation, and it
was exceedingly fast. Yes, you'd find local minima in small problems,
but in most normal-size problems there were none.

I would recommend that people interested in training NNs look into
Cascade-Correlation (Fahlman, TR available on
cheops.cis.ohio-state.edu in /pub/neuroprose I believe). It builds up
a network with a minimal number of hidden units (relatively minimal, I
don't think it is optimally minimal), and all learning is done on a
single layer of weights at a time, so no nasty backprop pass.
It is exceedingly fast, especially if you use something better than
simple gradient descent for the correlation and error-minimization
phases.

Cascade-Correlation has recently been extended to recurrent nets, and
I plan to see how it works on a solar activity predictor over the next
3 months.

-Thomas Edwards
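
To illustrate the point that backpropagation only supplies the gradient, while the weight update rule is a separate choice, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration and is not taken from the CG programs mentioned in the post: the 2-2-1 tanh network, the XOR data, the weight layout, the backtracking line search, the Polak-Ribiere form of CG, and the names `loss_and_grad`, `line_search`, and `minimize`. The same backprop gradient routine is handed to both steepest descent and CG.

```python
import math
import random

# XOR patterns as ((x1, x2), target). Toy data chosen for this sketch only.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
        ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
N_HID = 2                  # hypothetical size: 2 inputs, 2 tanh hidden, 1 linear output
N_WEIGHTS = 4 * N_HID + 1  # 3 per hidden unit, N_HID output weights, 1 output bias


def loss_and_grad(w):
    """Sum-squared error over DATA and its gradient, computed by backprop.

    Assumed layout of w: hidden unit j uses w[3j], w[3j+1] (input weights)
    and w[3j+2] (bias); then N_HID output weights; then the output bias.
    """
    out = 3 * N_HID  # index of the first output weight
    loss, grad = 0.0, [0.0] * len(w)
    for (x1, x2), target in DATA:
        # Forward pass.
        h = [math.tanh(w[3*j] * x1 + w[3*j + 1] * x2 + w[3*j + 2])
             for j in range(N_HID)]
        y = sum(w[out + j] * h[j] for j in range(N_HID)) + w[out + N_HID]
        err = y - target
        loss += 0.5 * err * err
        # Backward pass: push err back through the output and hidden layers.
        grad[out + N_HID] += err
        for j in range(N_HID):
            grad[out + j] += err * h[j]
            delta = err * w[out + j] * (1.0 - h[j] * h[j])
            grad[3*j] += delta * x1
            grad[3*j + 1] += delta * x2
            grad[3*j + 2] += delta
    return loss, grad


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def line_search(f, w, d, loss, g):
    """Backtracking (Armijo) search along d. Returns (w, loss, grad)."""
    slope, step = dot(g, d), 1.0
    for _ in range(40):
        trial = [wi + step * di for wi, di in zip(w, d)]
        trial_loss, trial_grad = f(trial)
        if trial_loss <= loss + 1e-4 * step * slope:
            return trial, trial_loss, trial_grad
        step *= 0.5
    return w, loss, g  # no acceptable step found, stay put


def minimize(f, w, iters, use_cg):
    """Steepest descent with line search, or Polak-Ribiere CG if use_cg."""
    loss, g = f(w)
    d = [-gi for gi in g]
    for _ in range(iters):
        gg = dot(g, g)
        if gg == 0.0:
            break  # zero gradient, nothing left to do
        if dot(g, d) >= 0.0:
            d = [-gi for gi in g]  # not a descent direction: restart
        w, loss, g_new = line_search(f, w, d, loss, g)
        beta = 0.0
        if use_cg:
            # Polak-Ribiere coefficient, clipped at zero (automatic restart).
            beta = max(0.0, dot(g_new, [a - b for a, b in zip(g_new, g)]) / gg)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return w, loss


if __name__ == "__main__":
    random.seed(0)
    w0 = [random.uniform(-0.5, 0.5) for _ in range(N_WEIGHTS)]
    print("initial loss:", loss_and_grad(w0)[0])
    print("steepest descent:", minimize(loss_and_grad, w0, 50, False)[1])
    print("conjugate gradient:", minimize(loss_and_grad, w0, 50, True)[1])
```

Only `minimize` differs between the two runs; `loss_and_grad` is untouched, which is the distinction drawn above. How much CG helps depends on the problem and on the quality of the line search, so this toy should not be read as reproducing the speedups quoted in the thread.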