Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!ucsd!sdcc6!beowulf!pluto
From: pluto@beowulf.ucsd.edu (Mark Plutowski)
Newsgroups: comp.ai.neural-nets
Subject: Re: Back-Propagation Weight Initialization
Message-ID:
Date: 12 Dec 90 01:14:50 GMT
References: <1325@helens.Stanford.EDU> <1990Dec11.091222.3501@neural.dynas.se>
Sender: news@sdcc6.ucsd.edu
Lines: 32
Nntp-Posting-Host: beowulf.ucsd.edu

egel@neural.dynas.se (Peter Egelberg) writes:

>The improvement in learning speed sounds fine. But what about generalization.
>Does weight initialization improve generalization?
. . .
>I'm not saying that learning speed is unimportant. I'm saying that
>generalization is a greater problem when using neural networks in real world
>applications.
>--
>Peter Egelberg E-mail: egel@neural.dynas.se
>Neural AB Phone: +46 46 11 00 90
>Otto Lindbladsv. 5
>223 65 LUND, SWEDEN

True, many people are more concerned with the quality of the fit
(that is, the accuracy of generalization, in your terms) than with
learning time.

IMHO, it is probably the case that the initial choice of weights has a
significant effect upon the quality of the fit achieved for a particular
learning run, unless we use a learning rule which is insensitive to
the initial parameterization of the network function.

Gradient descent is not such a learning rule, in and of itself; if
complemented with a global search mechanism for searching the parameter
space (say, via genetic algorithms or even a grid search), it can be.

-=-=
M.E. Plutowski, pluto%cs@ucsd.edu

UCSD, Computer Science and Engineering 0114
La Jolla, California 92093-0114
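
A minimal sketch of the point about gradient descent and initialization,
under loud assumptions: the "network" is replaced by a hypothetical
one-parameter loss with two minima, and the names loss, grad,
gradient_descent and grid_search_descent are invented for illustration.
None of this comes from the post itself. It only shows that plain gradient
descent ends up in whichever basin the starting weight lies in, while a
coarse grid of starting points followed by descent recovers the better
minimum.

```python
# Toy stand-in for a network's error surface (an assumption, not a real net):
# a quartic with a local minimum near w = 1.13 and a global one near w = -1.30.
def loss(w):
    return w**4 - 3 * w**2 + w


def grad(w):
    # Analytic derivative of loss(w).
    return 4 * w**3 - 6 * w + 1


def gradient_descent(w0, lr=0.01, steps=2000):
    """Plain gradient descent from the initial weight w0."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w


def grid_search_descent(starts):
    """Run gradient descent from every grid point and keep the best result."""
    finals = [gradient_descent(w0) for w0 in starts]
    return min(finals, key=loss)


# Two runs that differ only in their initial weight.
w_from_right = gradient_descent(2.0)   # settles in the local minimum
w_from_left = gradient_descent(-2.0)   # settles in the global minimum

# Coarse grid over [-2, 2] as the "global search" wrapper.
starts = [-2.0 + 0.5 * i for i in range(9)]
w_best = grid_search_descent(starts)

print("start  2.0 ->", w_from_right, "loss", loss(w_from_right))
print("start -2.0 ->", w_from_left, "loss", loss(w_from_left))
print("grid search ->", w_best, "loss", loss(w_best))
```

On this toy surface the two single runs reach different final losses, and
the grid-wrapped version agrees with the better of the two. In a real
network the grid would have to cover a high-dimensional weight space, which
is why random restarts or genetic algorithms are the more usual wrappers.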