Path: utzoo!utgpu!news-server.csri.toronto.edu!neuron.ai.toronto.edu!radford
Newsgroups: comp.ai.neural-nets
From: radford@ai.toronto.edu (Radford Neal)
Subject: Re: Are Conjugate Gradient algorithms any good?
Message-ID: <91Mar7.145659edt.437@neuron.ai.toronto.edu>
Keywords: NETtalk, Conjugate Gradient algorithms, Back-propagation
Organization: Department of Computer Science, University of Toronto
References: <1991Mar4.142559.21857@daimi.aau.dk> <^9B&5R#@warwick.ac.uk>
Date: 7 Mar 91 19:57:23 GMT
Lines: 25

>I was most interested in the report by Denis Anthony, i.e.,
>
> >... using epoch updates was inferior to pattern updates.

I think this discussion may cause some confusion. From the original
posting, it is clear that what is claimed is that using epoch updates
is inferior to using pattern updates ** if the criterion for cost is
the number of applications of the network to training cases **. If the
criterion for cost is number of weight changes, then there is no reason
to think epoch updates are inferior, and every reason to think they are
superior.

This difference is not surprising. If the training set is large and
contains many very similar, maybe even identical, patterns, then
applying the network to all patterns before changing weights is
wasteful. More typically, however, the training data is not all that
voluminous, and omitting some patterns might seriously impair gradient
descent unless the learning rate is set quite low.

In any comparison of learning methods, one must be clear on what the
cost criterion is, and one must be sure to find the optimal settings
for the learning rate and other adjustable parameters for each of the
techniques being compared.

    Radford
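
The two cost criteria above can be made concrete with a small sketch. It is
not from the original posting: the toy linear unit, the redundant training
set, and the names make_data, mean_sq_error and train are all hypothetical,
chosen only to show how "applications of the network to training cases" and
"weight changes" can be counted separately for epoch and pattern updates.
The learning rate is fixed at one arbitrary value for both modes and is not
tuned, so the counts illustrate the bookkeeping, not a fair comparison.

```python
def make_data(copies=50):
    # Hypothetical toy set: two distinct patterns, each repeated `copies`
    # times, so the training set is highly redundant.
    base = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0)]
    return base * copies


def mean_sq_error(w, data):
    # Mean squared error of a linear unit y = w[0]*x[0] + w[1]*x[1].
    return sum((w[0] * x[0] + w[1] * x[1] - y) ** 2 for x, y in data) / len(data)


def train(data, mode, lr=0.5, tol=1e-4, max_epochs=1000):
    """Gradient descent with 'epoch' or 'pattern' updates.

    Returns (weights, applications, weight_changes). Only passes used for
    learning are counted; the convergence check at the start of each epoch
    is bookkeeping for this sketch and is deliberately left out of the cost.
    """
    if mode not in ("epoch", "pattern"):
        raise ValueError("mode must be 'epoch' or 'pattern'")
    w = [0.0, 0.0]
    applications = 0      # times the network is applied to a training case
    weight_changes = 0    # times the weight vector is modified
    for _ in range(max_epochs):
        if mean_sq_error(w, data) < tol:
            break
        if mode == "epoch":
            # Accumulate the gradient over all cases, then change weights once.
            g = [0.0, 0.0]
            for x, y in data:
                err = w[0] * x[0] + w[1] * x[1] - y
                applications += 1
                g[0] += err * x[0]
                g[1] += err * x[1]
            w = [w[0] - lr * g[0] / len(data), w[1] - lr * g[1] / len(data)]
            weight_changes += 1
        else:
            # Change weights after every single case.
            for x, y in data:
                err = w[0] * x[0] + w[1] * x[1] - y
                applications += 1
                w = [w[0] - lr * err * x[0], w[1] - lr * err * x[1]]
                weight_changes += 1
    return w, applications, weight_changes


if __name__ == "__main__":
    data = make_data()
    for mode in ("epoch", "pattern"):
        w, apps, changes = train(data, mode)
        print(mode, "applications:", apps, "weight changes:", changes)
```

On this redundant toy set, at this one untuned learning rate, pattern
updates need fewer applications while epoch updates need fewer weight
changes, so which method looks cheaper depends on which count is taken as
the cost. Retuning the learning rate separately for each mode changes the
counts, which is the caution raised in the last paragraph of the posting.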