Path: utzoo!utgpu!news-server.csri.toronto.edu!neuron.ai.toronto.edu!radford
Newsgroups: comp.ai.neural-nets
From: radford@ai.toronto.edu (Radford Neal)
Subject: Re: Are Conjugate Gradient algorithms any good?
Message-ID: <91Mar7.145659edt.437@neuron.ai.toronto.edu>
Keywords: NETtalk, Conjugate Gradient algorithms, Back-propagation
Organization: Department of Computer Science, University of Toronto
References: <1991Mar4.142559.21857@daimi.aau.dk> <^9B&5R#@warwick.ac.uk> <pluto.668285404@cornelius>
Date: 7 Mar 91 19:57:23 GMT
Lines: 25

>I was most interested in the report by Denis Anthony, i.e., 
>
>	>... using epoch updates was inferior to pattern updates.

I think this discussion may cause some confusion. From the original
posting, it is clear that what is claimed is that using epoch updates 
is inferior to using pattern updates ** if the criterion for cost is the
number of applications of the network to training cases **. If the
criterion for cost is the number of weight changes, then there is no reason
to think epoch updates are inferior, and every reason to think they 
are superior.

This difference is not surprising. If the training set is large and
contains many very similar, maybe even identical, patterns, then applying 
the network to all patterns before changing weights is wasteful. More 
typically, however, the training data is not all that voluminous, and 
basing each weight change on only some of the patterns might seriously 
impair gradient descent unless the learning rate is set quite low.
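
To make the two cost criteria concrete, here is a minimal sketch of the
bookkeeping, not anyone's actual experiment. The one-weight model y = w*x,
the three training patterns, the learning rate of 0.05, and the function
name train are all assumptions made up for illustration. It counts
applications of the model to training cases separately from weight changes
for both update schemes.

```python
def train(patterns, lr, epochs, mode):
    """Fit y = w * x by gradient descent on squared error.

    mode is "epoch" (one weight change per pass over all patterns)
    or "pattern" (one weight change per training case).
    Returns (w, applications, updates).
    """
    w = 0.0
    applications = 0  # times the model is applied to a training case
    updates = 0       # times the weight is changed
    for _ in range(epochs):
        grad_sum = 0.0
        for x, y in patterns:
            grad = (w * x - y) * x  # d/dw of 0.5 * (w*x - y)**2
            applications += 1
            if mode == "pattern":
                w -= lr * grad
                updates += 1
            else:
                grad_sum += grad
        if mode == "epoch":
            # Average gradient over the whole training set, then one change.
            w -= lr * grad_sum / len(patterns)
            updates += 1
    return w, applications, updates


# Hypothetical toy data generated by y = 2 * x.
patterns = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

for mode in ("epoch", "pattern"):
    w, applications, updates = train(patterns, lr=0.05, epochs=5, mode=mode)
    print(mode, "w=%.4f" % w,
          "applications=%d" % applications, "updates=%d" % updates)
```

Both schemes apply the model to the same number of training cases per
pass, but the pattern scheme changes the weight once per case while the
epoch scheme changes it once per pass. Which scheme looks cheaper therefore
depends on which count is taken as the cost. The learning rate here is
fixed and untuned, so the sketch says nothing about which scheme is better
in general.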

In any comparison of learning methods, one must be clear on what the cost 
criterion is, and one must be sure to find the optimal settings for 
the learning rate and other adjustable parameters for each of the
techniques being compared.

    Radford