Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!columbia!cs!camargo
From: camargo@cs.columbia.edu (Francisco Camargo)
Newsgroups: comp.ai.neural-nets
Subject: Back Propagation Algorithm question...
Message-ID: <224@cs.columbia.edu>
Date: 29 May 89 23:26:49 GMT
Organization: Columbia University Department of Computer Science
Lines: 27


Can anyone shed some light on the following issue:

How should one compute the weight adjustments in BackProp?
From reading PDP, one gets the impression that the DELTAS
should be accumulated over all INPUT PATTERNS and only then
a STEP is taken down the gradient. Robbins and Monro suggest
a stochastic algorithm with proven convergence if one takes one
step at each pattern presentation but damps its effect by a factor
1/k, where "k" is the presentation number. Other people (judging from
code that I've seen floating around) seem to take a STEP at each
presentation and don't apply any damping factor. I've tried both
approaches myself and they seem to work. So which is the correct way
of adjusting the weights? Accumulate the errors over all patterns? Or work
towards the minimum as new patterns are presented? What are the implications?
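
To make the alternatives above concrete, here is a minimal sketch in
Python. It is only an illustration under stated assumptions: a single
linear unit trained with the delta rule stands in for a full multi-layer
network, the toy data are made up, and all names (train_batch,
train_online, eta, PATTERNS) are hypothetical, not taken from PDP or from
any existing code.

```python
# Sketch of three weight-update schedules for one linear unit
# (delta rule). Assumption: this stands in for full backprop;
# the data and all names are illustrative only.

def predict(w, x):
    # Output of a linear unit: weighted sum of the inputs.
    return sum(wi * xi for wi, xi in zip(w, x))

def gradient(w, x, t):
    # Gradient of 0.5 * (y - t)^2 with respect to w for one pattern.
    err = predict(w, x) - t
    return [err * xi for xi in x]

def sse(w, patterns):
    # Sum of squared errors over all patterns.
    return sum((predict(w, x) - t) ** 2 for x, t in patterns)

def train_batch(patterns, eta=0.05, epochs=500):
    # Accumulate the gradient over ALL patterns, then take one step.
    w = [0.0] * len(patterns[0][0])
    for _ in range(epochs):
        total = [0.0] * len(w)
        for x, t in patterns:
            g = gradient(w, x, t)
            total = [a + b for a, b in zip(total, g)]
        w = [wi - eta * gi for wi, gi in zip(w, total)]
    return w

def train_online(patterns, eta=0.05, epochs=500, damped=False):
    # Take a step after EACH pattern presentation.
    # With damped=True the k-th step is scaled by 1/k
    # (a Robbins-Monro style schedule); otherwise the step is constant.
    w = [0.0] * len(patterns[0][0])
    k = 0
    for _ in range(epochs):
        for x, t in patterns:
            k += 1
            step = eta / k if damped else eta
            g = gradient(w, x, t)
            w = [wi - step * gi for wi, gi in zip(w, g)]
    return w

# Toy data: target t = 2*x - 1; the first input is a constant bias of 1.0.
PATTERNS = [
    ([1.0, 0.0], -1.0),
    ([1.0, 1.0], 1.0),
    ([1.0, 2.0], 3.0),
    ([1.0, 3.0], 5.0),
]

if __name__ == "__main__":
    for name, w in [
        ("batch", train_batch(PATTERNS)),
        ("online", train_online(PATTERNS)),
        ("online, 1/k damping", train_online(PATTERNS, damped=True)),
    ]:
        print(name, w, sse(w, PATTERNS))
```

On this toy problem the batch and the undamped per-pattern schedules
should both end up close to the exact weights, while the 1/k schedule
reduces the error much more slowly for the same number of presentations.
That is only an observation about this sketch, not a general answer to
the question.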

Any light on this issue would be greatly appreciated.

Francisco A. Camargo
CS Department - Columbia University
camargo@cs.columbia.edu


PS: A few weeks ago, I requested some pointers to Learning Algorithms in NN
and promised a summary of the replies. It is coming. I have not forgotten my
responsibilities to this group. Even though I got more requests than really
new info, I'll have a summary posted shortly. And thanks to all who
contributed.