Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!shadooby!accuvax.nwu.edu!tank!eecae!cps3xx!cpsvax!artzi
From: artzi@cpsvax.cps.msu.edu (Ytshak Artzi - CPS)
Newsgroups: comp.ai.neural-nets
Subject: Re: Back Propagation question... (follow up)
Message-ID: <3216@cps3xx.UUCP>
Date: 30 May 89 23:51:56 GMT
References: <226@cs.columbia.edu>
Sender: usenet@cps3xx.UUCP
Reply-To: artzi@cpsvax.UUCP (Ytshak Artzi - CPS)
Organization: Michigan State University, Computer Science Department
Lines: 42

In article <226@cs.columbia.edu> camargo@cs.columbia.edu (Francisco Camargo) writes:
>My problem is that I can find any (theoretical) justification for the "online"
>method other that "Robins Monroe algorithm" (I may have misspelled his name,
>for which I apologize, but I don't have my references near by). But then, the
>"dumping" factor is required for guaranteed convergence. I tried the "online"
>method and it does seem to perform better. But, WHY does it work ? How come it
>converges so well (despite of making {a_k}=1) ?
>

   As a general comment, you must be careful in choosing the particular
instance of the problem you try to solve. If the initial state is close
to the correct solution, then both methods will work. For any problem there
exists an instance for which convergence is not guaranteed for either
method. Unfortunately, there is no good method available to detect such an
instance, given an arbitrary problem.

   Now consider the following equation:

      DELTA_p w_ji = n (t_pj - O_pj) i_pi = n d_pj i_pi

   This rule changes the weights following presentation of I/O pair p.

   n             is the learning rate (a positive constant)

   t_pj          is the target value for the j-th component of the
                 output pattern for p

   O_pj          is the j-th element of the actual output pattern
                 produced by input p

   i_pi          is the i-th element of input pattern p

   d_pj          = t_pj - O_pj

   DELTA_p w_ji  is the change to be made to the weight from the i-th
                 to the j-th unit after presentation of input p

   Hope it helps...

         Itzik.
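
A minimal sketch of the rule above in Python, applied "online" (weights updated after each presented pair). It assumes a single layer of linear output units, so that O_pj is the weighted sum of the inputs. The function name delta_rule_update, the learning rate eta (the n in the equation), and the example numbers are illustrative and do not come from the post.

```python
def delta_rule_update(weights, inputs, targets, eta=0.1):
    """Apply one online delta-rule step for a single I/O pair p.

    weights[j][i] holds w_ji, the weight from input unit i to output unit j.
    Assumption: linear output units, O_pj = sum_i w_ji * i_pi.
    """
    new_weights = []
    for w_j, t_pj in zip(weights, targets):
        # Actual output of unit j for this input pattern.
        o_pj = sum(w * x for w, x in zip(w_j, inputs))
        # Error term d_pj = t_pj - O_pj.
        d_pj = t_pj - o_pj
        # DELTA_p w_ji = eta * d_pj * i_pi, added to each weight of unit j.
        new_weights.append([w + eta * d_pj * x for w, x in zip(w_j, inputs)])
    return new_weights


# Illustrative run: one output unit, two inputs, the same pair presented 50 times.
w = [[0.0, 0.0]]
for _ in range(50):
    w = delta_rule_update(w, inputs=[1.0, 2.0], targets=[1.0], eta=0.1)
output = sum(wi * xi for wi, xi in zip(w[0], [1.0, 2.0]))
```

In this toy case eta times the squared length of the input vector is 0.5, so every step halves the remaining error and the output approaches the target 1.0. That shows a fixed step size working for one consistent pair only; it says nothing about convergence over many conflicting patterns.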