Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!shadooby!accuvax.nwu.edu!tank!eecae!cps3xx!cpsvax!artzi
From: artzi@cpsvax.cps.msu.edu (Ytshak Artzi - CPS)
Newsgroups: comp.ai.neural-nets
Subject: Re: Back Propagation question... (follow up)
Message-ID: <3216@cps3xx.UUCP>
Date: 30 May 89 23:51:56 GMT
References: <226@cs.columbia.edu>
Sender: usenet@cps3xx.UUCP
Reply-To: artzi@cpsvax.UUCP (Ytshak Artzi - CPS)
Organization: Michigan State University, Computer Science Department
Lines: 42

In article <226@cs.columbia.edu> camargo@cs.columbia.edu (Francisco Camargo) writes:
>My problem is that I can find any (theoretical) justification for the "online"
>method other that "Robins Monroe algorithm" (I may have misspelled his name, 
>for which I apologize, but I don't have my references near by). But then, the
>"dumping" factor is required for guaranteed convergence. I tried the "online"
>method and it does seem to perform better. But, WHY does it work ? How come it
>converges so well (despite of making {a_k}=1) ?
>
   As a general comment, you must be careful in choosing the particular
instance of the problem you try to solve. If the initial state is close
to the correct solution, then both methods will work. For any problem
there exists an instance for which convergence is not guaranteed for
either method. Unfortunately, there is no good method available to
detect such an instance, given an arbitrary problem.

  Now consider the following equation:

       DELTA_p w_ji = n (t_pj - O_pj) i_pi = n d_pj i_pi

  This rule changes weights following presentation of I/O pair p.

  n             is the learning rate (a constant step size)

  t_pj          is the target value for the j-th component of output
                pattern p

  O_pj          is the j-th element of the actual output pattern
                produced by input pattern p

  i_pi          is the i-th element of input pattern p

  d_pj          = t_pj - O_pj

  DELTA_p w_ji  is the change to be made to the weight from the i-th
                to the j-th unit after presentation of input pattern p
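
  To make the rule concrete, here is a minimal Python sketch of the
  "online" variant, where the weights change right after each I/O pair
  is presented. It is only an illustration under my own assumptions:
  the output units are linear, and the function names, the toy
  patterns, the rate of 0.1 and the 200 epochs are all made up for the
  example.

```python
def delta_rule_update(weights, inputs, targets, rate):
    """Apply one online delta-rule step for a single I/O pair p.

    weights[j][i] holds w_ji, the weight from input unit i to output unit j.
    Returns a new weight matrix; the argument is not modified.
    """
    # Assumption of this sketch: linear output units, O_pj = sum_i w_ji * i_pi.
    outputs = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    # d_pj = t_pj - O_pj
    errors = [t - o for t, o in zip(targets, outputs)]
    # DELTA_p w_ji = n * d_pj * i_pi, added to w_ji immediately (online).
    return [[w + rate * d * x for w, x in zip(row, inputs)]
            for row, d in zip(weights, errors)]


def train_online(weights, patterns, rate, epochs):
    """Sweep over all I/O pairs `epochs` times, updating after each pair."""
    for _ in range(epochs):
        for inputs, targets in patterns:
            weights = delta_rule_update(weights, inputs, targets, rate)
    return weights


# Hypothetical toy problem: one output unit that should learn O = 2*i_1 - i_2.
patterns = [([1.0, 0.0], [2.0]),
            ([0.0, 1.0], [-1.0]),
            ([1.0, 1.0], [1.0])]
w = train_online([[0.0, 0.0]], patterns, rate=0.1, epochs=200)
```

  The batch variant would instead sum DELTA_p w_ji over all p and apply
  the total once per sweep. This toy problem has an exact solution and
  a small rate, so the online sweep settles on it; that says nothing
  about harder instances.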

  Hope it helps...

   Itzik.