Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!cs.utexas.edu!uunet!zephyr.ens.tek.com!gvgpsa!calsci!jon
From: jon@calsci (Parallax & Red Shift)
Newsgroups: comp.ai.neural-nets
Subject: Re: Dynamic range of nodes
Message-ID: <0022@calsci>
Date: 28 Dec 90 04:58:38 GMT
Reply-To: calsci!jon@gvgpsa.gvg.tek.com (Parallax & Red Shift)
Followup-To: comp.ai.neural-nets
Organization: California Scientific Software
Lines: 91
News-Software: FSUUCP 1.1 Release 5

In article <1990Dec22.042610.23800@aplcen.apl.jhu.edu>, simonof@aplcen (Simonoff Robert 301 540 1864) writes:
>In article <8513@uwm.edu> markh@csd4.csd.uwm.edu (Mark William Hopkins) writes:
>>In article <1990Dec21.010536.17034@aplcen.apl.jhu.edu> simonof@aplcen.apl.edu (Simonoff Robert 301 540 1864) writes:
>>>I have chosen as my new activation function the hyperbolic
>>>tangent function which is defined from [-1.0, 1.0]. The
>>>derivative of this function is:
>>>
>>>tanh'(X) == sech(X)**2 == 1/cosh(X)**2
>>
>>... = 1 - (tanh(x))^2. ...
>>
>>(A code fragment was presented with the question "what's wrong with it?")
>> [some stuff deleted to save bandwidth]
>>and
>>
>>> delta1[pattern][i] = sum * 1.0/(cosh(out1[pattern][i])*
>>>                      cosh(out1[pattern][i]));
>>
>>should be
>> delta1[pattern][i] = sum * (1 - out1[pattern][i]*out1[pattern][i]);
>
>Why is delta1[pattern][i] = sum*(1-out1[pattern][i]*out1[pattern][i]) ?
>My activation function is tanh(netinput) and I believe the
>derivative of hyperbolic tangent is:
>
> tanh'(x) = 1/sech(x)**2 = 1/cosh(x)**2 = 2/(e**x + e**(-x))
>
>Maybe I am not seeing the algebra that makes:
>
> 2/(e**x + e**(-x)) = 1/((1 + e**(-x))*(1 + e**(-x))) + -1
>
>Bob Simonoff
>simonof@aplcen.apl.edu
>

Bob, look again at the equation for tanh'(x) you wrote, above. First off,
tanh'(x) doesn't equal 1/sech(x)**2, but rather tanh'(x) = sech(x)**2.
(This was *probably* just a typo, as you give the correct equation at the
top of your original posting.)

Continuing on to the 2nd '=' in your tanh'(x) eq., tanh'(x) is, in fact,
equal to 1/cosh(x)**2, as you have noted, but you blow it on the 3rd equals
sign in the above equation. 1/cosh(x)**2 is NOT equal to
2/(e**x + e**(-x)), but rather is equal to the SQUARE of that quantity:

    1/cosh(x)**2 = 4 / (e**x + e**(-x))**2

Similarly, I don't know where you got the r.h.s. of the next equation.
Mark Hopkins suggested that

    delta1[pattern][i] = sum*(1-out1[pattern][i]*out1[pattern][i])

This is, in fact, correct. But assuming a transfer function of tanh(x),
this doesn't equal what you wrote, i.e.

    1 - tanh(x)**2  does NOT equal  1/((1 + e**(-x))*(1 + e**(-x))) + -1

In fact,

                         e**(2x) + e**(-2x) - 2
    1 - tanh(x)**2 = 1 - ---------------------- = 1/cosh(x)**2 = sech(x)**2
                         e**(2x) + e**(-2x) + 2

Which is the correct value for tanh'(x), as noted above.

On the other hand, this appears to be equivalent to your actual code
fragment, provided the value you hand to cosh() is the net input x. (If
out1[pattern][i] already holds tanh(x), then 1/cosh(out1)**2 is not the
same thing as 1 - out1**2, and Mark's form is the one to use.) I assume
Mark suggested the alternate form for reasons of computational efficiency
(so you don't waste time computing the additional hyperbolic cosines, but
rather use the outputs which you already have lying around). If cosh() is
getting the net input, the code you wrote SHOULD work (albeit slower than
necessary), and I would suggest that you have a different problem. (Unless
the additional cosh(out1[pattern][i]) calculation loses more precision than
your algorithm can tolerate, in which case switching to Mark's formulation
will fix that problem, as well as improving the speed!)

BTW, I wrote the engine for the commercially available back-prop based
neural-net software package called BrainMaker(tm). (Perhaps you've heard
of it?) So I've done more than my share of this kind of coding. (I.e., I'm
not just talking through my hat... :-)

Good luck,
Jon
---
Jon "J.D." "Parallax" Hartzberg, GSXR pilot        '86 GSXR1100 "Red Shift"
Cal. Sci. Software, Grass Valley CA   DoD #0220    '80 CB750F   "Ol' Flexible"
jon@calsci.gvgpsa.gvg.tek.com OR ...!calsci!jon    '81 XL185S   "SquirtintheDirt"

"When you stop falling down, you stop learning."  -Kenny Roberts
"When I found out what fairings cost, I decided I'd learned enough!"  -me
Disclaimer: If my boss knew I was doing this he'd kill me.