Path: utzoo!utgpu!news-server.csri.toronto.edu!clyde.concordia.ca!uunet!mcsun!ukc!reading!cf-cm!cybaswan!eeandrew
From: eeandrew@cybaswan.UUCP (e c andrews)
Newsgroups: comp.ai.neural-nets
Subject: Re: Backprop Training
Keywords: neural networks, backpropagation, training
Message-ID: <1944@cybaswan.UUCP>
Date: 14 Aug 90 13:32:19 GMT
References: <1331@winnie.fit.edu>
Reply-To: eeandrew@pyr.swan.ac.uk (e c andrews)
Lines: 19

In article <1331@winnie.fit.edu> dfausett@zach.fit.edu ( Donald W. Fausett) writes:
>    The reason that bipolar (-1,+1) data is better for training than
>binary (0,1) data is that no learning occurs on a connection when its input
>signal is zero.  It is easy to see the reason for this.  During the
>backpropagation phase, the delta error term for a unit is multiplied by
>the input signal to that unit in order to compute the update for the
>weight on that connection......

Also, the magnitude of the values can influence your training: if your
non-linearities range over 0->1 but your inputs range over 0->100 (or
-50->+50), then the weight space is warped so that adaptation will occur
mainly in the first layer of weights.

I must admit that in my work (speech processing) we haven't seen much
difference in performance between mono- and bi-polar non-linearities, but
my inputs are reals in -1->+1, so I use that range for my thresholding
function. I haven't seen anything published either way.

Eddy Andrews.
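
A rough sketch of the two effects discussed above, assuming the plain delta-rule update (learning rate times the unit's delta term times the input signal on that connection). The function name, the delta value, the learning rate, and the input vectors are all made up for illustration; they are not taken from either poster's network.

```python
def weight_updates(inputs, delta, learning_rate=0.1):
    """Return the delta-rule update for each incoming weight of one unit.

    Each update is learning_rate * delta * input, so a zero input
    yields a zero update regardless of the size of the error term.
    """
    return [learning_rate * delta * x for x in inputs]


# Hypothetical error term for a single unit (illustrative value only).
delta = 0.5

# The same pattern coded three ways.
binary_updates = weight_updates([0, 1, 0], delta)      # binary (0,1) coding
bipolar_updates = weight_updates([-1, 1, -1], delta)   # bipolar (-1,+1) coding
scaled_updates = weight_updates([0, 100, 50], delta)   # unscaled 0->100 inputs

print("binary :", binary_updates)
print("bipolar:", bipolar_updates)
print("scaled :", scaled_updates)
```

With binary coding, the connections carrying a 0 get no update at all, while with bipolar coding every connection moves. With 0->100 inputs, the updates on the input-layer weights are up to 100 times larger than for unit-range inputs, which is one way to read the remark that adaptation ends up concentrated in the first layer of weights.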