Path: utzoo!utgpu!news-server.csri.toronto.edu!clyde.concordia.ca!uunet!mcsun!ukc!reading!cf-cm!cybaswan!eeandrew
From: eeandrew@cybaswan.UUCP (e c andrews)
Newsgroups: comp.ai.neural-nets
Subject: Re: Backprop Training
Keywords: neural networks, backpropagation, training
Message-ID: <1944@cybaswan.UUCP>
Date: 14 Aug 90 13:32:19 GMT
References: <1331@winnie.fit.edu>
Reply-To: eeandrew@pyr.swan.ac.uk (e c andrews)
Lines: 19

In article <1331@winnie.fit.edu> dfausett@zach.fit.edu ( Donald W. Fausett) writes:
>	The reason that bipolar (-1,+1) data is better for training than
>binary (0,1) data is that no learning occurs on a connection when its input
>signal is zero.  It is easy to see the reason for this.  During the
>backpropagation phase, the delta error term for a unit is multiplied by
>the input signal to that unit in order to compute the update for the
>weight on that connection......
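
To make the quoted point concrete, here is a minimal sketch of the
update for the weights feeding a single unit. The function name, the
learning rate and the delta value are my own illustrative assumptions,
not anything taken from the quoted article.

```python
def weight_updates(inputs, delta, learning_rate=0.1):
    """Delta-rule update for each weight feeding one unit.

    Each update is learning_rate * delta * x_i, so an input of exactly
    zero yields an update of exactly zero for that connection.
    """
    return [learning_rate * delta * x for x in inputs]


delta = 0.5  # hypothetical error term for the receiving unit

# The same pattern coded as binary (0,1) and as bipolar (-1,+1).
binary_updates = weight_updates([0, 1, 0, 1], delta)
bipolar_updates = weight_updates([-1, 1, -1, 1], delta)

print("binary :", binary_updates)
print("bipolar:", bipolar_updates)
```

With the binary coding, the two connections carrying a 0 get no update
at all for this pattern; with the bipolar coding every connection moves.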

Also, the magnitude of the input values can have an influence on your
training: if your non-linearities go 0->1 but your inputs are 0->100 (or
-50->+50) then the weight space is warped so that adaptation will occur
mainly in the first layer of weights.
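
A rough way to see this is to compare gradient sizes in a toy network
with one input, one sigmoid hidden unit and one sigmoid output unit.
This is only a sketch under my own assumptions: the weights, the target
and the squared-error loss are made up for illustration, and the
first-layer weight is kept small so the hidden unit does not saturate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient_magnitudes(x, w1=0.001, w2=0.5, target=1.0):
    """Return (|dE/dw1|, |dE/dw2|) for E = 0.5 * (y - target)**2.

    w1 is the input-to-hidden weight, w2 the hidden-to-output weight.
    All default values are hypothetical.
    """
    h = sigmoid(w1 * x)  # hidden activation, always in 0..1
    y = sigmoid(w2 * h)  # network output
    delta_out = (y - target) * y * (1.0 - y)
    grad_w2 = delta_out * h  # scaled by h, which is at most 1
    delta_hidden = delta_out * w2 * h * (1.0 - h)
    grad_w1 = delta_hidden * x  # scaled by the raw input value
    return abs(grad_w1), abs(grad_w2)


g1_small, g2_small = gradient_magnitudes(1.0)
g1_large, g2_large = gradient_magnitudes(100.0)

print("input   1: first/second layer gradient ratio =", g1_small / g2_small)
print("input 100: first/second layer gradient ratio =", g1_large / g2_large)
```

The second-layer gradient is multiplied by a hidden activation that
cannot exceed 1, while the first-layer gradient is multiplied by the raw
input, so with an input of 100 the first-layer update dominates. With
larger first-layer weights the hidden unit would saturate instead, which
is a different problem.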

I must admit that in my work (speech processing) we haven't seen much
difference in performance between mono- and bi-polar non-linearities,
but my inputs are reals -1->+1 so I use the same range for my
thresholding function. I haven't seen anything published either way.

Eddy Andrews.