Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!van-bc!ubc-cs!fornax!mcguire
From: mcguire@fornax.UUCP (Michael McGuire)
Newsgroups: comp.ai.neural-nets
Subject: BP input scaling, normalization
Keywords: BP, scaling
Message-ID: <2533@fornax.UUCP>
Date: 19 Apr 91 18:15:47 GMT
Distribution: na
Organization: School of Computing Science, SFU, Burnaby, B.C. Canada
Lines: 32

I have been using back-propagation to combine two sets of 11 parameters 
(22 inputs) into 11 output classes (there are 275 training patterns and 275 
test patterns). Therefore the net has 22 inputs, 11 outputs and possibly some
hidden layers. Each set of inputs was scaled by its own constant so that
all input values fell in the range 0 to 1 (this was a requirement of the
BP software).
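
To make the scaling step concrete, here is a minimal sketch in Python. It
is only an illustration: the function names are hypothetical, it assumes the
raw values are non-negative, and it takes each set's constant to be the
largest value seen in the training patterns, which is just one possible
choice of constant.

```python
# Sketch only: scale each 11-parameter set by its own constant so that the
# inputs fall in the range 0 to 1. Assumes non-negative raw values. Using the
# per-set maximum as the constant is an assumption made for illustration.

def set_constants(patterns, set_size=11):
    """Return one scaling constant per input set (largest value in that set)."""
    n_sets = len(patterns[0]) // set_size
    constants = []
    for s in range(n_sets):
        lo, hi = s * set_size, (s + 1) * set_size
        constants.append(max(max(p[lo:hi]) for p in patterns))
    return constants


def scale_pattern(pattern, constants, set_size=11):
    """Divide every input by the constant of the set it belongs to."""
    return [x / constants[i // set_size] for i, x in enumerate(pattern)]
```

With this approach the constants would be computed once from the training
patterns and then reused unchanged on the test patterns, so both are scaled
the same way. Test values larger than the training maximum would then fall
slightly outside 0 to 1.
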
My questions arise from the results I obtained:
	1. Different scaling constants resulted in very different
	   classification performances.
	2. A network with no hidden layers outperformed nets with 1 hidden
	   layer (both nets had near-perfect classification on the training
	   patterns).

Questions:
	1. What are the effects of scaling the inputs to a BP net, and is
	   there an optimal way to do this (especially since I have 2 sets of
	   inputs that need to be scaled differently)?
	2. Why would a single-layer net outperform a two-layer net (the
	   two-layer net had only 5 hidden units)? I would expect the
	   two-layer net to do at least as well.
	3. Do output activations of 0.1 and 0.9 (as opposed to 0.0 and 1.0)
	   help the generalization process?
	4. Is there a different neural net better suited to this type of
	   classification (Radial Basis Functions, for example)?
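
To be clear about what I mean in question 3, the two target encodings can
be sketched as below. This is an illustration only: the function name is
hypothetical, and it assumes one output unit per class (11 outputs for 11
classes).

```python
# Sketch only: build a target vector for one training pattern, assuming one
# output unit per class. low/high select between the 0.1/0.9 and 0.0/1.0
# target conventions.

def make_targets(class_index, n_classes=11, low=0.1, high=0.9):
    """Target vector with `high` at the correct class and `low` elsewhere."""
    return [high if k == class_index else low for k in range(n_classes)]
```

Calling make_targets(3) gives the 0.1/0.9 targets for class 3, while
make_targets(3, low=0.0, high=1.0) gives the 0.0/1.0 targets.
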

Thanks in advance to all those who respond.

Mike McGuire
Engineering Science
Simon Fraser University
Canada
e-mail: mcguire@cs.sfu.ca