Path: utzoo!censor!geac!torsqnt!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!mit-eddie!bloom-beacon!eru!hagbard!sunic!dkuug!daimi!fodslett
From: fodslett@daimi.aau.dk (Martin Moller)
Newsgroups: comp.ai.neural-nets
Subject: Re: Scaled Conjugate Gradient (SCG). Preprint soon available.
Message-ID: <1990Nov26.210053.263@daimi.aau.dk>
Date: 26 Nov 90 21:00:53 GMT
References: <1990Nov17.130731.12392@daimi.aau.dk> <3430022@hpwrce.HP.COM>
Sender: fodslett@daimi.aau.dk (Martin Moller)
Organization: DAIMI: Computer Science Department, Aarhus University, Denmark
Lines: 27

>Sounds like the scaled approach may hold promise. How much more cpu does it need
>as the number of training patterns increases? As the number of inputs 
>increases? Does the cpu time increase exponentially? Polynomially? Linearly?

>							Kingsley


Well, I do not have any results on the first question yet, but as for
cpu time versus the number of inputs, there is a good example in the
forthcoming preprint.

The parity problem is used as the test example. SCG and BP were
tested on the 3, 4, 5, 6 and 7 bit parity problems using 10 different initial
weight vectors. Three and four layer network architectures were used for each
problem. The average of the total number of calls of the error function and
the gradient was used to compare the performance of the two algorithms.
SCG obtained a speed-up of between 18 and 46 relative to BP.
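
To make the measure concrete, here is a minimal Python sketch of the setup
described above. It is not the code used for the preprint. The names
parity_patterns, CallCounter and mean_total_calls are made up for
illustration, and run_once stands for a hypothetical training routine that
returns the number of error calls and gradient calls for one initial weight
vector.

```python
from itertools import product


def parity_patterns(n_bits):
    """Return all 2**n_bits binary inputs paired with their parity target.

    The target is 1 when the input holds an odd number of ones, else 0.
    """
    return [(bits, sum(bits) % 2) for bits in product((0, 1), repeat=n_bits)]


class CallCounter:
    """Wrap a function (error or gradient) and count how often it is called."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.fn(*args, **kwargs)


def mean_total_calls(run_once, n_runs=10):
    """Average (error calls + gradient calls) over n_runs initial weight vectors.

    run_once(seed) is a hypothetical training routine, assumed to return the
    pair (error_calls, gradient_calls) for the run started from that seed.
    """
    totals = [sum(run_once(seed)) for seed in range(n_runs)]
    return sum(totals) / len(totals)


def speed_up(bp_mean_calls, scg_mean_calls):
    """Speed-up of SCG relative to BP as a ratio of mean total calls."""
    return bp_mean_calls / scg_mean_calls
```

Wrapping the error and gradient routines in CallCounter before training gives
the two counts that run_once is assumed to return.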

Let N be the number of weights and biases in the network.
In all runs the SCG algorithm was bounded by O(N**2) and O(NlogN) function
calls, while BP was bounded by O(N**3) and O(N**2logN) function calls.
These are experimental results, and the bounds do of course
depend on the nature of the task. BP has been reported to show exponential
scaling; no such behaviour has been observed with SCG.
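
For anyone who wants to check this kind of scaling on their own runs, the
following Python sketch shows one possible way to do it. It is not the
procedure used for the preprint. It counts N for a fully connected layered
network and fits the slope of log(calls) against log(N); a slope near 2
suggests roughly N**2 growth, a slope a little above 1 is what N log N growth
tends to look like. The layer sizes and the call counts in the usage lines
are synthetic and only illustrate the arithmetic.

```python
import math


def count_parameters(layer_sizes):
    """Number of weights and biases in a fully connected layered network.

    layer_sizes lists the units per layer, input layer first. Each unit in a
    non-input layer has one weight per unit in the previous layer plus a bias.
    """
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def loglog_slope(ns, calls):
    """Least-squares slope of log(calls) versus log(N).

    The slope estimates p under the assumption calls ~ c * N**p.
    """
    xs = [math.log(n) for n in ns]
    ys = [math.log(c) for c in calls]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Synthetic illustration only: call counts that grow exactly as 3 * N**2.
ns = [16, 32, 64, 128]
synthetic_calls = [3 * n * n for n in ns]
slope = loglog_slope(ns, synthetic_calls)  # close to 2 for this made-up data
```

With only five problem sizes (3 to 7 bits) such a fit is a rough indication
at best, not a proof of a bound.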


		Martin