Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!mailrus!uunet!mcsun!ukc!reading!minster!russell
From: russell@minster.york.ac.uk
Newsgroups: comp.ai.neural-nets
Subject: Re: Really smart systems
Keywords: Complexity, interesting.
Message-ID: <650892116.17086@minster.york.ac.uk>
Date: 17 Aug 90 11:21:56 GMT
References: <3430007@hpwrce.HP.COM>
Reply-To: russell@uk.ac.york.minster  (russell)
Organization: Department of Computer Science, University of York, England
Lines: 56

In article <3430007@hpwrce.HP.COM> kingsley@hpwrce.HP.COM (Kingsley Morse) writes:
>I'll post some
>computational complexity figures for some common algorithms. If you add to
>(correct?) these, please do. The terminology that I propose is that 
>linear computational complexity is better than polynomial which is
>better than exponential. Furthermore, the algorithms' scalability must be
>measured with respect to learning, recall, patterns, and dimensions.
>
>
>Algorithm          Learning                   Recall
>             patterns  dimensions      patterns     dimensions
>------------------------------------------------------------
>Backprop    polynomial     ?           independent  linear
>
>Nearest
>   neighbor linear      linear         linear       linear
>
>Cart        nlogn       exponential    independent      ?
>
>

Well, if you want tuppeny-worth's thrown in, here's mine.

Backprop has NO guarenteed convergence, therefore to quote computational complexity
is misleading.  It may be possible to give an "average case" complexity figure, but
this is so dependent on the initial weight settings as to be not too much use.
Tesauro and Janssens report empirical results between learning time and predicate
order (q) of patterns.  The net has q inputs, 2q nodes in the first layer, fully
connected, and 1 output node.  The task set is the parity function on the q bits.
Leaning times in b-p scale as approx. 4^q.  The task set is of size 2^q so the
training time is about 2^q.  Conclusion is that empirically learning time is of
exponential order.

The perceptron scales as exponential in input size (Hampson and Volper 1986).

(See Neural Network Design and the Complexity of Learning by Judd for more.....I
aint read it all yet, but it's interesting so far.....)

I confess to not understanding quite what "Cart" means in the above table - I can
make an educated guess, however.....

To add another algorithm to the list, the ADAM system, based on the Willshaw net
(i.e. a distributed associative memory) (Austin 1987) has complexity as follows:

Algorithm          Learning                   Recall
	     patterns  dimensions      patterns     dimensions
------------------------------------------------------------
ADAM	    linear    linear(quadratic)	indep.	    linear(nlogn)

brackets refer to abnormal, but permissible, parameterisation. (Beale 1990).

Note that recall may take longer than teaching - but also note that the realm of
use of ADAM means that the multiplicative constants in front of the order terms are
extremely small.

Russell.