Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!ucsd!sdcsvax!beowulf!demers
From: demers@beowulf.ucsd.edu (David E Demers)
Newsgroups: comp.ai.neural-nets
Subject: Re: 3-Layer versus Multi-Layer
Message-ID: <6730@sdcsvax.UCSD.Edu>
Date: 28 Jun 89 18:57:45 GMT
References: <3417@cosmo.UUCP>
Sender: nobody@sdcsvax.UCSD.Edu
Reply-To: demers@beowulf.UCSD.EDU (David E Demers)
Organization: EE/CS Dept. U.C. San Diego
Lines: 33

In article <3417@cosmo.UUCP> jochenru@cosmo.UUCP (Jochen Ruhland) writes:
>During a local meeting here in Germany I heard somebody talking
>about a theorem that a three layer perceptron is capable to
>perform any given In/Out function with an maximum number of hidden
>units in the network.

For "perceptrons", there is no such proof, since multiple layers of
linear units can easily be collapsed into two layers. See, e.g.,
Minsky & Papert, "Perceptrons" (1969).

If, however, units can take on non-linear activations, then it can
be shown that a three-layer network can approximate any
Borel-measurable function to any desired degree of accuracy
(exponential in the number of units, however!). Hal White et al.
have shown this, and have also shown that the mapping is learnable.
Their paper is going to appear this year in Neural Networks, the
journal of the INNS.

The source of this result is frequently listed as the Kolmogorov
superposition theorem. Robert Hecht-Nielsen has a paper about this
theorem in the 1987 Proceedings of the First IEEE Conference on
Neural Networks. The theorem is not constructive, however. It shows
that a function from R^m to R^n can be represented by the
superposition of {some number linear in m & n} bounded, monotonic,
non-linear functions of the m inputs. However, there is no way of
determining these functions...

I am writing all of this from memory; all of my papers are elsewhere
right now... but I know that others have similar results.

Dave
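
The point above about layers of linear units collapsing can be checked with a few lines of Python. This is only a minimal sketch: the weight matrices W1 and W2, the input x, and the helper names matmul and apply_layer are made-up illustrations, not taken from Minsky & Papert or any of the papers mentioned. It shows that feeding an input through two purely linear layers gives the same output as a single layer whose weights are the matrix product.

```python
# Sketch: two stacked layers of purely linear units equal one linear layer.
# W1, W2 and x are arbitrary illustrative values (assumptions, not from any paper).

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def apply_layer(w, x):
    """Apply weight matrix w to input vector x, with no non-linearity."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

W1 = [[1, 2], [0, -1], [3, 1]]   # 2 inputs  -> 3 hidden units
W2 = [[2, 0, 1], [-1, 1, 0]]     # 3 hidden  -> 2 outputs
x = [4, -3]

# Pass the input through the hidden layer, then the output layer.
two_stage = apply_layer(W2, apply_layer(W1, x))

# Collapse both layers into a single weight matrix and apply it once.
collapsed = apply_layer(matmul(W2, W1), x)

print(two_stage, collapsed)
```

Both results agree for any choice of weights, which is why the hidden layer buys nothing unless the units are non-linear.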