Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!usc!jarthur!nntp-server.caltech.edu!tylerh
From: tylerh@nntp-server.caltech.edu (Tyler R. Holcomb)
Newsgroups: comp.ai.neural-nets
Subject: Re: generalization in NN's
Keywords: ldf generalization
Message-ID: <1991Apr3.185120.25801@nntp-server.caltech.edu>
Date: 3 Apr 91 18:51:20 GMT
References: <1991Apr2.205240.24668@milton.u.washington.edu>
Organization: California Institute of Technology, Pasadena
Lines: 37

nealiphc@milton.u.washington.edu (Phillip Neal) writes:

>I have a problem with the ability of a neural net to generalize.
>I have 600 observations of a 6 predictor variable input vector
>to classify these observations into 1 of 4 groups.

(stuff deleted)

>So, what's the deal ? Is my sample size too small ? Are there
>any good papers that cover this kind of problem ?
>I know I am violating the rule of thumb to have 10 times more
>training data than nodes in the net. But hey, data is expensive.

Several comments.

1. I agree with Andy Bereson. A single linear unit will recover
   the MSE-optimal linear discriminant. This would seem to encourage
   using few hidden units. In particular, use linear feed-through
   connections (direct connections from input to output). The
   theoretical benefit and practical utility of this approach
   have been demonstrated by many authors (myself included).

2. Kramer has shown that the underlying structure of a backprop net
   is ill-suited to classification tasks like yours. He suggests
   Radial Basis Function nets (e.g., Moody and Darken,
   _Neural Computation_, vol. 1, pp. 281-294).

3. Neural networks are not the solution to all of the world's
   problems. Maybe your problem really is optimally separated by
   a linear discriminant!

-- 
------------------------------------------------------------
Tyler Holcomb  * "Remember, one treats others with courtesy and respect *
tylerh@juliet  * not because they are gentlemen or gentlewomen, but     *
caltech.edu    * because you are."              -Garth Henrichs         *
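
A minimal sketch of comment 1, offered as an illustration and not as
code from the post: one linear unit trained by gradient descent on
squared error ends up at the same weights as the closed-form
least-squares (MSE-optimal) linear discriminant. Everything here is an
assumption made for the example: the synthetic two-class, two-input
data (the original problem had 6 inputs and 4 groups), the +1/-1
target coding, the learning rate, and the function names.

```python
import random


def make_data(n, seed=0):
    """Hypothetical synthetic data: two classes coded +1/-1, two inputs plus a bias input."""
    rng = random.Random(seed)
    xs, ts = [], []
    for _ in range(n):
        label = rng.choice([-1.0, 1.0])
        # Class means at (+1, +0.5) and (-1, -0.5) with unit-variance noise.
        xs.append([label + rng.gauss(0, 1), 0.5 * label + rng.gauss(0, 1), 1.0])
        ts.append(label)
    return xs, ts


def solve(a, b):
    """Solve a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = s / m[i][i]
    return w


def least_squares(xs, ts):
    """Closed-form MSE-optimal weights from the normal equations X^T X w = X^T t."""
    d = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(d)] for i in range(d)]
    xtt = [sum(x[i] * t for x, t in zip(xs, ts)) for i in range(d)]
    return solve(xtx, xtt)


def train_linear_unit(xs, ts, lr=0.1, epochs=500):
    """Batch gradient descent on mean squared error for a single linear unit."""
    d, n = len(xs[0]), len(xs)
    w = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for x, t in zip(xs, ts):
            err = sum(wi * xi for wi, xi in zip(w, x)) - t
            for i in range(d):
                grad[i] += err * x[i] / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def accuracy(w, xs, ts):
    """Fraction of points whose output sign matches the +1/-1 target."""
    hits = sum((sum(wi * xi for wi, xi in zip(w, x)) > 0) == (t > 0)
               for x, t in zip(xs, ts))
    return hits / len(xs)


if __name__ == "__main__":
    xs, ts = make_data(200, seed=1)
    w_gd = train_linear_unit(xs, ts)
    w_ls = least_squares(xs, ts)
    print("gradient descent:", w_gd)
    print("least squares:   ", w_ls)
    print("training accuracy:", accuracy(w_gd, xs, ts))
```

If a plain linear fit like this already classifies held-out data about
as well as the full net, that is some evidence for comment 3; in a net
with direct input-to-output connections, those weights play the same
role as w here.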