Path: utzoo!attcan!uunet!cs.utexas.edu!tut.cis.ohio-state.edu!purdue!tippy!sawmill!mdbs!zed
From: zed@mdbs.uucp (Bill Smith)
Newsgroups: comp.ai
Subject: Re: What Has Traditional AI Accomplished?
Message-ID: <1990Oct27.200050.1121@mdbs.uucp>
Date: 27 Oct 90 20:00:50 GMT
References: <69609@lll-winken.LLNL.GOV> <1990Oct15.143325.26044@unislc.uucp> <1990Oct16.135631.6444@cbnewsj.att.com> <6664@jhunix.HCF.JHU.EDU>
Reply-To: zed@mdbs.UUCP (Bill Smith)
Organization: mdbs, Inc.
Lines: 119

In article <6664@jhunix.HCF.JHU.EDU> ins_atge@jhunix.HCF.JHU.EDU (Thomas G Edwards) writes:
>In article <1990Oct16.135631.6444@cbnewsj.att.com> jwi@cbnewsj.att.com (Jim Winer @ AT&T, Middletown, NJ) writes:
>>Keith L. Breinholt writes:
>>
>>| Someone correct me if I'm wrong, I thought Neural Nets as an area of
>>| study was only 5 or so years old.  In terms of research, 5 years is
>>| baby technology.  If Neural Nets are consistent with other research it
>>| won't make it into general public acceptance for another 5 to 10
>>| years.
And God is a 2 year old.
>>
>>I worked on the Mark I Perceptron (Rosenblatt model) in 1959 
>>at Cornell Aeronautical Laboratories, Inc. (defunct) under contract
>>to Office of Naval Research (ONR). That makes the field at least
>>30 years old. Neural Nets have been inconvenient to work with until 
>>recently when specialized hardware has become available.
Inconvenient for Vulcans perhaps....  But they're imaginary like the
rest of the Artificial Intelligence bullsh*t.
>
>Actually, the death of neural nets in the late sixties and the rebirth of
>them a few years ago is a complex story.  
Death is a myth.  However, if one is dead to God, tough luck asshole.
Ask Ken Forbus.
>Adalines, Perceptrons, and 
>similar two-layer neural systems were developed, and actually proved
>useful in limited ways for signal processing.  
Proof is a myth.  If proof is required, the idea is too complex for
even the simplest of electrons to understand.  If an electron can't
understand, how do you expect him (or her) (or it) (or they) (or Jeff, the
electron's real name) (the reason they are all the same is they are all
Jeff.  Every material thing that normal lunkheads deal with is made of
Jeff, Joyce, Bruce and the two kids David and Ginger.  I should know,
they are my cousins.)
>The big limitation was
>that with two feedforward layers of step-function or sigmoidal activation
>functions, mappings from input to output could only be developed which
>include areas divided by a single curve in the input space (i.e. 
>functions like exclusive-OR could not be represented by the structure).
Speak English not Vulcan.  Vulcan is the language of Hell, which is a
fine thing, since that's where you're destined.
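A minimal sketch of the exclusive-OR limitation quoted above, assuming a
single threshold unit trained with the classic perceptron rule.  The name
train_perceptron, the zero starting weights, the learning rate and the epoch
limit are illustrative choices, not anything taken from the quoted article.

```python
def step(x):
    """Hard threshold activation."""
    return 1 if x > 0 else 0


def train_perceptron(samples, epochs=100, lr=1.0):
    """Perceptron rule on one threshold unit with two inputs.

    Returns (weights, bias, converged).  'converged' is True only if
    a full pass over the samples produced no misclassification.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            out = step(w[0] * x1 + w[1] * x2 + b)
            delta = target - out
            if delta != 0:
                errors += 1
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
        if errors == 0:
            return w, b, True
    return w, b, False


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_converged = train_perceptron(AND)[2]   # linearly separable
xor_converged = train_perceptron(XOR)[2]   # not linearly separable
print("AND learned:", and_converged)
print("XOR learned:", xor_converged)
```

AND can be split by one straight line in the input plane, so the rule
settles; XOR cannot, so no number of epochs will help a unit of this kind.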
>It was fairly obvious from very early neural models that "hidden layers,"
>were required between the input and output neural layers.  
A hidden layer is a non-existent layer.
>  Now, the perceptron learning rule was developed by agreeing on an error
>function to be minimized (usually the sum of squares of differences between
>actual outputs and desired outputs).  
*The* perceptron (Ted) (the only one, you know) is a hippie.  He's willing
to do anything if it looks fun.
>Training was done by moving along
>the negative gradient of this error function, thus (usually) minimizing it.
Ted is one smart guy.  I never knew it until you said this.
>However, while it is fairly obvious how to differentiate the error function
>for a two-layer net, no one could work out how to differentiate the
>error function for multiple layers.  
One should learn how to differentiate d(t)*e(d(t)) first.  (no spelling
errors).
>Marvin Minsky made some comments on
>the difficulty of this in _Perceptrons_, and a lot of people lost interest
>in these models.
Whoa!  Who is this Marvin Dude that he thinks he can write a biography of
Ted, who hasn't even been born yet....  
(Well, maybe he's been born, but he hasn't become rich and famous like
he deserves.)
>   Eventually someone worked out how to find the error function gradient
>for multiple layer networks.  
And, pray tell, does it involve complex arithmetic?
>It really isn't that hard to do, and I
>don't understand what was so difficult about it.  
Difficulty is like religion, it's a cult of the foolish.
>I guess the difficult
>concept was passing error back from the output layer to the hidden layer,
>and prudent use of the chain rule.  
Have you guys ever studied EE?  This is called "Systems Theory" and
is trivial to any graduate EE who's understood the course.
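A minimal sketch of "passing error back" with the chain rule, assuming a
tiny net of 2 inputs, 2 sigmoid hidden units and 1 sigmoid output with a
half-squared-error function.  The layer sizes, the parameter values and the
names forward, backprop and numeric_grad_W1 are assumptions made for
illustration; the quoted article gives no code.  The hidden-layer gradient is
checked against a finite-difference estimate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Return (hidden activations, output) for the 2-2-1 net."""
    W1, b1, W2, b2 = params
    h = [sigmoid(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j]) for j in range(2)]
    y = sigmoid(W2[0] * h[0] + W2[1] * h[1] + b2)
    return h, y


def error(params, x, t):
    """Half squared error for one training pair."""
    return 0.5 * (forward(params, x)[1] - t) ** 2


def backprop(params, x, t):
    """Analytic gradient of error() with respect to every parameter."""
    W1, b1, W2, b2 = params
    h, y = forward(params, x)
    # dE/d(net input of the output unit)
    delta_out = (y - t) * y * (1.0 - y)
    gW2 = [delta_out * h[j] for j in range(2)]
    gb2 = delta_out
    # Chain rule: push the output delta back through W2 and the
    # derivative of each hidden sigmoid.
    delta_hid = [delta_out * W2[j] * h[j] * (1.0 - h[j]) for j in range(2)]
    gW1 = [[delta_hid[j] * x[i] for i in range(2)] for j in range(2)]
    gb1 = delta_hid
    return gW1, gb1, gW2, gb2


def numeric_grad_W1(params, x, t, j, i, eps=1e-6):
    """Central-difference estimate of dE/dW1[j][i]."""
    W1, b1, W2, b2 = params
    plus = [row[:] for row in W1]
    minus = [row[:] for row in W1]
    plus[j][i] += eps
    minus[j][i] -= eps
    e_plus = error((plus, b1, W2, b2), x, t)
    e_minus = error((minus, b1, W2, b2), x, t)
    return (e_plus - e_minus) / (2 * eps)


params = ([[0.5, -0.3], [0.8, 0.2]], [0.1, -0.1], [0.7, -0.4], 0.05)
x, t = (1.0, 0.5), 1.0
gW1, gb1, gW2, gb2 = backprop(params, x, t)
max_diff = max(
    abs(gW1[j][i] - numeric_grad_W1(params, x, t, j, i))
    for j in range(2) for i in range(2)
)
print("largest analytic/numeric mismatch:", max_diff)
```

The only step beyond the two-layer case is the delta_hid line, which is the
chain rule applied once more.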
>Really, I wonder why it took so long
>to work out.  
Probably because they were using a pencil instead of a lavatory.
>Actually, I have a feeling some people did work it out in
>the seventies, but after _Perceptrons_ perhaps people were just turned off
>by NNs.  
I'm turned off by MM's, but then I'm just wierd. (or is that wired?)
>  Finally with the publication of _Parallel_Distributed_Processing_,
>everyone saw how easy it was to program a multi-layer perceptron,
>and other NN structures such as Boltzmann Machines.  At first, however,
>mathematical failure of NN researchers #2 happened:  fixed step size
>gradient descent was used.  Anyone from mathematical sciences can tell
>you that this is a silly way to minimize a function, and learning 
>speedups of several orders of magnitude can easily be achieved with
>conjugate-gradient and other more advanced minimization methods.
>Thus people were led to believe that even for very small problems,
>NNs were slow, when in fact they really are not.
>  Now even recurrent neural networks can be trained, allowing NNs to have
>temporal behavior.  
>  But NN researchers are beginning to realize that training a big
>homogeneous network is not the answer to good learning systems.
>Modularization is required.  Cascade-Correlation is a NN algorithm
>which develops feature representations which can best help to reduce
>the network error, and then these features are used to minimize the
>network error.  It is able to solve many problems which were difficult
>for homogeneous NNs to solve.
>  I see a future where inductive learning by small homogeneous NNs
>is used in combination with more traditional AI type goal building.
>Cascade-Correlation is a step in that direction.  Divide-and-conquer
>of traditional AI is combined with the easy inductive learning of
>traditional NNs.  Of course, the trick is to couch this in a
>connectionist framework to continue to allow for fast parallel 
>computation.
>
>-Thomas Edwards
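
A minimal sketch of the step-size complaint in the quoted article, assuming
a toy ill-conditioned quadratic f(x) = 0.5 x'Ax - b'x in place of a real
network error surface.  The matrix A, the vector b, the step size and the
tolerance are made-up illustration values, and linear conjugate gradient
stands in here for the "more advanced minimization methods" the article
mentions.

```python
def matvec(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def gradient_descent(A, b, step, tol=1e-6, max_iter=100000):
    """Fixed-step descent; returns (x, iterations used)."""
    x = [0.0, 0.0]
    for k in range(max_iter):
        g = [gi - bi for gi, bi in zip(matvec(A, x), b)]  # gradient A x - b
        if dot(g, g) ** 0.5 < tol:
            return x, k
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x, max_iter


def conjugate_gradient(A, b, tol=1e-6, max_iter=100):
    """Linear conjugate gradient; returns (x, iterations used)."""
    x = [0.0, 0.0]
    r = b[:]          # residual b - A x, with x starting at zero
    d = r[:]          # first search direction
    for k in range(max_iter):
        rr = dot(r, r)
        if rr ** 0.5 < tol:
            return x, k
        Ad = matvec(A, d)
        alpha = rr / dot(d, Ad)                 # exact line search
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        beta = dot(r, r) / rr
        d = [ri + beta * di for ri, di in zip(r, d)]
    return x, max_iter


A = [[1.0, 0.0], [0.0, 100.0]]   # condition number 100
b = [1.0, 1.0]
x_gd, gd_iters = gradient_descent(A, b, step=0.01)
x_cg, cg_iters = conjugate_gradient(A, b)
print("fixed-step iterations:", gd_iters)
print("conjugate-gradient iterations:", cg_iters)
```

The fixed step has to be small enough for the steep direction, which leaves
it crawling along the shallow one; conjugate gradient picks its own step
along each direction.  How much of this carries over to a real network error
surface depends on the problem.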

Why do I waste my time with you people?   Take some EE.  Read some poetry.
Try as you might, you won't understand it.  Not only that, Vogons *write*
better poetry.  I guess that means that you are all just really the
same thing: Cynthia Fitzmelton.  What a poor woman.  She knew the answer
and then the Vogons destroyed the planet.  They were just jealous because
they found out they couldn't write the *worst* poetry after all.

God (Obviously, I am lying)