Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!ucsd!sdcc6!crl!elman
From: elman@crl.ucsd.edu (Jeff Elman)
Newsgroups: comp.ai.neural-nets
Subject: Re: Connectionist Finite State Machines -- description of an architecture
Keywords: simple recurrent networks
Message-ID: <14611@sdcc6.ucsd.edu>
Date: 1 Dec 90 05:55:45 GMT
References: <7982@uwm.edu>
Sender: news@sdcc6.ucsd.edu
Followup-To: elman@crl.ucsd.edu
Organization: University of California, San Diego
Lines: 44
Nntp-Posting-Host: crl.ucsd.edu

In article <7982@uwm.edu> Mark Hopkins writes:
>
>   This is a description of a rather simple architecture that can be used to
>train a neural net to be a finite state machine using only backpropagation.
>

Yes, this is an interesting architecture.  It is a variant of a general
class of networks proposed by Mike Jordan in his 1986 UCSD Ph.D. dissertation.

Several of us have experimented with the architecture you describe.
You might be interested in some reports using this architecture.

I have a 1988 TR called 'Finding structure in time'; a revised
version appeared in the March/April issue of Cognitive Science this year.
This sort of network was applied to a variety of domains in which the
task was prediction.  The challenge for the network was to learn the
underlying dynamics which produced the time series.

Servan-Schreiber, Cleeremans, and McClelland report work using the
same architecture (which they called a simple recurrent network)
to predict a time series which was generated by an FSA.  Although the
net did a good job of representing the states of certain FSA's, 
certain limitations in the SRN were revealed.
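
For anyone who has not seen the architecture written out, here is a
minimal sketch of the forward pass of a simple recurrent network set up
for next-symbol prediction.  It assumes the usual formulation in which
the hidden activations are copied into a set of context units and fed
back on the next time step.  The class name, layer sizes, weight range,
and toy symbol sequence are all made up for illustration, and the
weights are random and untrained, so this shows only the data flow, not
any of the reported results.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class SimpleRecurrentNetwork:
    """Forward pass only. Weights are random and untrained (illustrative)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = matrix(n_hidden, n_in)       # input   -> hidden
        self.w_ctx = matrix(n_hidden, n_hidden)  # context -> hidden
        self.w_out = matrix(n_out, n_hidden)     # hidden  -> output
        self.context = [0.0] * n_hidden          # context units start at zero

    def step(self, x):
        # Hidden units see the current input plus the previous hidden state.
        hidden = []
        for w_i, w_c in zip(self.w_in, self.w_ctx):
            net = sum(w * v for w, v in zip(w_i, x))
            net += sum(w * c for w, c in zip(w_c, self.context))
            hidden.append(sigmoid(net))
        output = [sigmoid(sum(w * h for w, h in zip(row, hidden)))
                  for row in self.w_out]
        # Copy the hidden activations into the context units for the next step.
        self.context = hidden[:]
        return output


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


# Toy sequence over a 3-symbol alphabet (hypothetical data).
net = SimpleRecurrentNetwork(n_in=3, n_hidden=4, n_out=3)
sequence = [0, 1, 2, 1]
outputs = [net.step(one_hot(s, 3)) for s in sequence]
```

In training, each output would be compared against the next symbol in
the sequence and the error backpropagated as in an ordinary feedforward
net.  Note that symbol 1 occurs twice in the toy sequence but yields
different outputs, because the context units differ on the two
occasions; that is what lets the hidden layer act as a state.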

I've used this architecture to predict words in complex sentences
(i.e., sentences with subordinate clauses).  The issue I was interested
in was the ability of the net to model sentences in which there was
(presumably) an underlyingly hierarchical structure--can such networks
represent constituent structure using distributed representations?
The network did in fact learn to do this, but there turn out to be
interesting and, I think, important differences between the state
representation of hierarchy and the more traditional stack
representation.  This work has also been reported in a couple of TR's.

There is actually quite a bit of interesting work that's been reported
with this architecture, including work by Mike Jordan, Mike Gasser &
Chan-Do Lee, Mary Hare, Janet Wiles, St. John & McClelland, Risto
Miikkulainen & Mike Dyer, Gary Cottrell & Fu-Sheng Tsung, Bob Port,
Steve Small, among many others (sorry--I've undoubtedly missed
something important!).


Jeff Elman
Cog Sci/UCSD