Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!ucsd!sdcc6!crl!elman
From: elman@crl.ucsd.edu (Jeff Elman)
Newsgroups: comp.ai.neural-nets
Subject: Re: Connectionist Finite State Machines -- description of an architecture
Keywords: simple recurrent networks
Message-ID: <14611@sdcc6.ucsd.edu>
Date: 1 Dec 90 05:55:45 GMT
References: <7982@uwm.edu>
Sender: news@sdcc6.ucsd.edu
Followup-To: elman@crl.ucsd.edu
Organization: University of California, San Diego
Lines: 44
Nntp-Posting-Host: crl.ucsd.edu

In article <7982@uwm.edu> Mark Hopkins writes:
>
>   This is a description of a rather simple architecture that can be used to
>train a neural net to be a finite state machine using only backpropagation.
>

Yes, this is an interesting architecture.  It is a variant of a general
class of networks proposed by Mike Jordan in his 1986 UCSD Ph.D.
dissertation.  Several of us have experimented with the architecture
you describe.

You might be interested in some reports using this architecture.

I have a 1988 TR called 'Finding structure in time'; a revised version
appeared in the March/April issue of Cognitive Science this year.  This
sort of network was applied to a variety of domains in which the task
was prediction.  The challenge for the network was to learn the
underlying dynamics which produced the time series.

Servan-Schreiber, Cleeremans, and McClelland report work using the same
architecture (which they called a simple recurrent network) to predict
a time series which was generated by an FSA.  Although the net did a
good job of representing the states of certain FSA's, certain
limitations in the SRN were revealed.

I've used this architecture to predict words in complex sentences
(i.e., sentences with subordinate clauses).  The issue I was interested
in was the ability of the net to model sentences in which there was
(presumably) an underlyingly hierarchical structure--can such networks
represent constituent structure using distributed representations?
The network did in fact learn to do this, but there turn out to be
interesting and, I think, important differences between the state
representation of hierarchy and the more traditional stack
representation.  This work has also been reported in a couple of TR's.

There is actually quite a bit of interesting work that's been reported
with this architecture, including work by Mike Jordan, Mike Gasser &
Chan-Do Lee, Mary Hare, Janet Wiles, St. John & McClelland, Risto
Miikkulainen & Mike Dyer, Gary Cottrell & Fu-Sheng Tsung, Bob Port, and
Steve Small, among many others (sorry--I've undoubtedly missed
something important!).

Jeff Elman
Cog Sci/UCSD