Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!newstop!exodus!hanami.Eng.Sun.COM!landman
From: landman@hanami.Eng.Sun.COM (Howard A. Landman)
Newsgroups: comp.ai.neural-nets
Subject: Re: Neural Nets with continuous valued outputs.
Keywords: opt
Message-ID: <1669@exodus.Eng.Sun.COM>
Date: 24 Oct 90 02:37:37 GMT
References: <6327@rnd.GBA.NYU.EDU>
Sender: news@exodus.Eng.Sun.COM
Organization: Sun Microsystems, Mt. View, Ca.
Lines: 46

In article <6327@rnd.GBA.NYU.EDU> hjohar@rnd.GBA.NYU.EDU (unknown) writes:
>Does anyone have any references on designing neural-nets that provide
>continuous valued outputs? I've only seen papers on classifiers.

I have a similar problem.  I've been playing around with the "opt"
program from OGC which does conjugate gradient optimization for
training, but it also assumes a classifier.

Some data that may or may not be of interest.  I wanted to train a
net to play the game of Go, using no particular assumptions on how
to do that.  I have some 300,000 moves from pro games available,
so my first thought was to have one training sample per move.

Unfortunately, the storage requirement was outrageous.  Even though
all my move data is less than 1 MB, the training file required 362
floating point numbers per training sample, and the %f format used
meant that even 0 had to take at least 4 chars ("0.0 "), so the total
training file was around half a gigabyte.  I was able to shrink this
a little by getting the program to use %g format (so 0 could be "0 "),
but it still took around 300 MB on disk, and (more importantly) more
than that in virtual memory (362 * 4 * 300000 = ~434 MB for single
precision, twice that for double precision).  Not too many idle
machines around here have that kind of swap space - my own
workstation has only 70 MB swap for example.  So for really large
training sets, I think any program needs to be architected so that
the training data does not have to be memory resident.
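
As a rough illustration of the two points above (the size arithmetic,
and keeping the training data out of memory), here is a minimal Python
sketch.  It is not how opt stores its data: the fixed-size binary record
layout, the function names, and the batch size are all assumptions made
for this example.  Only the figures of 362 values per sample and 300,000
samples come from the numbers above.

```python
import os
import struct
import tempfile

VALUES_PER_SAMPLE = 362   # from the post: floats per training sample
BYTES_PER_VALUE = 4       # single precision

# Hypothetical on-disk layout: one little-endian float32 record per sample.
RECORD = struct.Struct("<%df" % VALUES_PER_SAMPLE)
RECORD_SIZE = RECORD.size


def text_size(n_samples, chars_per_value):
    """Lower bound on a text training file, in bytes."""
    return n_samples * VALUES_PER_SAMPLE * chars_per_value


def resident_size(n_samples, bytes_per_value=BYTES_PER_VALUE):
    """Memory needed if every sample is held in RAM at once."""
    return n_samples * VALUES_PER_SAMPLE * bytes_per_value


def write_samples(path, samples):
    """Write samples (sequences of 362 floats) as binary records."""
    with open(path, "wb") as f:
        for sample in samples:
            f.write(RECORD.pack(*sample))


def stream_samples(path, batch_size=256):
    """Yield lists of samples; only one batch is in memory at a time."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD_SIZE * batch_size)
            if not chunk:
                break
            yield [RECORD.unpack_from(chunk, offset)
                   for offset in range(0, len(chunk), RECORD_SIZE)]


if __name__ == "__main__":
    print(text_size(300000, 4))      # 434400000 bytes, "0.0 " per value
    print(text_size(300000, 2))      # 217200000 bytes, "0 " per value
    print(resident_size(300000))     # 434400000 bytes as float32
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "train.bin")
        write_samples(path, [[0.0] * VALUES_PER_SAMPLE for _ in range(5)])
        for batch in stream_samples(path, batch_size=2):
            print(len(batch))
```

With a layout like this the disk file is the same size as the in-memory
array would be, but a training pass only needs one batch resident at a
time.  Whether a conjugate gradient trainer can be restructured to read
its data this way is a separate question.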

Eventually I settled on running a smaller sample from the above data.
I selected 18700 training samples randomly.  This only requires 48 MB
(measured) of VM to run.  But it still takes about 12 CPU hours to run
one training cycle on a Sun 4/260, and the program does not save results
after each training cycle (although I plan to fix that), so it's very
easy to run for several days and then lose everything if you have a
power outage.
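
The checkpointing fix mentioned above could look something like the
following Python sketch.  This is not opt's code: the JSON file format,
the function names, and the run_cycle callback are hypothetical, chosen
only to show the idea of saving state after every cycle and resuming
from the last saved cycle after a crash.

```python
import json
import os
import tempfile


def save_checkpoint(path, cycle, weights):
    """Write the state to a temp file, then rename it into place, so a
    power failure mid-write cannot corrupt the previous checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump({"cycle": cycle, "weights": weights}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise


def load_checkpoint(path):
    """Return (cycles_completed, weights), or (0, None) if none saved."""
    if not os.path.exists(path):
        return 0, None
    with open(path) as f:
        state = json.load(f)
    return state["cycle"], state["weights"]


def train(path, total_cycles, init_weights, run_cycle):
    """Run training cycles, saving after each one and resuming if a
    checkpoint already exists.  run_cycle(weights) -> new weights."""
    cycle, weights = load_checkpoint(path)
    if weights is None:
        weights = init_weights
    while cycle < total_cycles:
        weights = run_cycle(weights)
        cycle += 1
        save_checkpoint(path, cycle, weights)
    return weights
```

At 12 CPU hours per cycle the cost of one extra file write per cycle is
negligible, and an outage then loses at most the cycle in progress.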

I'm forced to conclude that running 300,000 training samples on 1500
neurons for the several hundred cycles it may take to get good
convergence is not really practical without a dedicated supercomputer.
Kind of disappointing, actually.
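
To put a rough number on that conclusion, the Python sketch below
extrapolates from the measured 12 CPU hours per cycle on 18700 samples.
It assumes that time per cycle grows linearly with the number of
samples, which is only an assumption, and it uses 300 cycles as a
stand-in for "several hundred"; neither figure is a measurement.

```python
HOURS_PER_CYCLE_SMALL = 12.0   # measured, 18700 samples on a Sun 4/260
SMALL_SAMPLES = 18700
FULL_SAMPLES = 300000


def full_run_hours(cycles):
    """Estimated CPU hours for the full data set, assuming cost per
    cycle scales linearly with sample count."""
    hours_per_cycle = HOURS_PER_CYCLE_SMALL * FULL_SAMPLES / SMALL_SAMPLES
    return hours_per_cycle * cycles


def hours_to_years(hours):
    return hours / (24.0 * 365.0)


if __name__ == "__main__":
    print(round(full_run_hours(1), 1))                    # hours per cycle
    print(round(hours_to_years(full_run_hours(300)), 1))  # years, 300 cycles
```

Under those assumptions one cycle on the full set takes about 190 CPU
hours (roughly 8 days), and 300 cycles come to more than 6 years on the
same machine.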

Does anyone have any suggestions of free (public domain) systems that
can handle these sizes of data and networks without running all year?

--
	Howard A. Landman
	landman@eng.sun.com -or- sun!landman