Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!newstop!exodus!hanami.Eng.Sun.COM!landman
From: landman@hanami.Eng.Sun.COM (Howard A. Landman)
Newsgroups: comp.ai.neural-nets
Subject: Re: Neural Nets with continuous valued outputs.
Keywords: opt
Message-ID: <1669@exodus.Eng.Sun.COM>
Date: 24 Oct 90 02:37:37 GMT
References: <6327@rnd.GBA.NYU.EDU>
Sender: news@exodus.Eng.Sun.COM
Organization: Sun Microsystems, Mt. View, Ca.
Lines: 46

In article <6327@rnd.GBA.NYU.EDU> hjohar@rnd.GBA.NYU.EDU (unknown) writes:
>Does anyone have any references on designing neural-nets that provide
>continuous valued outputs? I've only seen papers on classifiers.

I have a similar problem. I've been playing around with the "opt"
program from OGC, which does conjugate gradient optimization for
training, but it also assumes a classifier.

Some data that may or may not be of interest. I wanted to train a net
to play the game of Go, using no particular assumptions on how to do
that. I have 300,000-some-odd moves of pro games available, so my first
thought was to have one training sample per move. Unfortunately, the
storage requirement was outrageous. Even though all my move data is
less than 1 MB, the training file required 362 floating point numbers
per training sample, and the %f format used meant that even 0 had to
take at least 4 chars ("0.0 "), so the total training file was around
half a gigabyte. I was able to shrink this a little by getting the
program to use %g format (so 0 could be "0 "), but it still took around
300 MB on disk, and (more importantly) more than that in virtual memory
(362 * 4 * 300000 = ~400 MB for single precision, twice that for double
precision). Not too many idle machines around here have that kind of
swap space; my own workstation has only 70 MB of swap, for example. So
for really large training sets, I think any program needs to be
architected so that the training data does not have to be memory
resident.

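To make the arithmetic above concrete, and to illustrate what "not
memory resident" could look like, here is a minimal sketch in Python.
Everything in it is an assumption for illustration: the function names
(total_bytes, write_samples, iter_samples) and the packed
single-precision file layout are hypothetical and have nothing to do
with how "opt" actually stores or reads its training file.

```python
import struct

VALUES_PER_SAMPLE = 362   # floats per training sample, from the post
NUM_SAMPLES = 300_000     # approximate number of moves, from the post


def total_bytes(num_samples, values_per_sample, bytes_per_value):
    """Size of a data set in which every value takes bytes_per_value bytes.

    With bytes_per_value=4 this is the single-precision in-memory size;
    it is also the lower bound for a text file where each number takes
    at least 4 characters, as with "0.0 ".
    """
    return num_samples * values_per_sample * bytes_per_value


def write_samples(path, samples):
    """Write samples as packed little-endian float32 records.

    Hypothetical layout: fixed-size records, no header.
    """
    with open(path, "wb") as f:
        for sample in samples:
            f.write(struct.pack(f"<{len(sample)}f", *sample))


def iter_samples(path, values_per_sample):
    """Yield one sample at a time, so only one record is held in memory."""
    record_size = 4 * values_per_sample  # 4 bytes per float32
    with open(path, "rb") as f:
        while True:
            record = f.read(record_size)
            if len(record) < record_size:
                break  # end of file (or a truncated trailing record)
            yield struct.unpack(f"<{values_per_sample}f", record)
```

For the numbers in the post, total_bytes(NUM_SAMPLES, VALUES_PER_SAMPLE, 4)
comes to 434,400,000 bytes, and twice that with 8-byte doubles. A
training loop built on something like iter_samples would reread the
file on every cycle, trading disk I/O for swap space; whether that
trade is a good one depends on the optimizer, since a method that
needs the whole training set for each gradient evaluation still has to
make a full pass over the file each time.
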
Eventually I settled on running a smaller sample from the above data.
I selected 18,700 training samples randomly. This requires only 48 MB
(measured) of VM to run. But it still takes about 12 CPU hours to run
one training cycle on a Sun 4/260, and the program does not save
results after each training cycle (although I plan to fix that), so
it's very easy to run for several days and then lose everything if you
have a power outage.

I'm forced to conclude that running 300,000 training samples on 1500
neurons for the several hundred cycles it may take to get good
convergence is not really practical without a dedicated supercomputer.
Kind of disappointing, actually.

Does anyone have any suggestions for free (public domain) systems that
can handle these sizes of data and networks without running all year?

--
Howard A. Landman
landman@eng.sun.com -or- sun!landman