Path: utzoo!censor!geac!torsqnt!news-server.csri.toronto.edu!bonnie.concordia.ca!thunder.mcrcim.mcgill.edu!snorkelwacker.mit.edu!spool.mu.edu!sdd.hp.com!usc!jarthur!ucivax!orion.oac.uci.edu!ucsd!sdcc6!beowulf!whart
From: whart@beowulf.ucsd.edu (Bill Hart)
Newsgroups: comp.ai.neural-nets
Subject: New TR Available
Message-ID:
Date: 20 Feb 91 06:36:22 GMT
Sender: news@sdcc6.ucsd.edu
Lines: 89
Nntp-Posting-Host: beowulf.ucsd.edu

The following TR has been placed in the neuroprose archives at Ohio
State University.

  --Bill

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

		UCSD CSE Technical Report No. CS91-180

	 Active selection of training examples for network
		 learning in noiseless environments.

			  Mark Plutowski
  Department of Computer Science and Engineering, UCSD, and

			  Halbert White
  Institute for Neural Computation and Department of Economics, UCSD.

Abstract:

We derive a method for {\sl actively selecting} examples to be used in
estimating an unknown mapping with a multilayer feedforward network
architecture.  Active selection chooses from among a set of available
examples the example which, when added to the previous set of training
examples and learned, maximizes the decrement of network error over the
input space.  In practice, this amounts to incrementally growing the
training set as necessary to achieve the desired level of accuracy.

The objective is to minimize the data requirement of learning.  Towards
this end, we choose a general criterion for selecting training examples
that works well in conjunction with the criterion used for learning,
here, least squares.  Examples are chosen to minimize Integrated Mean
Square Error (IMSE).  IMSE embodies the effects of bias (misspecification
of the network model) and variance (sampling variation due to noise).
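For readers wanting the flavor of the criterion, the standard bias/variance
decomposition behind IMSE can be sketched as follows (a reconstruction from
textbook estimation theory, not quoted from the report; here $g$ denotes the
target mapping, $\hat{g}_n$ the network fit after $n$ training examples, and
$\mu$ the distribution over the input space):

```latex
IMSE \;=\; \int E\!\left[\bigl(g(x) - \hat{g}_n(x)\bigr)^2\right] d\mu(x)
     \;=\; \underbrace{\int \bigl(g(x) - E[\hat{g}_n(x)]\bigr)^2\, d\mu(x)}_{\text{integrated squared bias (ISB)}}
     \;+\; \underbrace{\int E\!\left[\bigl(\hat{g}_n(x) - E[\hat{g}_n(x)]\bigr)^2\right] d\mu(x)}_{\text{integrated variance}}
```

When sampling variation due to noise can be ignored, the variance term drops
out and only the ISB term remains, which motivates the noiseless special case
discussed next.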
We consider a special case of IMSE, Integrated Squared Bias (ISB), to derive
a selection criterion ($\Delta ISB$), which we maximize to select new
training examples.  $\Delta ISB$ is applicable whenever sampling variation
due to noise can be ignored.  We conclude with graphical illustrations of
the method, and demonstrate its use during network training.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=  How to obtain a copy  -=-=-=-=-=-=-=-=-=-=-=-=-=-=

a) via FTP:

To obtain a copy from Neuroprose, either use the "getps" program, or
ftp the file as follows:

% ftp cheops.cis.ohio-state.edu
Connected to cheops.cis.ohio-state.edu.
220 cheops.cis.ohio-state.edu FTP server (Version 5.49 Tue May 9 14:01:04 EDT 1989) ready.
Name (cheops.cis.ohio-state.edu:your-ident): anonymous
331 Guest login ok, send ident as password.
Password: your-ident
230 Guest login ok, access restrictions apply.
ftp> cd pub/neuroprose
250 CWD command successful.
ftp> binary
200 Type set to I.
ftp> get plutowski.active.ps.Z
200 PORT command successful.
150 Opening BINARY mode data connection for plutowski.active.ps.Z (325222 bytes).
226 Transfer complete.
local: plutowski.active.ps.Z remote: plutowski.active.ps.Z
325222 bytes received in 44 seconds (7.2 Kbytes/s)
ftp> quit
% uncompress plutowski.active.ps.Z
% lpr plutowski.active.ps

b) via postal mail:

Requests for hardcopies may be sent to:

	Kay Hutcheson
	CSE Department, 0114
	UCSD
	La Jolla, CA 92093-0114

and should enclose a check for $5.00 payable to "UC Regents."  The
report number is:

	Technical Report No. CS91-180