Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!wuarchive!brutus.cs.uiuc.edu!apple!sun-barr!rutgers!dptg!att!cbnewsl!apr
From: apr@cbnewsl.ATT.COM (anthony.p.russo)
Newsgroups: comp.ai.neural-nets
Subject: Re: : Step Function. Biases are necessary
Summary: Biases are necessary, I'M CONVINCED
Keywords: learning,generalization
Message-ID: <1829@cbnewsl.ATT.COM>
Date: 11 Sep 89 12:22:46 GMT
References: <1060@rex.cs.tulane.edu> <6980@sdcsvax.UCSD.Edu> <2795@arisia.Xerox.COM>
Organization: AT&T Bell Laboratories
Lines: 54

I am absolutely convinced that bias is necessary for generalization.
When any machine is presented with an incomplete set of examples
of a function and asked to generalize, that machine must choose among
all the possible functions that are consistent with the examples.
Its basis for choosing is *DEFINED* as its bias. Without bias, the
choice would be arbitrary, and generalization would be impossible.
Therefore, if we are to define learnability, it must be with respect to
a bias or set of biases.
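
To make the point concrete, here is a small sketch in Python. Everything
in it is my own illustrative assumption (the candidate thresholds, the
helper names, and the "midpoint" preference); it is one possible bias,
not a claim about how any particular net chooses. Three threshold
functions all fit the same two examples, yet they disagree on an unseen
input, so something beyond the data has to pick one.

```python
# Hypothetical candidates: each threshold t defines "positive if x > t".
thresholds = [-2.0, 0.0, 1.5]

def classify(threshold, x):
    """Label x as +1 if it lies above the threshold, else -1."""
    return 1 if x > threshold else -1

def consistent(threshold, data):
    """True if this threshold reproduces every (input, label) example."""
    return all(classify(threshold, x) == y for x, y in data)

# An incomplete set of examples of the target function.
examples = [(-3, -1), (2, 1)]

# Every candidate survives: the data alone cannot choose among them.
survivors = [t for t in thresholds if consistent(t, examples)]

# The survivors disagree on an input that was never shown (x = -1).
predictions_at_minus_one = [classify(t, -1) for t in survivors]

def midpoint_bias(candidates, data):
    """One assumed bias: prefer the threshold nearest the midpoint
    between the closest negative and closest positive examples."""
    neg = max(x for x, y in data if y == -1)
    pos = min(x for x, y in data if y == 1)
    mid = (neg + pos) / 2
    return min(candidates, key=lambda t: abs(t - mid))

chosen = midpoint_bias(survivors, examples)
print("consistent thresholds:", survivors)
print("predictions at x = -1:", predictions_at_minus_one)
print("threshold picked by the midpoint bias:", chosen)
```

A different preference (say, "take the smallest consistent threshold")
would pick a different function from the same examples, which is the
sense in which generalization is relative to the bias.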

Now, a bias can be any definable criterion (this may or may not exclude
"simplicity" as a bias). It can take the form of hardwiring (net
topology) or previously learned information (weights).
This supports someone's comment that learnability should depend
on network architecture.

The question arises: which is more important to learn, biases or
functions? Well, since generalization is not possible without biases,
functions cannot be learned without them (only memorized). So, if you
want a machine to really learn a function (generalize on it), biases
are more important.

Ron Chrisley writes:

> [...] I do not see how the fact that
> generalization = bias implies the optimality of learning the boundary
> conditions, and would be very interested in having you elaborate on why you
> think it might.
> 

My reply to this is to give a simplified, one-dimensional case.

A boundary is most efficiently (read: learning will be faster)
defined by its location in n-dimensional space. Since neural nets
don't learn this way, the next most efficient definition of a
boundary is obtained by giving examples of two items very
(infinitesimally) close to the boundary but on opposite sides of it.
In this way, in 1-D space for example, two points can define a boundary.
Those two points or examples are the most important ones to present
to the net.

If, for instance, we wanted to teach the concept of 
negative and positive (zero is the boundary),
-1 and +1 (in integer space) would be a sufficient set of examples
(given, of course, some definition of bias).
By contrast, examples like -102312341 and +823456 are not very helpful.
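
A rough way to see the difference, sketched in Python. The function
names and the "midpoint" bias are assumptions of mine for illustration
only; a real net would have its own bias. Two examples on opposite sides
of the boundary pin it down only to the interval between them, so the
closer the examples sit to the boundary, the less is left for the bias
to guess.

```python
def boundary_interval(examples):
    """Return (lo, hi): the largest negative-labelled input and the
    smallest positive-labelled input. The true boundary lies between."""
    lo = max(x for x, label in examples if label == -1)
    hi = min(x for x, label in examples if label == 1)
    return lo, hi

def midpoint_estimate(examples):
    """Assumed bias: place the boundary halfway between lo and hi."""
    lo, hi = boundary_interval(examples)
    return (lo + hi) / 2

# The two example sets from the text, labelled -1 (negative) and +1 (positive).
near = [(-1, -1), (1, 1)]
far = [(-102312341, -1), (823456, 1)]

for name, data in (("near", near), ("far", far)):
    lo, hi = boundary_interval(data)
    print(name, "interval width:", hi - lo,
          "boundary estimate:", midpoint_estimate(data))
```

Under this assumed bias the near pair puts the boundary at zero, while
the far pair leaves an interval about a hundred million wide and an
estimate tens of millions away from zero.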


 ~ tony ~

	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	~  	 Tony Russo		" Surrender to the void."	~
	~  AT&T Bell Laboratories					~
	~   apr@cbnewsl.ATT.COM						~
	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~