Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!mips!pacbell.com!att!linac!uwm.edu!csd4.csd.uwm.edu!markh
From: markh@csd4.csd.uwm.edu (Mark William Hopkins)
Newsgroups: comp.ai
Subject: Re: How do games learn? (was  Re: looking for name of a game)
Message-ID: <10757@uwm.edu>
Date: 4 Apr 91 06:38:05 GMT
References: <91090.130516ASNXS@ASUACAD.BITNET> <1991Apr1.130740.19670@IDA.ORG> <16087.27f874ec@levels.sait.edu.au>
Sender: news@uwm.edu
Organization: University of Wisconsin - Milwaukee
Lines: 27

In article <16087.27f874ec@levels.sait.edu.au> marwk@levels.sait.edu.au writes:
>How can one write a program that learns?  I do not see how.
>
>One has parameters in the heuristic evaluation functions, so how can these
>change based on experience?

("Least Squares Method")
By minimizing the square of an error function, using any of the standard
minimization routines covered in numerical methods texts...

It's up to you to determine how to form the error function.  The most natural
way is to search several moves ahead for the optimum backed-up value of the
evaluation function, and use that value as the basis for correcting the
evaluation function as applied to the current board.

You need boundary conditions.  A terminal board is evaluated without search:
its value is fixed at the known outcome, WIN, LOSS or DRAW, and errors are
corrected toward that.
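
A minimal sketch of the idea, not anyone's actual program.  Everything in it is
an assumption made for illustration: the toy game (take 1 to 3 stones, taking
the last stone wins), the one-hot features on pile size mod 4, the names
features/evaluate/search/train/best_move, and the choice of plain stochastic
gradient descent as the minimization routine.  The search value two plies ahead
is the target, the squared difference from the current evaluation is the error,
and the empty pile is the boundary condition.

```python
import random

MOVES = (1, 2, 3)  # hypothetical toy game: take 1-3 stones, last stone wins


def features(n):
    # Assumed feature set: one-hot encoding of the pile size modulo 4.
    return [1.0 if n % 4 == r else 0.0 for r in range(4)]


def evaluate(w, n):
    # Value of pile n for the player about to move.
    if n == 0:
        return -1.0  # boundary condition: opponent took the last stone (LOSS)
    return sum(wi * fi for wi, fi in zip(w, features(n)))


def search(w, n, depth):
    # Depth-limited negamax; leaves are scored by the evaluation function.
    if n == 0 or depth == 0:
        return evaluate(w, n)
    return max(-search(w, n - m, depth - 1) for m in MOVES if m <= n)


def train(steps=20000, rate=0.05, depth=2, max_pile=20, seed=0):
    rng = random.Random(seed)
    w = [0.0] * 4
    for _ in range(steps):
        n = rng.randint(1, max_pile)
        target = search(w, n, depth)      # value backed up from the search
        error = target - evaluate(w, n)   # error on the current board
        # One gradient step on 0.5 * error**2, treating target as a constant.
        for i, f in enumerate(features(n)):
            w[i] += rate * error * f
    return w


def best_move(w, n):
    # Greedy one-ply choice using the trained evaluation function.
    return max((m for m in MOVES if m <= n), key=lambda m: -evaluate(w, n - m))
```

Calling w = train() and then best_move(w, 7) gives a quick check: in this toy
game the winning move is the one that leaves a multiple of four.  The only
outcome knowledge supplied is the value of the empty pile; the rest propagates
backward through the search targets.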

Reinforcement learning uses more complicated techniques to accomplish the
same goal of training a predictor.

Also, boundary conditions don't always prove necessary.  It is sometimes
possible to train the program to learn what constitutes a WIN, LOSS or DRAW
board from zero knowledge.

I don't think Samuel's checker-playing program used any boundary
conditions...