Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!mips!pacbell.com!att!linac!uwm.edu!csd4.csd.uwm.edu!markh
From: markh@csd4.csd.uwm.edu (Mark William Hopkins)
Newsgroups: comp.ai
Subject: Re: How do games learn? (was Re: looking for name of a game)
Message-ID: <10757@uwm.edu>
Date: 4 Apr 91 06:38:05 GMT
References: <91090.130516ASNXS@ASUACAD.BITNET> <1991Apr1.130740.19670@IDA.ORG> <16087.27f874ec@levels.sait.edu.au>
Sender: news@uwm.edu
Organization: University of Wisconsin - Milwaukee
Lines: 27

In article <16087.27f874ec@levels.sait.edu.au> marwk@levels.sait.edu.au writes:
>How can one write a program that learns? I do not see how.
>
>One has parameters in the heuristic evaluation functions, so how can these
>change based on experience?

("Least Squares Method") By minimizing the square of an error function, using
any of the standard minimization routines covered in numerical techniques
texts...

It's up to you to determine how to form the error function. The most natural
way is to use a search technique to find an optimum value for the evaluation
function looking several moves ahead, and to use that value as the basis for
the corrections made to the evaluation function applied to the current board.

You need boundary conditions. On a terminal board, evaluation is done without
search, and the error correction is made toward the known outcome: WIN, LOSS
(or DRAW).

Reinforcement learning uses more complicated techniques to accomplish the
same goal of training a predictor.

Also, boundary conditions do not always prove necessary. It is sometimes
possible to train the program to learn what constitutes a WIN, LOSS or DRAW
board from zero knowledge. I don't think Samuel's checker-playing program
used any boundary conditions...
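
The search-backed least-squares correction described above can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of any particular program: the toy game (take 1 to 3 stones from a pile, whoever takes the last stone wins), the one-hot features on the pile size mod 4, and the names `features`, `evaluate`, `search` and `train` are all hypothetical choices made for the example. The terminal test in `search` plays the role of the boundary condition, and the weight update is a plain gradient step on the squared error between the searched value and the static evaluation.

```python
# Sketch: tune a linear evaluation function by least squares, using a
# shallow lookahead search as the training target.
# ASSUMPTIONS (not from the post): the toy game, the features, all names.

def features(n):
    # Hypothetical features: one-hot encoding of the pile size mod 4.
    f = [0.0] * 4
    f[n % 4] = 1.0
    return f

def evaluate(w, n):
    # Static evaluation from the point of view of the player to move.
    return sum(wi * fi for wi, fi in zip(w, features(n)))

def search(w, n, depth):
    # Negamax lookahead. Boundary condition: an empty pile means the
    # opponent took the last stone, so the player to move has lost.
    if n == 0:
        return -1.0
    if depth == 0:
        return evaluate(w, n)
    return max(-search(w, n - k, depth - 1) for k in (1, 2, 3) if k <= n)

def train(w, positions, depth=2, rate=0.1, sweeps=200):
    # Repeatedly move the static evaluation of each position toward the
    # value found by searching `depth` moves ahead.
    for _ in range(sweeps):
        for n in positions:
            error = search(w, n, depth) - evaluate(w, n)
            for i, fi in enumerate(features(n)):
                w[i] += rate * error * fi  # gradient step on 0.5 * error**2
    return w

weights = train([0.0] * 4, range(1, 21))
```

In this toy game a pile that is a multiple of 4 is lost for the player to move, so after training `weights[0]` should be negative and the other three weights positive. Only the positions near the end of the game touch the boundary condition directly; the rest are corrected toward searched values that themselves rest on the current weights.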