Xref: utzoo rec.games.misc:15362 comp.ai:8946
Newsgroups: rec.games.misc,comp.ai
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!IDA.ORG!rlw
From: rlw@IDA.ORG (Richard Wexelblat)
Subject: Re: How do games learn? (was  Re: looking for name of a game)
Message-ID: <1991Apr4.200856.3341@IDA.ORG>
Reply-To: rlw@IDA.ORG.UUCP (Richard Wexelblat)
Organization: IDA, Alexandria, VA
References: <91090.130516ASNXS@ASUACAD.BITNET> <1991Apr1.130740.19670@IDA.ORG> <16087.27f874ec@levels.sait.edu.au>
Date: Thu, 4 Apr 91 20:08:56 GMT

In article <16087.27f874ec@levels.sait.edu.au> marwk@levels.sait.edu.au writes:
>How can one write a program that learns?  I do not see how.
>
>One has parameters in the heuristic evaluation functions, so how can these
>change based on experience?

If you define learning as "plays better over time, based on experience,"
then the trivial algorithm I used was, in short form, as follows.

Assume a set of numeric values that are combined in some way to
determine the value of a move.  Determine in a suitable way (see below)
that in a given situation Move B (which the program didn't make) would
have been better than Move A (which the program DID make).  Bump the
values by a small delta to increase the likelihood of B being chosen
the next time an A/B choice is encountered.
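
The bump described above can be sketched roughly as follows.  This is
only an illustration under assumptions not stated in the post: the
values are taken to be weights in a simple weighted sum of per-move
features, and each weight is nudged in proportion to how much Move B
has of that feature compared with Move A.  The names evaluate, bump,
move_a, move_b and the numbers are all made up for the example.

```python
def evaluate(weights, features):
    """Value of a move: a weighted sum of its features (an assumed form)."""
    return sum(w * f for w, f in zip(weights, features))


def bump(weights, feats_a, feats_b, delta=0.1):
    """Return new weights nudged so that B scores higher relative to A.

    Each weight moves by delta times (B's feature minus A's feature),
    so features that B has more of gain weight and the rest lose it.
    """
    return [w + delta * (fb - fa)
            for w, fa, fb in zip(weights, feats_a, feats_b)]


# Hypothetical situation: two features per move, both weighted 1.0.
weights = [1.0, 1.0]
move_a = [3.0, 0.0]   # the move the program made
move_b = [1.0, 1.0]   # the move judged (by some outside means) to be better

# Keep bumping until the program would pick B over A in this situation.
bumps = 0
while evaluate(weights, move_a) >= evaluate(weights, move_b):
    weights = bump(weights, move_a, move_b)
    bumps += 1

print("bumps needed:", bumps)
print("weights now:", weights)
```

A small delta means one correction does not swamp what was learned
from earlier games; how small is a tuning matter that the sketch does
not try to settle.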

How to determine a better move?  Ask an expert... or play out all
variations of an endgame from a certain point... or play a fixed
strategy against a randomly modified copy of it.  There are many
variants based on having a program play against itself.
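
The "fixed vs. randomly modified" idea might look something like the
sketch below.  It is not the program described in this post.  The
function play_match is a hypothetical stand-in for playing a real
game: it simply declares the winner to be whichever weight set is
closer to a made-up TARGET, so that the example runs on its own.  The
names self_play, skill, TARGET and the step size are likewise
assumptions.

```python
import random

# Made-up "ideal" weights; a real program would not know these and
# would find out who is better only by actually playing games.
TARGET = [2.0, -1.0]


def skill(weights):
    """Stand-in measure of playing strength: closer to TARGET is better."""
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))


def play_match(challenger, champion):
    """Hypothetical game: return True if the challenger wins."""
    return skill(challenger) > skill(champion)


def self_play(weights, rounds=200, step=0.2, seed=0):
    """Pit the current weights against randomly modified copies.

    Whenever the modified copy wins, it becomes the new fixed strategy.
    """
    rng = random.Random(seed)
    champion = list(weights)
    for _ in range(rounds):
        challenger = [w + rng.uniform(-step, step) for w in champion]
        if play_match(challenger, champion):
            champion = challenger
    return champion


start = [0.0, 0.0]
learned = self_play(start)
print("start skill:  ", skill(start))
print("learned skill:", skill(learned))
```

In a real game with any luck in it, a single match is a noisy judge,
so one would presumably play several games per challenger before
replacing the champion.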

-- 
--Dick Wexelblat  (rlw@ida.org) 703 845 6601
  Can you accept an out of state sanity check?