Asri-unix.1110 net.chess utzoo!decvax!ucbvax!ARPAVAX:C70:sri-unix!YODER@USC-ECLB Sat Mar 27 17:59:34 1982 Number of possible chess positions, plus a question First the question: what is the maximum possible number of legal moves one can have in a chess position? I consider promotions to different pieces to be different moves. I seem to recall this appearing in Scientific American (Mathematical Games section) sometime, but don't remember when. Reasonable upper bounds would be equally useful (my idea of reasonable is that the provable bound be not more than about 10% higher than the number achieved in an exhibited position). A while back I suggested a program for finding a better upper bound on the number of possible chess positions. Jerry Wedekind looked at it and found (correctly) that it isn't particularly feasible as stated. Hence a new program, which hopefully is. What we wish to do is to account for the fact that only some pawn structures are possible, and that some piece distributions aren't possible, e.g. ones with more than 16 of one color. It turns out that I. J. Good came up with estimates for this sort of thing in 1968 ("A Five Year Plan for Automatic Chess," in Machine Intelligence II, E. Dale & D. Michie (Eds.), New York: American Elsevier, 1968, pp. 89-118). In his appendix E, he says: "Shannon (1950, p. 260) states that the number of possible chess positions is of the general order of 64!/(32!*8!^2*2!^6) or roughly 10^43. The formula indicates that he was assuming that no pawn had been promoted. In Good (1951), I calculated that the number of positions in which no pawn has been promoted, and there are no doubled pawns, is less than 2*10^39. The number of positions in which no capture has occurred is about 10^32. Allowing for all possibilities the number is less than 64*63 sigma C(62,r)*10^r 0 <= r <= 30 which is about 2*10^50, but if the number of moves since the last capture or pawn move is regarded as part of the position, the upper bound is 10^52. I should say that 10^46 is correct for the total, within a factor of a thousand.... Conceivably a good estimate of the total number could be made if a very comprehensive categorisation of chess positions were first produced, in which positions would fall into clumps of trillions of positions each. Another approach would be to choose a large number of dispositions for the pieces, without pawns, and, for each of them estimate the number of ways of distributing the pawns legally -- in effect a method of stratified sampling." Unfortunately, the Good(1951) cited is listed in the references as a "Private communication to Mr. Leon Jacobson". It is curious that he considers the notion of including the number of moves since the last capture or pawn move in a position, but not whose turn it is to move (judging from his formula). In the following, I shall assume that who is on move is part of the position. As a preliminary, I wave my hands and claim that castling and e.p. capture do not add enough positions to the total to materially affect the calculations. I am not sufficiently interested in this question to wade through the details. (For castling the claim is pretty solid, for en passant somewhat less so. The idea would presumably be to show that the added positions were on the order of k/4096 of the total for small k.) Now let us improve on Good's formula: N < 2 * K0 * sigma C(62,r) sigma C(r,w) * 5^w * 5^(r-w) (0 <= r <= 30) (r-15 <= w <= 15) Here r = total number of pieces other than kings, w = number of white non-kings, and K0 = 4*60 + 24*58 + 36*55 = 3622 is the number of possible king configurations. The last two terms collapse to 5^r, which is independent of w and so can be pulled out of the inner summation. When this is done, we then have N < 7244 * sigma1 C(62,r)*5^r sigma2 C(r,w). By using the identity C(n,k) = C(n-1,k) + C(n-1,k-1) we at once see that the inner sum is largest for r = 15 and is decreasing with decreasing r (although the value for r = 14 is the same as for r = 15). Using C(30,15) as an upper bound for sigma2 we get the stage 0 estimate of N < 7244 * C(62,30) * C(30,15) * (5/4)*5^30 which is less than 2^166, assuming I flubbed no arithmetic. The program is to do special-case analysis on the terms in the above formula, starting with those for large values of r. This works well because each term is less than 1/5 that of the term above it (the terms with many pieces dominate those with few), and because (luckily) the terms with large r are easiest to analyze. Stage 1 is easily illustrated; I leave the harder parts of the program as an exercise to the reader, in the immemorial tradition of all writers on problems. If r = 30, all pieces are on the board. This implies that all pawns are on their original file, and are blocked by the opposing pawn, and hence that no promotions have happened. Thus the composition of white's and black's forces is known exactly. On each file, there are 15 possible configurations for the two pawns on that file, so the number of positions is bounded by 2 * 15^8 * (48!/32!*2!^6). A gross overestimate for this which is good enough for my purposes is 2 * (2^4)^8 * (2^6)^16 / 2^6 = 2^123. This is negligible in comparison with the r = 29 term, so we at once have improved the estimate to 2^164. I suspect that r = 29 and (with patience) r = 28 could also be done by hand, and that the actual information-theoretic minimum is quite a bit less than 160 bits. This jibes well with Good's opinion (10^46 within a factor of 1000 translates to 2^154[+-10] since he was ignoring who was to move). Based on the above analysis, Good's estimate of the number of positions in which no capture has occurred seems to be low. Further refinements to the above may be possible, for example using the fact that pawns cannot be on the 1st or 8th ranks to make a more accurate but more complicated summation. If anyone actually carries out such calculations, I would be interested in the results. -------