Bayesian Game Tree Search

Coherent Inference on Optimal Play in Game Trees [pdf]
Philipp Hennig, David Stern and Thore Graepel
Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS), 2010, Chia Laguna Resort, Sardinia, Italy.
Journal of Machine Learning Research: W&CP 9 (2010), 326-333. [link]

The task of playing a two-player, zero-sum, turn-based game (such as Chess or Go) optimally is simultaneously one of the oldest and one of the most challenging problems of computer science (human masters still outperform computers in Go, for example). It is an instance of the Tree Search Problem.

Solving such search problems exactly requires a search through the entire tree, whose size is exponential in the search depth (clever pruning techniques, such as alpha-beta search, help in practice, but their worst-case complexity remains exponential). Because real game trees can be very large (Go has about 10^400 nodes, vastly more than there are atoms in the Universe), it is physically impossible to solve them exactly.

However, it turns out that in most interesting tree search problems, winning and losing terminal positions are not uniformly distributed in the tree. A result due to Judea Pearl shows that uniformly random leaf values in AND/OR trees (the structure underlying two-player games) lead to an almost surely determined game outcome. This would make for very boring games, so real game trees cannot be random. In fact, although it is rarely mentioned explicitly, contemporary Monte Carlo tree search algorithms already exploit this structure, albeit not optimally.
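Pearl's determinacy effect can be illustrated with a short calculation (my own sketch, not taken from the paper). For a win/loss game with branching factor 2, if leaves win with probability p independently, then one MAX ply over one MIN ply maps p to f(p) = 1 - (1 - p^2)^2. Iterating f drives the root win probability to 0 or 1 as depth grows, away from the unstable fixed point p* = (sqrt(5) - 1)/2 ≈ 0.618:

```python
# Iterate the two-ply recursion for a random binary AND/OR tree:
# a MIN node wins only if both children win, a MAX node wins if
# either child wins. Leaves win i.i.d. with probability p.

def two_ply(p):
    p_min = p * p                   # MIN node: both children must win
    return 1 - (1 - p_min) ** 2     # MAX node: at least one child wins

def root_win_prob(p_leaf, plies):
    p = p_leaf
    for _ in range(plies):
        p = two_ply(p)
    return p

# Below the fixed point ~0.618 the root is almost surely a loss,
# above it almost surely a win: the outcome is determined.
for p0 in (0.55, 0.65):
    print(p0, root_win_prob(p0, 20))
```

Running this shows the root win probability collapsing to essentially 0 for p0 = 0.55 and to essentially 1 for p0 = 0.65, i.e. a deep uniformly random game is already decided before play begins.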

In this work, we present an explicit generative probabilistic model for the values of leaf nodes in game trees, along with an approximate inference algorithm, which can leverage data from non-optimal, random play (so-called rollouts) to infer probability distributions over the value of any node under optimal play. Most importantly, both the inference from rollouts and the evaluation of the belief over optimal play have complexity linear in the depth and branching factor of the tree (not exponential in the depth, as for exact algorithms). So, analogous to the way humans think about games, inference in our model depends more crucially on the amount of data collected than on the size of the tree itself.
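As a point of reference for the kind of data such methods condition on, here is a hedged sketch of uniformly random rollouts on the same kind of toy tree as above. This is emphatically not the paper's inference algorithm, only the raw Monte Carlo evidence; note how the random-play average differs from the optimal-play value, which is exactly the gap a model of optimal play must bridge:

```python
import random

def rollout(node, rng):
    # Descend by picking children uniformly at random until a leaf payoff.
    while not isinstance(node, int):
        node = rng.choice(node)
    return node

def rollout_mean(node, n, seed=0):
    # Average payoff under uniformly random play by both players.
    rng = random.Random(seed)
    return sum(rollout(node, rng) for _ in range(n)) / n

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
# Random play averages over all nine leaves (~6.2), whereas the minimax
# value of this tree is 3: rollout data alone is a biased signal about
# optimal play, and inference must correct for that bias.
print(rollout_mean(tree, 10_000))
```

Each rollout costs only depth-many random choices, which is why methods built on rollout data can scale with the amount of data rather than the size of the tree.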