### Bayesian Game Tree Search

*Coherent Inference on Optimal Play in Game Trees* [pdf]

Philipp Hennig, David Stern and Thore Graepel

Proceedings of the 13th International Conference on Artificial
Intelligence and Statistics (AISTATS), 2010, Chia Laguna Resort,
Sardinia, Italy. **Journal of Machine Learning
Research: W&CP 9 (2010), 326-333.** [link]

The task of playing a zero-sum, turn-based, two-player game (such as Chess or Go) optimally is simultaneously one of the oldest and one of the most challenging problems in computer science (human masters still outperform computers in Go, for example). It is an instance of the Tree Search Problem.

Solving such search problems exactly requires a search through the entire tree, whose size grows exponentially with depth (while there are clever logical tricks, such as alpha-beta search, their worst-case complexity remains exponential). Because real game trees can be very large (Go has about 10^400 nodes, far more than there are atoms in the Universe), it is physically impossible to solve them exactly.
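To make the exact approach concrete, here is a minimal sketch of minimax search with alpha-beta pruning on a toy tree. The nested-list encoding and function name are illustrative, not from the paper; the point is that even with pruning, the search must in the worst case visit a number of nodes exponential in the depth.

```python
def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    """Minimax value of a toy game tree with alpha-beta pruning.

    Leaves are numeric payoffs (from the maximizing player's view);
    internal nodes are lists of children.
    """
    if not isinstance(node, list):          # leaf: return its payoff
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:               # beta cutoff: opponent avoids this branch
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:               # alpha cutoff
                break
        return value

# Depth-2 example with branching factor 3:
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree))  # → 3
```

Pruning skips branches that cannot change the root value, but the savings are only a constant-factor (or, at best, square-root) reduction of the exponential search space.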

However, it turns out that in most interesting tree search problems, winning and losing terminal positions are not uniformly distributed in the tree. A result due to Judea Pearl shows that uniformly random structure in AND/OR trees, such as game trees, leads to an almost surely determined game outcome. This would make for a very boring real game, so real game trees cannot be random. In fact, although it is rarely mentioned explicitly, contemporary Monte Carlo tree search algorithms already exploit this structure, though not optimally.
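The Monte Carlo idea mentioned above can be sketched in a few lines: estimate a position's value by averaging the outcomes of many games played out with uniformly random moves. The toy Nim-like game and all names here are hypothetical, chosen only to show the mechanic.

```python
import random

def random_rollout(stones, player=1):
    """Play a toy take-away game (remove 1-3 stones per turn, taking the
    last stone wins) with uniformly random moves; return the winner (+1/-1)."""
    while stones > 0:
        stones -= random.choice([1, 2, 3][:min(3, stones)])
        if stones == 0:
            return player           # this player took the last stone
        player = -player            # alternate turns
    return player

def rollout_value(stones, n=5000):
    """Average outcome under random play -- a (biased) proxy for the
    value under optimal play."""
    return sum(random_rollout(stones) for _ in range(n)) / n
```

Such rollout averages are the data that Monte Carlo tree search algorithms consume; the bias they carry (random play is not optimal play) is exactly what the model in this paper accounts for.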

In this work, we present an explicit generative probabilistic
model for the values of leaf nodes in game trees, along with an
approximate inference algorithm, which can leverage data from
non-optimal, random play (so-called *rollouts*) to infer
probability distributions over the value of any node under
*optimal* play. Most importantly, both the inference from
rollouts and the evaluation of the belief over optimal play have
complexity linear in the depth and branching factor of the tree
(not exponential in the depth, as for exact
algorithms). So, analogous to the way humans think about games,
inference in our model depends more crucially on the amount of
data collected than on the size of the tree itself.
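To illustrate the flavor of this idea (and only the flavor: the paper's actual model is a generative Gaussian model over leaf values with approximate propagation through min/max operations, which is richer than this), here is a deliberately simplified sketch maintaining a Beta posterior over a single node's win probability. Each rollout outcome updates the belief in O(1), so the cost grows with the amount of data, not with the size of the tree.

```python
from math import sqrt

class BetaBelief:
    """Beta posterior over a win probability, updated from rollout outcomes.
    A hypothetical simplification, not the paper's Gaussian model."""

    def __init__(self, alpha=1.0, beta=1.0):    # Beta(1,1) = uniform prior
        self.alpha, self.beta = alpha, beta

    def update(self, win: bool):
        """Conjugate update from one rollout result."""
        if win:
            self.alpha += 1
        else:
            self.beta += 1

    def mean(self):
        return self.alpha / (self.alpha + self.beta)

    def std(self):
        a, b = self.alpha, self.beta
        return sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

belief = BetaBelief()
for outcome in [True, True, False, True]:       # four rollout results
    belief.update(outcome)
print(round(belief.mean(), 3))  # → 0.667
```

The posterior standard deviation shrinks as rollouts accumulate, giving the kind of calibrated uncertainty over node values that a Bayesian tree search can use to decide where to sample next.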