Bayesian Non-linear Modeling for the
Energy Prediction Competition
David J C MacKay
Bayesian probability theory provides a unifying framework for data
modeling. A model space may include numerous control parameters which
influence the complexity of the model (for example regularisation constants).
Bayesian methods can automatically set such parameters so that the
model becomes probabilistically well-matched to the data.
The 1993 energy prediction competition
involved the prediction of a series of building energy loads from a
series of environmental input variables. Non-linear regression using
`neural networks' is a popular technique for such modeling tasks.
Since it is not obvious how large a time-window of inputs is
appropriate, or what preprocessing of inputs is best, this can be
viewed as a regression problem in which there are many possible input
variables, some of which may actually be irrelevant to the prediction
of the output variable. Because a finite data set will show random
correlations between the irrelevant inputs and the output, any
conventional neural network (even with `weight decay') will not set
the coefficients for these junk inputs to zero. Thus the irrelevant
variables will hurt the model's performance.
The Automatic Relevance Determination (ARD) model puts a prior over
the regression parameters which embodies the concept of relevance.
This is done in a simple and `soft' way by introducing multiple
`weight decay' constants, one `alpha' associated with each input.
Using Bayesian methods, the decay rates for junk inputs are
automatically inferred to be large, preventing those inputs from
causing significant overfitting.
An entry using the ARD model won the prediction competition by a significant margin.
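The ARD idea described above can be illustrated with a minimal sketch. It assumes a linear-in-parameters model rather than the neural networks used for the competition entry, and it uses the standard evidence-framework re-estimation formulas (gamma_i = 1 - alpha_i * Sigma_ii, alpha_i = gamma_i / mu_i^2). The function name `ard_linear_regression`, the synthetic data, and all numerical settings are illustrative assumptions, not details taken from the entry itself.

```python
import numpy as np


def ard_linear_regression(X, y, n_iter=200, alpha_max=1e9):
    """Re-estimate one decay constant (alpha) per input. Hypothetical helper.

    Model assumed here: y = X @ w + Gaussian noise with precision beta,
    with prior w_i ~ Normal(0, 1 / alpha_i).
    """
    n, d = X.shape
    alpha = np.ones(d)   # one 'weight decay' constant per input
    beta = 1.0           # noise precision (inverse noise variance)
    XtX = X.T @ X
    Xty = X.T @ y

    def posterior(alpha, beta):
        # Gaussian posterior over the weights for fixed alpha and beta.
        cov = np.linalg.inv(np.diag(alpha) + beta * XtX)
        mean = beta * cov @ Xty
        return mean, cov

    for _ in range(n_iter):
        mean, cov = posterior(alpha, beta)
        # gamma_i measures how well the data determine weight i (0 to 1).
        gamma = 1.0 - alpha * np.diag(cov)
        # Small epsilon avoids division by zero; clipping keeps alpha finite.
        alpha = np.clip(gamma / (mean ** 2 + 1e-12), 1e-9, alpha_max)
        resid = y - X @ mean
        beta = (n - gamma.sum()) / (resid @ resid)

    mean, _ = posterior(alpha, beta)
    return mean, alpha, beta


# Synthetic data (an assumption for illustration): 2 relevant inputs, 4 junk.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
w_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.0])
y = X @ w_true + 0.3 * rng.normal(size=200)

w_mean, alpha_hat, beta_hat = ard_linear_regression(X, y)
print("posterior mean weights:", np.round(w_mean, 3))
print("inferred alphas:", np.round(alpha_hat, 2))
```

On data like this, the alphas inferred for the four junk inputs should come out far larger than those for the two relevant inputs, so the junk weights are shrunk toward zero without being switched off by hand.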
AUTHOR ="D. J. C. MacKay",
TITLE ="Bayesian non-linear modelling for the prediction
BOOKTITLE ="ASHRAE Transactions, V.100, Pt.2",
ADDRESS ="Atlanta Georgia",
}