The data processing theorem states that data processing can only destroy information.
Prove this by considering an ensemble *WDR* in which *w* is the
state of the world, *d* is data gathered, and *r* is the
processed data, so that these three variables form
a Markov chain

*w* → *d* → *r*,

that is, the probability *P*(*w*,*d*,*r*) can be written
as

*P*(*w*,*d*,*r*) = *P*(*w*) *P*(*d*|*w*) *P*(*r*|*d*).

Show that the information that *R* conveys about *W*, *H*(*W*;*R*),
is less than or equal to the information that *D* conveys about *W*,
*H*(*W*;*D*).
Incidentally, this theorem is as much a caution about our definition
of 'information' as it is a caution about data processing!
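
Before attempting the proof, the inequality can be checked numerically. The sketch below is an illustration only, not a proof: it draws a random distribution for *w* and random conditional distributions for *d* given *w* and for *r* given *d*, so that the three variables form a chain by construction, and then compares *H*(*W*;*R*) with *H*(*W*;*D*). The function names, alphabet sizes, and seeds are hypothetical choices made for this example.

```python
import math
import random


def random_dist(n, rng):
    """Return a random probability vector of length n."""
    xs = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(xs)
    return [x / total for x in xs]


def mutual_information(joint):
    """Mutual information in bits of a joint distribution given as a 2-D list."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi


def chain_informations(n_w=3, n_d=4, n_r=3, seed=0):
    """Return (H(W;D), H(W;R)) for a randomly generated chain w -> d -> r."""
    rng = random.Random(seed)
    p_w = random_dist(n_w, rng)
    p_d_given_w = [random_dist(n_d, rng) for _ in range(n_w)]
    # r depends on d only, which is what makes the three variables a chain.
    p_r_given_d = [random_dist(n_r, rng) for _ in range(n_d)]

    # P(w, d) = P(w) P(d | w)
    joint_wd = [[p_w[w] * p_d_given_w[w][d] for d in range(n_d)]
                for w in range(n_w)]
    # P(w, r) = sum over d of P(w) P(d | w) P(r | d)
    joint_wr = [[sum(joint_wd[w][d] * p_r_given_d[d][r] for d in range(n_d))
                 for r in range(n_r)]
                for w in range(n_w)]
    return mutual_information(joint_wd), mutual_information(joint_wr)


if __name__ == "__main__":
    for seed in range(5):
        i_wd, i_wr = chain_informations(seed=seed)
        print(f"seed {seed}: H(W;D) = {i_wd:.4f} bits, H(W;R) = {i_wr:.4f} bits")
```

In every run the second value should not exceed the first, up to floating-point rounding. Making *r* a deterministic, invertible function of *d* gives equality, which is a useful hint about when processing loses nothing.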

Sat May 10 23:05:10 BST 1997