The data processing theorem states that data processing can only destroy information.
Prove this by considering an ensemble WDR in which w is the
state of the world, d is data gathered, and r is the
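processed data, so that these three variables form a Markov chain
\[ w \rightarrow d \rightarrow r, \]
that is, the probability P(w,d,r) can be written as
\[ P(w,d,r) = P(w)\, P(d \mid w)\, P(r \mid d). \]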
Show that the information that R conveys about W, H(W;R), is less than or equal to the information that D conveys about W, H(W;D). Incidentally, this theorem is as much a caution about our definition of `information' as it is a caution about data processing!
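Before attempting the proof, it can help to check the claim numerically. The sketch below is not a proof and not part of the exercise: it assumes small discrete alphabets and the Markov factorization P(w,d,r) = P(w) P(d|w) P(r|d), draws random distributions, and compares H(W;R) with H(W;D) in bits. The helper names (`random_channel`, `mutual_information`, `compose`, `check_once`) and the alphabet sizes are illustrative choices only.

```python
import math
import random


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [x / total for x in weights]


def random_channel(n_in, n_out, rng):
    """Row-stochastic matrix: row i is a random conditional P(out | in = i)."""
    return [normalize([rng.random() + 1e-9 for _ in range(n_out)])
            for _ in range(n_in)]


def mutual_information(p_x, p_y_given_x):
    """Mutual information between X and Y in bits, from P(x) and P(y|x)."""
    n_y = len(p_y_given_x[0])
    # Marginal P(y) = sum_x P(x) P(y|x).
    p_y = [sum(p_x[i] * p_y_given_x[i][j] for i in range(len(p_x)))
           for j in range(n_y)]
    total = 0.0
    for i, px in enumerate(p_x):
        for j in range(n_y):
            joint = px * p_y_given_x[i][j]
            if joint > 0:  # terms with zero probability contribute nothing
                total += joint * math.log2(joint / (px * p_y[j]))
    return total


def compose(p_d_given_w, p_r_given_d):
    """P(r|w) = sum_d P(d|w) P(r|d); valid because r depends on w only via d."""
    n_r = len(p_r_given_d[0])
    return [[sum(row[d] * p_r_given_d[d][r] for d in range(len(row)))
             for r in range(n_r)]
            for row in p_d_given_w]


def check_once(rng, n_w=3, n_d=4, n_r=3):
    """Draw a random Markov chain w -> d -> r; return (H(W;D), H(W;R))."""
    p_w = normalize([rng.random() + 1e-9 for _ in range(n_w)])
    p_d_given_w = random_channel(n_w, n_d, rng)
    p_r_given_d = random_channel(n_d, n_r, rng)
    i_wd = mutual_information(p_w, p_d_given_w)
    i_wr = mutual_information(p_w, compose(p_d_given_w, p_r_given_d))
    return i_wd, i_wr


rng = random.Random(0)
results = [check_once(rng) for _ in range(1000)]
# Allow a tiny tolerance for floating-point rounding.
assert all(i_wr <= i_wd + 1e-9 for i_wd, i_wr in results)
```

If the processing step P(r|d) is replaced by an invertible relabelling of d, the two quantities should coincide, which suggests where equality holds in the inequality to be proved.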