Statement

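Let $X$, $Y$, and $Z$ be random variables that form a Markov chain, denoted as $X \to Y \to Z$, meaning that $Z$ is conditionally independent of $X$ given $Y$. The Data Processing Inequality states that:

$$I(X;Y) \geq I(X;Z)$$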

Proof

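By exploiting the chain rule for mutual information, the joint mutual information $I(X;Y,Z)$ can be rewritten in two distinct ways, yielding the following system:

$$\begin{cases} I(X;Y,Z) = I(X;Z) + I(X;Y \mid Z) \\ I(X;Y,Z) = I(X;Y) + I(X;Z \mid Y) \end{cases}$$

By equating the right-hand sides of both expansions:

$$I(X;Z) + I(X;Y \mid Z) = I(X;Y) + I(X;Z \mid Y)$$

Since $X$ and $Z$ are conditionally independent given $Y$ (the Markov property):

$$I(X;Z \mid Y) = 0$$

it follows that:

$$I(X;Y) = I(X;Z) + I(X;Y \mid Z)$$

Therefore, because conditional mutual information is non-negative, $I(X;Y \mid Z) \geq 0$:

$$I(X;Y) \geq I(X;Z)$$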

Interpretation

Important

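By processing the output $Y$ to obtain $Z$, the information about $X$ can only be reduced or, at best, preserved. No processing of $Y$, deterministic or random, can increase it.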
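
As a numerical sanity check, the inequality can be verified on a small example. The sketch below is an illustration, not part of the original derivation: it assumes a uniform binary $X$ and two cascaded binary symmetric channels with flip probabilities 0.1 and 0.2 (both values chosen arbitrarily), and the helper names `mutual_information`, `bsc`, and `marginal` are hypothetical. It builds the joint distribution $p(x)\,p(y \mid x)\,p(z \mid y)$ and computes each mutual information directly from its definition.

```python
from itertools import product
from math import log2


def mutual_information(joint):
    """Return I(A;B) in bits from a joint distribution {(a, b): probability}."""
    p_a, p_b = {}, {}
    for (a, b), p in joint.items():
        p_a[a] = p_a.get(a, 0.0) + p
        p_b[b] = p_b.get(b, 0.0) + p
    return sum(
        p * log2(p / (p_a[a] * p_b[b]))
        for (a, b), p in joint.items()
        if p > 0
    )


def bsc(flip):
    """Transition table P(out | in) of a binary symmetric channel."""
    return {
        (i, o): (flip if i != o else 1.0 - flip)
        for i, o in product((0, 1), repeat=2)
    }


def marginal(joint3, keep):
    """Marginalise a joint over (x, y, z), keeping the given index positions."""
    out = {}
    for key, p in joint3.items():
        reduced = tuple(key[i] for i in keep)
        out[reduced] = out.get(reduced, 0.0) + p
    return out


# Assumed example: uniform X, then X -> Y and Y -> Z through noisy channels.
p_x = {0: 0.5, 1: 0.5}
p_y_given_x = bsc(0.1)
p_z_given_y = bsc(0.2)

# The Markov chain X -> Y -> Z means p(x, y, z) = p(x) p(y|x) p(z|y).
p_xyz = {
    (x, y, z): p_x[x] * p_y_given_x[(x, y)] * p_z_given_y[(y, z)]
    for x, y, z in product((0, 1), repeat=3)
}

i_xy = mutual_information(marginal(p_xyz, (0, 1)))
i_xz = mutual_information(marginal(p_xyz, (0, 2)))

# I(X; Y,Z): treat the pair (Y, Z) as a single random variable.
i_x_yz = mutual_information({(x, (y, z)): p for (x, y, z), p in p_xyz.items()})

# Chain rule: I(X;Z|Y) = I(X;Y,Z) - I(X;Y) and I(X;Y|Z) = I(X;Y,Z) - I(X;Z).
i_xz_given_y = i_x_yz - i_xy
i_xy_given_z = i_x_yz - i_xz

print(f"I(X;Y)   = {i_xy:.4f} bits")
print(f"I(X;Z)   = {i_xz:.4f} bits")
print(f"I(X;Z|Y) = {i_xz_given_y:.2e} bits")
print(f"I(X;Y|Z) = {i_xy_given_z:.4f} bits")
```

In this setup $I(X;Z \mid Y)$ should vanish up to floating-point error, and the gap $I(X;Y) - I(X;Z)$ should equal $I(X;Y \mid Z)$, which is exactly the quantity the proof shows to be non-negative.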