Introduction to Information Theory
Lead instructor: Seth Lloyd
6.1 Mutual Information » Quiz Solutions
Question 1:
Let x_{h} denote the outcome that the first toss was heads, x_{t} the outcome that the first toss was tails, y_{h} the outcome that the second toss was heads, and y_{t} the outcome that the second toss was tails.
Then, there are four possible joint outcomes of tossing a coin twice in a row: (x_{h}, y_{h}), (x_{h}, y_{t}), (x_{t}, y_{h}), and (x_{t}, y_{t}). The joint information can then be written,
I(XY)= -P(x_{h},y_{h}) log_{2} P(x_{h},y_{h}) -P(x_{h},y_{t}) log_{2} P(x_{h},y_{t}) -P(x_{t},y_{h}) log_{2} P(x_{t},y_{h}) -P(x_{t},y_{t}) log_{2} P(x_{t},y_{t})
Let us assume that getting tails (or heads) on the first toss does not change the probability of getting tails or heads on the second toss. Then we can write P(x_{h},y_{h}) = P(x_{h})*P(y_{h}), where P(x_{h}) is the probability of landing heads on a single toss. Since the coin is assumed to be fair, P(x_{h}) = P(x_{t}) = 0.5, meaning that P(x_{h},y_{h}) = 0.5*0.5 = 0.25.
The same argument applies to all four possible outcomes, not just the (x_{h},y_{h}) outcome. Thus, all four outcomes have probability 0.25.
Plugging this into the above equation for I(XY) gives
I(XY) = -4*0.25*log_{2} 0.25 = 2 bits
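As a quick numerical check, the joint information of two independent fair coin tosses can be computed directly from the four joint probabilities (a sketch in Python; not part of the course materials):

```python
import math

# Joint probabilities of two independent fair coin tosses:
# each of the four outcomes (hh, ht, th, tt) has probability 0.25.
joint = [0.25, 0.25, 0.25, 0.25]

# Joint information I(XY) = -sum_i p_i * log2(p_i)
I_XY = -sum(p * math.log2(p) for p in joint)
print(I_XY)  # 2.0 bits
```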
Question 2:
The mutual information is given by the sum of the marginal information of X and the marginal information of Y, minus their joint information:
I(X:Y) = I(X) + I(Y) - I(XY)
We know that the amount of information in a single toss of a fair coin is 1 bit, so I(X) (the information in the first toss) is 1 bit, and I(Y) (the information in the second toss) is 1 bit.
Given the answer to the previous question, we also know that the joint information I(XY) = 2 bits.
Combining gives
I(X:Y) = 1 bit + 1 bit - 2 bits = 0 bits
Thus, the mutual information is 0 bits. This makes sense: the two tosses are independent, so the outcome of one toss tells us nothing about the other.
Question 3:
Remember the rule for finding marginal probabilities given joint probabilities, as stated in this part of the video:
P(x_{rain}) = P(x_{rain},y_{no-sun}) + P(x_{rain},y_{sun})
Plugging in the probabilities specified in the question, we get
P(x_{rain}) = 0.3 + 0.1 = 0.4
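The marginalization step amounts to summing the joint probabilities over every outcome of the other variable. A minimal sketch, using the probabilities given in the question:

```python
# Joint probabilities from the question:
# P(rain, no-sun) = 0.3, P(rain, sun) = 0.1
p_rain_nosun = 0.3
p_rain_sun = 0.1

# Marginalize over Y: sum the joint probabilities over all outcomes of Y.
p_rain = p_rain_nosun + p_rain_sun
print(p_rain)  # 0.4
```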
Question 4:
Recall again the formula for mutual information,
I(X:Y) = I(X) + I(Y) - I(XY)
We first compute the marginal information terms, I(X) and I(Y), and the joint information term, I(XY).
We use the marginalization rule (see this part of the video) to calculate the marginal probabilities of X:
P(x_{rain}) = P(x_{rain}, y_{no-sun}) + P(x_{rain}, y_{sun}) = 0.3 + 0.1 = 0.4
P(x_{no-rain}) = P(x_{no-rain}, y_{no-sun}) + P(x_{no-rain}, y_{sun}) = 0.2 + 0.4 = 0.6
We can do the same to calculate the marginal probabilities of Y:
P(y_{sun}) = P(x_{rain}, y_{sun}) + P(x_{no-rain}, y_{sun}) = 0.1 + 0.4 = 0.5
P(y_{no-sun}) = P(x_{rain}, y_{no-sun}) + P(x_{no-rain}, y_{no-sun}) = 0.3 + 0.2 = 0.5
Recalling the Fundamental Formula for the information in a random variable with two outcomes, we compute
I(X) = -0.4 log_{2} 0.4 - 0.6 log_{2} 0.6 ≈ 0.97 bits
and
I(Y) = -0.5 log_{2} 0.5 - 0.5 log_{2} 0.5 = 1 bit
Then, we compute the joint information by considering all four possible joint outcomes of X and Y (see also the answer to the previous question):
I(XY) = - P(x_{rain}, y_{sun}) log_{2} P(x_{rain}, y_{sun}) - P(x_{no-rain}, y_{sun}) log_{2} P(x_{no-rain}, y_{sun})
- P(x_{rain}, y_{no-sun}) log_{2} P(x_{rain}, y_{no-sun}) - P(x_{no-rain}, y_{no-sun}) log_{2} P(x_{no-rain}, y_{no-sun})
= - 0.1 log_{2} 0.1 - 0.4 log_{2} 0.4 - 0.3 log_{2} 0.3 - 0.2 log_{2} 0.2 ≈ 1.85 bits
Finally, we combine and compute the mutual information:
I(X:Y) = I(X) + I(Y) - I(XY) ≈ 0.97 + 1 - 1.85 ≈ 0.12 bits
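The full Question 4 calculation can be reproduced numerically at full precision, which avoids any rounding in the intermediate steps (a Python sketch; the `info` helper and dictionary layout are illustrative, not from the course materials):

```python
import math

def info(probs):
    """Shannon information -sum p*log2(p), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Joint distribution P(x, y) from the question.
joint = {
    ("rain", "sun"): 0.1,
    ("rain", "no-sun"): 0.3,
    ("no-rain", "sun"): 0.4,
    ("no-rain", "no-sun"): 0.2,
}

# Marginals: sum the joint probabilities over the other variable.
p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

I_X = info(p_x.values())     # ≈ 0.971 bits
I_Y = info(p_y.values())     # 1.0 bit
I_XY = info(joint.values())  # ≈ 1.846 bits

mutual = I_X + I_Y - I_XY
print(round(mutual, 3))      # ≈ 0.125 bits
```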