So the one last thing we have to do is show that our general formula agrees with our special-case formula, where the special case was when all the messages had equal probability. Let me give the general formula again. If x is a message source, then the information content of that message source is H(x) = - sum from i = 1 to m of p-sub-i times log base 2 of p-sub-i; that is, we sum up, over all the messages, the probability of message i times the log base 2 of the probability of message i, and take the negative. That was our original, general formula.

Now, what if all the messages have equal probability? Then each probability is just 1/m: if there are two messages, each one has probability 1/2; if there are three, each has probability 1/3; and so on. In that case, H(x) = - sum from i = 1 to m of (1/m) times log base 2 of (1/m). Since every one of the m terms is identical, summing them m times just multiplies a single term by m, which cancels the 1/m. So H(x) = - log base 2 of (1/m), and by our log rules - log base 2 of (1/m) is just log base 2 of m, which is the same thing we had in our special case, when all the probabilities were equal.

So, stepping back, the reason Shannon wanted to do all this measuring of information content was for the purpose of coding and, in particular, the optimal coding of signals that go over telephone wires. He showed that the information content, as defined earlier, gives the average number of bits it takes to encode a message from a given message source (say, a signal over a telephone line) under an optimal coding, and Shannon's original paper, which was later turned into a book, proves that rigorously. The idea is that you have a particular set of messages with particular probabilities that you've measured, and you can find the optimal number of bits it would take, on average, to encode each message. This basically tells you how compressible the text is: the higher the information content, the less compressible. If you're interested in going further with this, you can google the notion of Huffman coding, which shows how to do optimal compression. I'm not going to talk more about this during this course, but it's a really interesting and important area that has affected our ability, in general, to use technologies such as telephone communication, internet communication, and so on, so it's extremely important.

Finally, I want to say a few words about the notion of meaning with respect to Shannon information content. You've probably noticed that Shannon information content, while it's about the probabilities and numbers of messages produced by a message source, has nothing to say about the meaning of the messages, the meaning of the information, or what function it might have for the sender or the receiver. Really, the meaning of information comes from information processing, that is, from what the sender or receiver does upon sending or receiving a message. We're going to talk about this in detail in unit 7, when we talk about models of self-organization and how self-organizing systems process information in order to extract meaning.
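To make the formula and the equal-probability special case concrete, here is a minimal Python sketch (not part of the lecture itself); the function name shannon_entropy and the example probabilities are my own choices for illustration. It computes H from a list of probabilities and checks that, when all m messages are equally likely, H comes out to log base 2 of m.

```python
import math

def shannon_entropy(probabilities):
    """Information content H = -sum_i p_i * log2(p_i), in bits per message."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Special case: m equally likely messages, each with probability 1/m.
for m in (2, 3, 8, 26):
    uniform = [1.0 / m] * m
    print(m, shannon_entropy(uniform), math.log2(m))  # the two values agree: H = log2(m)

# General case with unequal probabilities: H is lower than log2(m).
print(shannon_entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits, versus log2(4) = 2
```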
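And for the coding side, here is a small sketch of Huffman coding, the optimal-compression method mentioned above; it is only an illustration under my own naming, not material from the lecture. The hypothetical function huffman_code_lengths builds the code length assigned to each message using Python's heapq, and the example then compares the average number of bits per message to the information content H.

```python
import heapq
import math

def huffman_code_lengths(probabilities):
    """Return the number of bits Huffman's algorithm assigns to each message.

    probabilities: dict mapping message -> probability.
    """
    # Each heap entry: (total probability, tie-breaker, {message: code length so far}).
    heap = [(p, i, {msg: 0}) for i, (msg, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)   # two least probable groups
        p2, _, group2 = heapq.heappop(heap)
        # Merging two groups adds one bit to every code length inside them.
        merged = {msg: length + 1 for msg, length in {**group1, **group2}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
lengths = huffman_code_lengths(probs)
avg_bits = sum(probs[m] * lengths[m] for m in probs)
entropy = -sum(p * math.log2(p) for p in probs.values())
print(lengths)             # e.g. {'A': 1, 'B': 2, 'C': 3, 'D': 3}
print(avg_bits, entropy)   # 1.75 and 1.75: the optimal code reaches H here
```

In this example the probabilities are all powers of 1/2, so the Huffman code's average length exactly equals H; for other probability distributions the average length is slightly above H, but H remains the lower bound on bits per message, which is the compressibility claim made in the lecture.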