1
00:00:00,199 --> 00:00:05,565
So one last thing we have to do is show that our general case formula
2
00:00:05,565 --> 00:00:09,335
agrees with our special case formula, where our special case formula
3
00:00:09,335 --> 00:00:13,309
was when all the messages had equal probability.
4
00:00:13,309 --> 00:00:16,467
So let me give our general case formula, again.
5
00:00:16,467 --> 00:00:25,645
If x is a message source, then the information content of that message
6
00:00:25,645 --> 00:00:31,308
source is equal to the negative of the sum from i = 1 to m---that is, we're going to sum
7
00:00:31,308 --> 00:00:37,966
up all these terms of the probability of message i times the log base 2
8
00:00:37,966 --> 00:00:43,318
of the probability of message i. That was our original, general formula.
9
00:00:43,318 --> 00:00:50,796
Ok, now, what if we have all messages have equal probability? That means
10
00:00:50,796 --> 00:00:57,561
that the probabilities are just equal to 1/m! So if there are two messages,
11
00:00:57,561 --> 00:01:03,654
each one has probability 1/2, if there are three, each has probability 1/3,
12
00:01:03,654 --> 00:01:17,594
and so on. So, in that case, H(x) = - sum from i = 1 to m of, well, p-sub-i
13
00:01:17,594 --> 00:01:29,237
is 1/m, times log base 2 of 1/m, and if we sum this up m times, it's just
14
00:01:29,237 --> 00:01:32,880
going to be this same term added m times---that's going to cancel the
15
00:01:32,880 --> 00:01:41,918
1/m term---so that's going to be equal to - log base 2 of 1/m---well, - log
16
00:01:41,918 --> 00:01:50,086
base 2 of 1/m is just equal to log base 2 of m, using our log rules, and that's
17
00:01:50,086 --> 00:01:53,776
the same as what we had in our special case, when all the probabilities were
18
00:01:53,776 --> 00:01:55,107
equal.
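The agreement between the two formulas can be checked numerically; here is a minimal Python sketch (the function name `entropy` is my own, not from the lecture):

```python
import math

def entropy(probs):
    """Shannon information content: H(x) = -(sum of p_i * log2(p_i))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Special case: m messages, all with equal probability 1/m.
m = 8
uniform = [1 / m] * m
print(entropy(uniform))   # -> 3.0
print(math.log2(m))       # -> 3.0, agreeing with the special-case formula log2(m)
```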
19
00:01:55,107 --> 00:02:00,523
So, stepping back, the reason Shannon wanted to do all this measuring
20
00:02:00,523 --> 00:02:06,053
of information content was for the purpose of coding and, in general,
21
00:02:06,053 --> 00:02:10,837
for optimal coding of signals that go over telephone wires. He showed
22
00:02:10,837 --> 00:02:16,837
that the information content, as defined earlier, gives the average number
23
00:02:16,837 --> 00:02:21,753
of bits that it takes to encode a message, say, a signal over a telephone
24
00:02:21,753 --> 00:02:27,654
line, from a given message source, given an optimal coding---and Shannon's
25
00:02:27,654 --> 00:02:34,479
original paper, which was turned into a book, proves that result rigorously.
26
00:02:34,479 --> 00:02:40,805
So the idea is that you have a particular set of messages, that have particular
27
00:02:40,805 --> 00:02:46,545
probabilities that you've measured. You can find the optimal number
28
00:02:46,545 --> 00:02:54,104
of bits that it would take to encode each message on average, and this
29
00:02:54,104 --> 00:02:58,538
basically gives the amount of compressibility of the text---the higher the
30
00:02:58,538 --> 00:03:02,807
information content, the less compressible. So if you're interested in
31
00:03:02,807 --> 00:03:08,192
going further with this, you can google the notion of Huffman coding, which
32
00:03:08,192 --> 00:03:12,263
shows how to do optimal compression. I'm not going to talk more about
33
00:03:12,263 --> 00:03:15,924
this during this course, but it's a really interesting and important area that's
34
00:03:15,924 --> 00:03:23,145
affected our ability, in general, to use technologies such as telephone communication,
35
00:03:23,145 --> 00:03:28,254
internet communication, and so on, so it's extremely important.
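To illustrate the Huffman coding mentioned above, here is a sketch in Python of the standard greedy construction (this is my own illustration, not code from the course; `huffman_code` is a name I chose). It builds a prefix-free code by repeatedly merging the two least-frequent subtrees, and the resulting average code length comes out at or just above the entropy of the source, consistent with Shannon's result:

```python
import heapq
from collections import Counter

def huffman_code(freqs):
    """Build a Huffman code (symbol -> bitstring) from symbol frequencies."""
    # Each heap entry: (weight, tiebreak, {symbol: code_so_far}).
    # The tiebreak integer keeps tuple comparison away from the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Prefix '0' onto one subtree's codes and '1' onto the other's.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

text = "this is an example of huffman coding"
freqs = Counter(text)
code = huffman_code(freqs)
total = sum(freqs.values())
avg_bits = sum(freqs[s] * len(code[s]) for s in freqs) / total
print(avg_bits)  # at or just above the entropy of this source, never below it
```

Frequent symbols (like the space) get short codes and rare ones get long codes, which is exactly how the average number of bits per message is pushed down toward the entropy.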
36
00:03:28,254 --> 00:03:33,073
Finally, I want to say a few words about the notion of meaning with
37
00:03:33,073 --> 00:03:35,830
respect to Shannon information content.
38
00:03:35,830 --> 00:03:40,428
You probably noticed that Shannon information content, while it's about
39
00:03:40,428 --> 00:03:45,371
probabilities and numbers of messages produced by a message source,
40
00:03:45,371 --> 00:03:49,284
has nothing to say about the meaning of the messages, the meaning
41
00:03:49,284 --> 00:03:54,217
of the information, what function it might have for the sender or the
42
00:03:54,217 --> 00:03:59,199
receiver. Really, the meaning of information comes from information processing,
43
00:03:59,199 --> 00:04:04,300
that is, what the sender or receiver *does* upon sending or receiving
44
00:04:04,300 --> 00:04:09,649
a message. We're going to talk about this in detail in unit 7, when we talk
45
00:04:09,649 --> 00:04:13,389
about models of self-organization, and how self-organizing systems
46
00:04:13,389 --> 00:04:18,672
process information, in order to extract meaning.