1
00:00:09,660 --> 00:00:12,480
So, in the previous section of this module
2
00:00:12,700 --> 00:00:15,434
we delved a little bit deeper into the mathematics
3
00:00:15,434 --> 00:00:18,240
of how transition matrices like these
4
00:00:18,240 --> 00:00:20,620
renormalize when you coarse-grain the data,
5
00:00:20,620 --> 00:00:23,450
and what the accompanying model does.
6
00:00:23,450 --> 00:00:27,430
And what we found is a kind of phase diagram here
7
00:00:27,635 --> 00:00:29,495
for the transition matrices
8
00:00:29,495 --> 00:00:32,430
and in fact what I've done is produced a general phase diagram
9
00:00:32,430 --> 00:00:35,974
that's good for any two-state Markov chain.
10
00:00:35,974 --> 00:00:38,584
And what we find is that in general,
11
00:00:38,710 --> 00:00:43,960
modulo some slightly tricky
things to do with the case
12
00:00:43,960 --> 00:00:47,280
where the first eigenvalue is imaginary,
13
00:00:47,280 --> 00:00:49,579
but modulo these kind of tricky cases,
14
00:00:49,840 --> 00:00:52,800
if you begin with a model anywhere in this space
15
00:00:52,804 --> 00:00:58,104
you eventually flow to some point on
this diagonal line here
16
00:00:58,654 --> 00:01:02,399
and on that diagonal line,
17
00:01:02,399 --> 00:01:05,630
no matter where you begin,
no matter what your initial distribution is,
18
00:01:05,630 --> 00:01:07,850
one update will take you directly
19
00:01:07,850 --> 00:01:12,739
to the stationary distribution which is
now defined not by two parameters,
20
00:01:12,739 --> 00:01:14,530
but in fact by only one,
21
00:01:14,619 --> 00:01:18,634
you go from a two-dimensional manifold
down to a one-dimensional manifold
22
00:01:18,634 --> 00:01:19,744
and stay there.
23
00:01:19,849 --> 00:01:23,489
You don't flow once you hit this line.
You don't kind of move back or forth.
24
00:01:23,490 --> 00:01:28,610
It's sort of a line of fixed points
for this renormalization transformation.
25
00:01:28,610 --> 00:01:32,415
So, in this final section,
which is somewhat optional,
26
00:01:32,415 --> 00:01:35,045
I'd like to ask a kind of fun question.
27
00:01:35,045 --> 00:01:37,170
So, we begin at this point here
28
00:01:37,170 --> 00:01:41,274
and let's take this matrix here where
the epsilons are equal to each other,
29
00:01:41,274 --> 00:01:43,694
they lie somewhere along that line,
30
00:01:44,079 --> 00:01:48,239
where the self-loop probability for A
is equal to the self-loop probability for B.
31
00:01:49,270 --> 00:01:54,584
Where did that
matrix come from?
32
00:01:54,584 --> 00:02:00,384
Is it possible to find another matrix
that works at a finer time scale
33
00:02:00,384 --> 00:02:03,999
than you thought could have
been there in the first place?
34
00:02:03,999 --> 00:02:06,554
Another way to think of this is that
you've gathered some data
35
00:02:06,554 --> 00:02:09,014
in order to get this model here.
36
00:02:09,100 --> 00:02:13,160
But what if you were gathering
the data not rapidly enough
37
00:02:13,160 --> 00:02:16,580
to hit the actual sort of fundamental clock of the system.
38
00:02:16,580 --> 00:02:19,970
What if in fact your data was already
coarse-grained? You could imagine this:
39
00:02:19,970 --> 00:02:23,169
Imagine that you built this model
here off of some data,
40
00:02:23,169 --> 00:02:25,909
then your collaborator calls you up
and says: I've got great news,
41
00:02:25,909 --> 00:02:31,260
we have this new high resolution
time-sensing device
42
00:02:31,260 --> 00:02:35,590
that enables us to actually measure not
every 10 minutes, but in fact every minute.
43
00:02:36,010 --> 00:02:41,054
Well great, okay, what kind of model
44
00:02:41,054 --> 00:02:44,474
could T have been generated by?
45
00:02:46,900 --> 00:02:51,025
The mathematical way to answer this
question or to ask this question
46
00:02:51,025 --> 00:02:54,805
is to say can we find a transition matrix A
47
00:02:54,805 --> 00:02:58,740
such that when you square it you get the transition matrix T,
48
00:02:58,740 --> 00:03:01,434
and this is under the assumption that your collaborator has come up
49
00:03:01,434 --> 00:03:03,034
with a device that can now sample
50
00:03:03,610 --> 00:03:06,059
twice as fast as it originally could.
51
00:03:07,060 --> 00:03:12,919
So let's find it. Let's assume T
is given by this matrix form here
52
00:03:12,919 --> 00:03:16,269
and then all we have to do is figure out
53
00:03:17,610 --> 00:03:24,650
All we have to do now is figure out
what a and b are in this new transition matrix here.
54
00:03:24,650 --> 00:03:29,460
We have to square this matrix,
set it equal to T and solve for a and b
55
00:03:29,460 --> 00:03:33,140
and in fact I'll write this
in a slightly better fashion
56
00:03:33,140 --> 00:03:36,979
so that a and b become
the self-loop probabilities
57
00:03:40,009 --> 00:03:41,209
for the two states.
58
00:03:41,279 --> 00:03:48,439
So, our challenge becomes
find a and b.
59
00:03:49,550 --> 00:03:51,860
Okay, let's do it.
60
00:03:54,120 --> 00:03:59,369
Here is our problem. We're going to
square A or multiply A by itself.
61
00:03:59,369 --> 00:04:03,830
And we're going to set that equal
at the end to the transition matrix T.
62
00:04:03,970 --> 00:04:09,169
So if we do that multiplication
we find in the top corner here
63
00:04:09,169 --> 00:04:15,189
we have a² + (1-b)(1-a).
64
00:04:17,010 --> 00:04:20,045
And let's say down to the right
65
00:04:20,045 --> 00:04:26,075
we have "a" times "1 - b" plus "b" times "1 - b".
66
00:04:26,729 --> 00:04:34,569
Down here we have
a(1-a) + b(1-a)
67
00:04:34,709 --> 00:04:36,680
and in the far lower right-hand corner
68
00:04:36,840 --> 00:04:43,120
we have
(1-a)(1-b) + b².
69
00:04:43,650 --> 00:04:50,660
This we're going to set equal to our original
and much-beloved transition matrix T.
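[As a sanity check on this multiplication, here is a minimal Python sketch, with illustrative values a = 0.9, b = 0.8 that are not from the lecture: it squares the column-stochastic matrix A = [[a, 1-b], [1-a, b]] and compares each entry to the expressions above.]

```python
# Hedged sketch: a and b below are illustrative self-loop probabilities.
def square_2x2(M):
    """Multiply a 2x2 matrix (list of rows) by itself."""
    return [[sum(M[i][k] * M[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b = 0.9, 0.8
A = [[a, 1 - b],
     [1 - a, b]]       # columns sum to 1 (column-stochastic convention)
A2 = square_2x2(A)

# The four entries worked out in the lecture:
assert abs(A2[0][0] - (a*a + (1 - b)*(1 - a))) < 1e-12
assert abs(A2[0][1] - (a*(1 - b) + b*(1 - b))) < 1e-12
assert abs(A2[1][0] - (a*(1 - a) + b*(1 - a))) < 1e-12
assert abs(A2[1][1] - ((1 - a)*(1 - b) + b*b)) < 1e-12
```

[Since A is column-stochastic, each column of A² also sums to 1, which is a quick way to catch algebra slips.]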
70
00:04:52,229 --> 00:04:53,250
So
71
00:04:53,250 --> 00:04:55,949
We'll leave the full solution
to the homework,
72
00:04:55,949 --> 00:04:57,559
but I'll give you a little clue on how to do it
73
00:04:57,860 --> 00:05:01,700
which is that in fact a and b have to be equal.
74
00:05:01,875 --> 00:05:04,625
If a and b were not equal
we wouldn't be able to satisfy
75
00:05:04,625 --> 00:05:07,014
this set of equations here.
76
00:05:07,794 --> 00:05:10,910
And so once I tell you that a and b
are equal, then I can say
77
00:05:10,910 --> 00:05:14,610
Ok, let's try and solve
part of this problem
78
00:05:15,630 --> 00:05:20,404
by setting that term there equal to ϵ.
79
00:05:20,404 --> 00:05:25,854
So, what happens is we have a²
plus (1-a)²,
80
00:05:25,854 --> 00:05:28,098
because I've told you that a is equal to b,
81
00:05:28,099 --> 00:05:31,629
you can prove it yourself if you're going
to be distressed by what is about to happen.
82
00:05:32,690 --> 00:05:43,129
So I have a² + (1-a)² so that's
a² + 1 - 2a + a² = ϵ
83
00:05:43,129 --> 00:05:52,909
Using my amazing algebra skills we have
2a² - 2a + (1 - ϵ) = 0
84
00:05:54,129 --> 00:06:00,299
And now we use our quadratic equation
skills to solve for a.
85
00:06:00,299 --> 00:06:05,720
We have -b, that's 2, plus or minus
the square root of b², that's 4,
86
00:06:05,720 --> 00:06:11,790
minus 4 times a times c
87
00:06:11,890 --> 00:06:18,799
where a is 2 and c is 1 - ϵ, all over
2 times the squared-term coefficient,
88
00:06:18,799 --> 00:06:20,259
which is four.
89
00:06:20,259 --> 00:06:24,409
If I simplify that I get one half plus or minus
90
00:06:25,020 --> 00:06:27,449
And now I can pull out the factor of two here
91
00:06:27,999 --> 00:06:36,579
So we have a = ½ ± ½ √(1 - 2(1 - ϵ)).
92
00:06:37,089 --> 00:06:39,299
Okay, so far so good.
93
00:06:40,389 --> 00:06:47,728
What I'd like you to notice is what
happens when let's say epsilon is equal to ¼.
94
00:06:50,349 --> 00:06:53,338
In this case under the square root sign here
95
00:06:53,339 --> 00:07:00,179
we have the square root of one minus two
times three over four which is equal to
96
00:07:00,999 --> 00:07:06,929
one minus three over two which is
equal to minus one half
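[To see the complex roots appear concretely, here is a small sketch that evaluates both roots of 2a² - 2a + (1 - ϵ) = 0 using Python's cmath; ϵ = ¼ is the value from the example above.]

```python
import cmath

def roots(eps):
    """Roots a = 1/2 ± (1/2)·sqrt(1 - 2(1 - eps)) of 2a² - 2a + (1 - eps) = 0."""
    r = 0.5 * cmath.sqrt(1 - 2 * (1 - eps))  # discriminant is 2·eps - 1
    return 0.5 + r, 0.5 - r

plus, minus = roots(0.25)
assert plus.imag != 0            # eps < 1/2: no real solution exists
# Each complex root still satisfies a² + (1 - a)² = eps:
assert abs(plus**2 + (1 - plus)**2 - 0.25) < 1e-12
assert roots(0.75)[0].imag == 0  # eps > 1/2: real roots do exist
```

[The sign of the discriminant 2ϵ - 1 is exactly the ϵ < ½ boundary discussed next.]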
97
00:07:10,300 --> 00:07:16,259
in other words the only way to satisfy
the equation A² = T
98
00:07:17,139 --> 00:07:20,199
is if some of the components of this matrix here
99
00:07:20,299 --> 00:07:24,538
have imaginary or generally complex entries.
100
00:07:25,629 --> 00:07:32,679
But if A has complex entries then it
corresponds or rather cannot correspond to any
101
00:07:32,679 --> 00:07:37,399
natural stochastic matrix because we
interpret the entries of A
102
00:07:37,399 --> 00:07:41,689
in fact as transition probabilities, and now we're
being asked to imagine
103
00:07:41,689 --> 00:07:45,164
that some of those transition probabilities are equal to things
104
00:07:45,164 --> 00:07:51,084
like ½ ± ½ √(-½).
105
00:07:53,229 --> 00:07:58,279
So in fact what we found is that,
106
00:07:58,279 --> 00:08:02,769
when epsilon is, as you can work out,
107
00:08:02,769 --> 00:08:04,949
less than ½,
108
00:08:06,129 --> 00:08:13,588
that matrix T could not have been generated by a
coarse-graining of any finer scaled two-state system.
109
00:08:15,009 --> 00:08:17,789
So where did T come from?
110
00:08:17,910 --> 00:08:23,200
Well, one way to see where the problem lies is the following.
111
00:08:23,200 --> 00:08:34,130
If we imagine that the system
is a good counter,
112
00:08:35,990 --> 00:08:39,800
so that it oscillates ABABABAB
113
00:08:39,800 --> 00:08:46,959
and only very rarely slips and produces a BB
or AA or even more rarely a
114
00:08:47,209 --> 00:08:49,130
triple A or triple B
115
00:08:49,130 --> 00:08:51,669
if it's a good counter, meaning epsilon is small.
116
00:08:54,440 --> 00:08:58,190
What would this look like if we expanded it
at the finer timescale?
117
00:08:58,190 --> 00:09:07,280
In that case we'd have A ? B ? A ? B ? A
118
00:09:08,420 --> 00:09:12,150
and we'd have to fill in this gap here.
119
00:09:13,399 --> 00:09:15,874
But if we only had two states,
120
00:09:16,804 --> 00:09:19,114
if we only had the possibility of it filling in
with either A or B
121
00:09:22,280 --> 00:09:23,860
let's imagine for example
122
00:09:24,230 --> 00:09:27,020
we filled that question mark at that point with A.
123
00:09:27,020 --> 00:09:30,740
In that case at time t
124
00:09:31,470 --> 00:09:35,504
the system was in state A
and jumped directly to A
125
00:09:35,504 --> 00:09:39,194
and then at time t plus ½
126
00:09:39,194 --> 00:09:45,179
at the finer timescale, it jumped from state A
directly to state B.
127
00:09:46,190 --> 00:09:48,814
But if it's doing this deterministically,
128
00:09:49,514 --> 00:09:52,174
then how can it know that at this time step
129
00:09:52,209 --> 00:09:53,944
it's in A and should jump to A
130
00:09:53,944 --> 00:09:57,724
and at this other time step here
it's in A again,
131
00:09:57,724 --> 00:09:59,840
but this time it should jump to B?
132
00:10:00,170 --> 00:10:04,954
Markov chains are memoryless, the only thing that
determines what a Markov chain will do next
133
00:10:04,954 --> 00:10:08,254
is the state that it happens to be in.
134
00:10:09,680 --> 00:10:12,959
And again, you can imagine filling this state in here with B.
135
00:10:12,959 --> 00:10:17,634
And now you have the same problem,
because here the state goes from A
136
00:10:17,634 --> 00:10:19,664
jumps to B, very good.
137
00:10:19,664 --> 00:10:24,719
Okay, here the state goes to B,
from B it jumps to B, okay good.
138
00:10:25,149 --> 00:10:31,599
Now if the state deterministically jumps from B to B,
we know this one here has to be B.
139
00:10:32,499 --> 00:10:34,803
But then we have a problem because we have to explain
140
00:10:34,803 --> 00:10:38,513
how the system somehow knows that
this B is different from the last B?
141
00:10:38,513 --> 00:10:42,599
How the system knows that okay, two ticks have gone
by and now I have to jump to A?
142
00:10:44,509 --> 00:10:48,988
Something has gone wrong.
In order to get T
143
00:10:49,319 --> 00:10:53,399
you have to derive it from an A with more memory.
144
00:10:56,689 --> 00:10:59,108
The clock is ticking back and forth,
145
00:10:59,109 --> 00:11:04,069
but now it stays for half a time step in one
of the ticks and the system has to remember
146
00:11:04,069 --> 00:11:05,829
okay, now it's time to go
147
00:11:07,009 --> 00:11:11,039
If epsilon is too small this process here is too reliable
148
00:11:11,039 --> 00:11:17,879
and it can't do it just by sort of stochastically
deciding to jump from B to A instead of B to B.
149
00:11:19,579 --> 00:11:24,639
Can we find another solution,
can we find an A that will work for T?
150
00:11:24,639 --> 00:11:26,888
Well, we know the one thing we're
going to probably have to do
151
00:11:26,888 --> 00:11:28,968
is give A more states than T.
152
00:11:28,968 --> 00:11:32,388
And so here is a solution:
153
00:11:32,388 --> 00:11:36,428
I will show how a Markov chain can coarse-grain
154
00:11:36,428 --> 00:11:40,518
all the way down to the T matrix you originally saw.
155
00:11:40,518 --> 00:11:42,134
So it's going to be a little tricky here.
156
00:11:42,134 --> 00:11:45,694
I'm going to draw, instead of a two-state system
157
00:11:45,694 --> 00:11:48,349
I'm going to draw a four-state system,
158
00:11:48,349 --> 00:11:52,799
and now there are states A, B, C and D.
159
00:11:53,599 --> 00:11:55,569
The transitions will look like this:
160
00:11:55,569 --> 00:11:58,279
A goes to B with probability 1 minus epsilon,
161
00:11:58,279 --> 00:12:00,299
B always goes to C,
162
00:12:01,189 --> 00:12:04,018
C can occasionally slip back to B,
163
00:12:04,309 --> 00:12:06,989
C usually goes to D,
164
00:12:07,339 --> 00:12:10,109
D goes deterministically to A,
165
00:12:10,109 --> 00:12:14,539
and A goes back to D
with only some tiny probability.
166
00:12:14,539 --> 00:12:16,329
So, what do we have here?
167
00:12:16,329 --> 00:12:18,459
We have a hidden Markov model
168
00:12:18,459 --> 00:12:22,234
but now it's over 4 states and not just 2 states
169
00:12:22,234 --> 00:12:25,344
what I'm going to do is draw what happens
170
00:12:25,344 --> 00:12:28,808
if you take the matrix represented
171
00:12:29,629 --> 00:12:31,869
by this and square it
172
00:12:31,869 --> 00:12:34,579
and what we find is the following.
173
00:12:36,289 --> 00:12:38,379
The system actually splits up
174
00:12:40,789 --> 00:12:43,148
like so. In fact, if you're in state A
175
00:12:43,609 --> 00:12:48,349
after two time steps of this evolution
by this A matrix operator here,
176
00:12:48,540 --> 00:12:54,450
you jump to C with probability one minus epsilon
and you stay in A with probability epsilon.
177
00:12:55,620 --> 00:12:58,610
It's also true that if you happen to be in state D,
178
00:12:59,250 --> 00:13:03,800
you jump to B with probability one minus epsilon,
you stay in D with probability epsilon.
179
00:13:03,800 --> 00:13:10,810
This is now equivalent to the original operator,
the original Model T
180
00:13:10,810 --> 00:13:14,480
as long as we do the following coarse-graining of the states.
181
00:13:17,570 --> 00:13:25,700
If we say A and D are both equivalent to state A
in the observations
182
00:13:25,700 --> 00:13:30,590
and C and B are both equivalent to state B
183
00:13:30,900 --> 00:13:32,864
in the coarse-grained observations,
184
00:13:32,864 --> 00:13:38,164
all of a sudden A², this matrix here, reduces down to the original
185
00:13:40,384 --> 00:13:42,414
evolution matrix we had
186
00:13:43,274 --> 00:13:46,580
for the slippy AB counter.
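[The four-state construction can be checked numerically. A hedged sketch, with an illustrative slip probability ϵ = 0.1 and row-stochastic convention: build the fine-scale matrix, square it, then lump {A, D} and {B, C} into the two observed states.]

```python
# Hedged sketch of the four-state counter; eps = 0.1 is illustrative.
eps = 0.1
A_, B_, C_, D_ = 0, 1, 2, 3

# Row-stochastic fine-scale transition matrix: P[i][j] = P(i -> j).
P = [[0.0] * 4 for _ in range(4)]
P[A_][B_] = 1 - eps   # A goes to B with probability 1 - eps
P[A_][D_] = eps       # ... and back to D with only tiny probability
P[B_][C_] = 1.0       # B always goes to C
P[C_][B_] = eps       # C can occasionally slip back to B
P[C_][D_] = 1 - eps   # C usually goes to D
P[D_][A_] = 1.0       # D goes deterministically to A

# Two fine-scale time steps: P squared.
P2 = [[sum(P[i][k] * P[k][j] for k in range(4)) for j in range(4)]
      for i in range(4)]

# Coarse-grain: A and D are observed as "A", B and C as "B".
classes = [[A_, D_], [B_, C_]]
def lumped_row(i):
    return [sum(P2[i][j] for j in cls) for cls in classes]

# Rows within a class agree, so the lumping is well defined:
assert all(abs(x - y) < 1e-12 for x, y in zip(lumped_row(A_), lumped_row(D_)))
assert all(abs(x - y) < 1e-12 for x, y in zip(lumped_row(B_), lumped_row(C_)))

T = [lumped_row(A_), lumped_row(B_)]
# The slippy AB counter: [[eps, 1-eps], [1-eps, eps]].
expected = [[eps, 1 - eps], [1 - eps, eps]]
assert all(abs(x - y) < 1e-12
           for r, e in zip(T, expected) for x, y in zip(r, e))
```

[The two assertions on lumped rows are what make the coarse-graining legitimate: every fine state in a class distributes the same probability over the coarse classes.]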
187
00:13:47,970 --> 00:13:52,040
The big lesson we get out of this
is that sometimes in fact
188
00:13:52,350 --> 00:13:54,580
a matrix or a model in general
189
00:13:54,580 --> 00:13:59,310
could have been generated by something
outside of the original model space.
190
00:13:59,730 --> 00:14:03,410
When you coarse-grain the data and get a new model
191
00:14:03,410 --> 00:14:05,760
the model that you get may be a different,
192
00:14:05,760 --> 00:14:08,550
fundamentally different kind of model
than the one that you began with.
193
00:14:08,550 --> 00:14:11,435
In this case we began with a 4-state model
194
00:14:11,435 --> 00:14:14,385
and coarse-grained it down to a two-state model.
195
00:14:14,760 --> 00:14:16,460
Looking at it in reverse,
196
00:14:16,465 --> 00:14:19,695
you can say that the 2-state model, the transition T,
the slippy AB counter,
197
00:14:21,930 --> 00:14:26,549
if the counter is reliable enough or
epsilon is in fact less than a half
198
00:14:26,549 --> 00:14:29,389
then this construction here will work.
199
00:14:29,490 --> 00:14:32,030
But what you have to do is you have to assume
200
00:14:32,035 --> 00:14:36,355
that the finer-grained data was created
by a process with more memory,
201
00:14:36,900 --> 00:14:38,850
right, four states
202
00:14:38,850 --> 00:14:42,210
instead of just two and you can see
the deterministic transitions here
203
00:14:42,210 --> 00:14:48,140
allow you to count whether or not it's time
to jump out of the A or the C state yet.