In the previous module,
I gave you a cartoon example
of how a Markov chain coarse-grains
as you go to longer
and longer time blocks.
So, I gave you a Markov chain
that worked, let’s say,
at one-second resolution,
and then I showed you
how to build a Markov chain
that worked if your time
resolution was poor.
So, for example, let’s say
you could only observe the first second
of each two-second block.
That decimation coarse-graining
transforms the underlying model itself,
and so, as you saw,
when we took the fine-grained Markov chain
and coarse-grained it
on blocks of scale 2,
that Markov chain seemed to get
a little more complicated.
All I mean by that is that
it seemed to have more arrows;
we don’t really have a good way
to judge the complexity of a Markov chain.
But the model certainly changed,
and it didn’t necessarily
change for the better.
Even though we simplified
our observations,
the model itself seemed to change
in a way that didn’t have
any particular structure to it.
But if we coarse-grained
on sufficiently long timescales,
it appeared that the Markov chain
coarse-grained to a simpler case
where the outgoing probabilities
for each of the states were the same.
So, each state had the same probability
to go to state A, B, C, and D
as the other states did.
In this module, here, what I’m going to do
is show you how to do
a little bit of linear algebra
to actually make that concept rigorous,
to talk about how a Markov
chain will, eventually,
if you coarse-grain enough,
simplify into a smaller
subspace of all possible models.
This would be one of our first examples
of a renormalization group transformation
that takes you
to what we call a fixed point.
So, what I’m going to do
is give you that example
using a much simpler Markov chain
which I’ve drawn on the board here.
This Markov chain has two states, A and B.
What I’ve done,
instead of writing
exact numerical probabilities
for each of these transitions,
I’ve just parameterized the system
with one parameter ϵ.
So, in this case here,
what we have is a system
that can jump from A to B
with probability 1 - ϵ,
or it can stay in state A
with probability ϵ.
If we make ϵ small,
that means that the probability
to stay in state A is small,
so the system will tend
to oscillate back and forth –
it will go ABABAB.
As ϵ gets larger and larger,
it will have a certain probability
to stay in state A.
So, it might, in fact,
go AAABBBABAABB;
it’s a kind of slippy counter
as ϵ gets larger and larger.
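If you want to see that behavior for yourself, here is a minimal Python sketch of sampling this chain; the helper name sample_chain and the particular ϵ values are just illustrative choices, not anything from the board.

```python
import numpy as np

def sample_chain(eps, n_steps, seed=0):
    """Sample the two-state chain: stay put with probability eps, otherwise switch."""
    rng = np.random.default_rng(seed)
    state, path = 'A', []
    for _ in range(n_steps):
        path.append(state)
        if rng.random() >= eps:                   # with probability 1 - ϵ...
            state = 'B' if state == 'A' else 'A'  # ...hop to the other state
    return ''.join(path)

print(sample_chain(eps=0.05, n_steps=20))  # small ϵ: nearly perfect ABABAB...
print(sample_chain(eps=0.6,  n_steps=20))  # larger ϵ: runs like AAABBB appear
```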
So, a simple way
to write or describe this model
is to imagine that we describe
the state of the system
using a column vector.
In this case the column vector
will have two entries:
the first entry will be the probability
that you’re in state A;
the second entry will be the probability
that you’re in state B.
Now we can describe
the evolution of that system over time
using a two-by-two matrix
where the entries of the matrix
look like this.
Each term here
is a conditional probability
that tells you how you move
depending upon which state you’re in.
So, for example,
if the matrix is written like this,
the first term,
in the top-left corner of this matrix,
is the probability that,
given you’re in state A,
you transition to state A,
that is, you stay in state A.
Or here, for example,
in the top-right corner,
this is the probability that,
given you’re in state B,
you transition to state A.
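Written out in that conditional-probability notation, and assuming state B mirrors state A (which is how the chain is drawn, and which agrees with the two-step entry we compute below), the matrix is:

T = [ P(A|A)  P(A|B) ] = [   ϵ      1 - ϵ ]
    [ P(B|A)  P(B|B) ]   [ 1 - ϵ      ϵ   ]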
Notice that when you do
the matrix multiplication here,
and I’ll just do the top one
to give you a sense of how this works,
in case you haven’t seen it before.
This top entry here now says
that the probability
that you’re in state A
after one time step
is the probability
that you begin in state A
times the probability
that you stay in that state,
the probability of A given A,
plus the probability
that you begin in state B,
the probability of state B,
times the probability
of transitioning from B to A.
And then we’ll leave it as a homework example
for you to compute the second term here.
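In symbols, if we write p_A(t) and p_B(t) for the two entries of the column vector at time t (just shorthand names for this note), that top entry reads:

p_A(t+1) = ϵ · p_A(t) + (1 - ϵ) · p_B(t)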
So what we see is we can now
evolve the system through time
by multiplying by this matrix here,
which is sometimes called
a stochastic matrix.
One of the things
that you can see right away
that makes this matrix special
compared to all two-by-two matrices
is that all the columns here
have to sum to 1.
If you’re in state A,
one of two things
in this case must happen.
You either stay in the state
or you transition to another state.
There are no other possibilities,
and the probabilities therefore exhaust
the space and have to sum to 1.
Another feature, of course,
of a stochastic matrix
is that none of these entries can be negative.
So stochastic matrices are
an interesting sort of subspace
of all the possible two-by-two matrices
that you will encounter in your life.
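As a quick sanity check on those two properties, here is a minimal numpy sketch of this matrix; the value of ϵ is just an illustrative choice.

```python
import numpy as np

eps = 0.3  # illustrative value of ϵ; anything between 0 and 1 works

# Transition matrix: column 0 is "given you're in A", column 1 is "given you're in B"
T = np.array([[eps,     1 - eps],   # row 0: probability of landing in A
              [1 - eps, eps    ]])  # row 1: probability of landing in B

print(T.sum(axis=0))   # each column sums to 1: [1. 1.]
print(np.all(T >= 0))  # and no entry is negative: True
```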
Once you’ve represented the Markov chain
as a stochastic matrix,
and I’ll write the particular Markov chain
that we have here in this new form,
it becomes a simple matter
to talk about what happens
if you skip observations.
If the system jumps 2 time-steps
before you see it again,
all I do is take this evolution matrix T,
sometimes we call it an evolution operator
if we’re feeling a little punchy,
and multiply it by itself.
And now what we have is that
when this matrix
acts on a probability vector,
it takes it one step into the future.
You act with that matrix again
on the result of that,
and it takes you two steps into the future.
And so the Markov chain corresponding
to the observation series
that’s been coarse-grained
in blocks of two,
and this is decimation coarse-graining,
where we keep only the first observation
of each block,
we can now write very simply
as the product of this matrix with itself.
So I'll write that out for you,
and just show you how the first term goes.
The probability that, if you’re in state A,
you’ll be in state A two steps later,
is equal to this entry of the product here:
ϵ² + (1 - ϵ)².
And we’ll leave these other entries
as an exercise, as you please.
In words what this says
is the probability
that if you begin in state A,
you’ll be in state A two steps later,
is the sum of two possibilities:
the first is, of course,
that you just stay in state A,
that you stay in state A
on the first time-step
and on the second time-step.
So the probability
of that happening is ϵ²,
the probability that you stay in A twice;
the other possibility
is that you start in state A,
on the first time-step
you jump to state B,
and then you quickly jump back
to state A before anybody notices.
And that’s the second term here, (1 - ϵ)².
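If you would rather not multiply the matrices out by hand, a short symbolic sketch reproduces that entry; sympy here is just one convenient choice.

```python
import sympy as sp

eps = sp.symbols('epsilon')

# The evolution operator T, with ϵ left as a free symbol
T = sp.Matrix([[eps,     1 - eps],
               [1 - eps, eps    ]])

# Coarse-graining on blocks of two: the chain seen every other step is T times T
T2 = T * T
print(T2[0, 0])  # -> ϵ² + (1 - ϵ)², the entry worked out above
```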
So now we can see that in some sense
it’s a rather trivial problem
to see what happens to this matrix
as you continue to coarse-grain:
if you take blocks of 3, 4, or 5,
all you have to do
to see the resulting model
is raise T to the power n
that you’re interested in,
where n is now just the block size.
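And to watch the fixed point appear numerically, here is a minimal numpy sketch: raise T to larger and larger block sizes n, and every entry heads toward 1/2 for any ϵ strictly between 0 and 1. The particular ϵ below is, again, just an illustrative choice.

```python
import numpy as np

eps = 0.3  # illustrative value of ϵ
T = np.array([[eps,     1 - eps],
              [1 - eps, eps    ]])

# Coarse-graining on blocks of size n is just the n-th matrix power of T
for n in [1, 2, 4, 8, 16]:
    Tn = np.linalg.matrix_power(T, n)
    print(n, Tn.round(4))

# As n grows, every entry approaches 0.5: from either state, the coarse-grained
# chain is equally likely to land in A or B, which is the fixed point described above.
```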