So there are some subtleties, or tricky points, that we're going to have to think through regarding this idea of randomness as incompressibility. Here is the first one, which isn't too serious an issue: we need to choose some standard way of representing these compressions. Let's say the task we give ourselves is to compress a certain string, a sequence of symbols. We're going to write a computer program to do that, and we want to write the shortest computer program possible. I like programming in Python, so I might try to write a Python program; somebody else might write a program in C, somebody else might use C++, somebody might use Lisp, and who knows what else. We would all be working in different languages, and the algorithm would have different representations in these different languages. So we need to agree on some standard or reference device or language to use when we compare these algorithms. The standard thing that's done in the formal development of these ideas is to use something called a universal Turing machine. A universal Turing machine is a device that can perform any computation, any discrete-time, discrete-state computation, and is capable of emulating any other computer or device that can do such a computation. It's an all-encompassing computing device, a standard reference model of computation that's as powerful as anything else. In the formal development of these ideas, which is not part of this course, one works in a framework where compressions are compared in terms of a universal Turing machine. So that's one subtlety, and we've addressed it: we'll use universal Turing machines. The second subtlety is more subtle, and more serious.
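To make "compression as a short program" concrete, here's a minimal Python sketch of my own (an illustration, not part of the formal Turing-machine framework): a highly patterned string can be reproduced by a description far shorter than the string itself.

```python
# A 10,000-symbol string with an obvious pattern...
patterned = "01" * 5000

# ...can be reproduced by a description far shorter than the string:
description = '"01" * 5000'          # 11 characters
assert eval(description) == patterned

print("string length:", len(patterned))       # 10000
print("description length:", len(description))  # 11
```

A string with no pattern offers no such shortcut: the shortest program that prints it is essentially a print statement containing the string itself, roughly as long as the string. That is the sense in which "random" will mean "incompressible."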
And that is: how do we know if we've found the shortest algorithm that can reproduce a sequence? Maybe a sequence really is compressible, but I just haven't been able to figure out how. Maybe I'm not smart enough, or I haven't given myself enough time. That could have been the case for pi, which certainly looks random. And by the way, in some ways of viewing it, pi really is random: in its binary expansion the digits look the same as a sequence of fair coin flips, and all possible blocks of zeros and ones appear to occur equally often. (Strictly speaking, this property, called normality, is conjectured for pi and strongly supported numerically, though not proven.) So that's weird; in a certain sense, pi is random. But in the algorithmic sense, where we compare programs on a universal Turing machine, pi is not random, because there's a short algorithm that produces it. OK, so maybe every possible sequence actually has some short, almost-impossible-to-find algorithm that reproduces it, and we just haven't been smart enough to find it yet. It turns out we can eliminate that possibility. Let me present the argument as follows. We're wondering whether we can figure out ways to compress sequences that don't look compressible. One idea is: maybe we could come up with an algorithm that determines the shortest algorithm, a compression procedure that's optimal all the time. This would be an algorithm that writes an algorithm, a program that writes a program. Moreover, it would automatically write the best program, in the sense of the shortest one, the one that compresses our symbol sequence as much as possible. This would be an amazing thing, but one can show that such a program doesn't exist. You can't write a program that automatically writes the shortest program. So we can't automatically determine the shortest algorithm for a given sequence.
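To see that pi really is compressible in this algorithmic sense, here's a short Python program using Gibbons' well-known "unbounded spigot" algorithm (any short pi-digit generator would make the same point) that produces as many decimal digits of pi as you like:

```python
def pi_digits(count):
    """Yield the first `count` decimal digits of pi (3, 1, 4, 1, 5, ...)
    using Gibbons' unbounded spigot algorithm."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    produced = 0
    while produced < count:
        if 4 * q + r - t < n * t:
            yield n                  # next digit is confirmed; emit it
            produced += 1
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:                        # not enough information yet; refine state
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

print(list(pi_digits(15)))  # [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]
```

The program is a fixed, finite size, yet it generates unboundedly many digits. So the digit sequence of pi is compressible, and hence not algorithmically random.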
But maybe there is a short algorithm out there, even if we can't find it. It turns out we can eliminate that possibility as well, and here is how one would argue it. First, think about the set, or collection, of all possible algorithms: all possible schemes for compressing a sequence of zeros and ones. That's an infinite set: each program is finite, but programs can get arbitrarily long, so there are infinitely many of them. We might also think about the collection of all possible sequences we want to compress. This is an infinite set too: we're thinking about all possible infinite sequences of zeros and ones, and that's clearly infinite as well. So we've got two infinite sets: all the algorithms, and all the sequences. If these two sets were the same size, then it might be that every sequence has an algorithm that goes with it. Maybe we couldn't find which algorithm goes with which sequence, but we'd know the algorithm might be out there. However, it's not the case that there are as many algorithms as sequences. In fact, there are vastly more sequences than algorithms. Thus, there are certainly some sequences for which no algorithm can compress them, and we'll call those sequences random. So randomness exists, by this argument. Furthermore, there are so many more sequences than algorithms that if I choose an arbitrary sequence, if I just dip my hand into this infinite bag of sequences, I can be almost certain, in the sense of probability one, that the sequence will be random. It will be incompressible.
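A finite version of this counting argument can be sketched in Python. Treating "programs" as binary strings (a simplifying assumption), there are only 2^n - 1 programs strictly shorter than n bits, but 2^n strings of length n to compress:

```python
def programs_shorter_than(n):
    # Binary strings of length 0, 1, ..., n-1: 2^0 + 2^1 + ... + 2^(n-1)
    return sum(2 ** k for k in range(n))   # = 2^n - 1

for n in (8, 16, 32):
    print(f"{n:2d} bits: {programs_shorter_than(n)} short programs "
          f"vs {2 ** n} strings")
```

Even if every short program compressed a different string, at least one n-bit string is left over with no description shorter than itself, i.e., it is incompressible. In the infinite-sequence case the gap widens from "one short" to countable versus uncountable.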
The technical idea, which some of you may be familiar with, is that the set of all algorithms is a countable infinity, while the set of all infinite sequences of zeros and ones is uncountable. So they are very different infinite sets. Again, if I choose an arbitrary symbol sequence, I can be certain, with probability one, that the sequence is random. OK, now let's finally go back to the logistic equation, back to dynamical systems. If I choose an arbitrary initial condition for the logistic equation, it will produce a sequence of zeros and ones. And we saw earlier, or I argued earlier, that the logistic equation can produce every possible sequence of zeros and ones, and all subsequences occur with equal probability. So if I choose an arbitrary initial condition for the logistic equation with R = 4, I can be almost certain that the outcome will be random. In this way of thinking about randomness, the process producing the symbols is definitely deterministic, an unchanging deterministic rule, but the outcome is random: with probability one, there is no algorithm that can compress the sequence the logistic equation produced. So the logistic equation is producing randomness. Or is it? Wait a minute. The logistic equation is itself an algorithm that's producing this sequence. So how can I say the logistic equation produces a random outcome, meaning there's no algorithm that can compress the sequence, when iterating the logistic equation is itself just such an algorithm? This is a good objection, but we have to think carefully about the logistic equation and chaos, and in particular about sensitive dependence on initial conditions.
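The symbol sequence in question can be generated like this: iterate the logistic map with R = 4 and write down 0 when the state is below 1/2 and 1 otherwise. A minimal sketch (keeping in mind that floating-point iteration only approximates a true orbit):

```python
def logistic_symbols(x0, n):
    """Symbolic dynamics of the logistic map x -> 4 x (1 - x):
    emit 0 if x < 1/2, else 1, at each step."""
    x, symbols = x0, []
    for _ in range(n):
        symbols.append(0 if x < 0.5 else 1)
        x = 4.0 * x * (1.0 - x)      # one iteration of the map, R = 4
    return symbols

print(logistic_symbols(0.3, 20))
```

For a typical initial condition the printed zeros and ones look just like fair coin flips, which is exactly the point of the argument above.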
So suppose you hand me a random sequence with no obvious pattern to it, and my task is to figure out whether there is a way to use the logistic equation to generate that sequence of zeros and ones. At first I would say: sure, of course there has to be. The logistic equation produces all possible sequences of zeros and ones, so for this sequence you just gave me, no problem, I can figure it out. Or even if I can't figure it out, even if there are too many initial conditions to try, I know an answer is out there, and then there would be a short algorithm. So I would have compressed the sequence you thought was incompressible. But wait a minute. The key is that in order to produce that particular symbol sequence, I need to specify the initial condition exactly, with full precision. The mechanics of my algorithm are very simple: iterate the logistic equation with R = 4.0. That's a very simple program; many of you have written your own versions of it, and spreadsheets can do it too. But in order to produce that particular outcome, the one you gave me, so that I can say I've compressed it, I also need the initial condition. And almost all initial conditions are irrational numbers, numbers whose digits continue on forever. So for this algorithm to reproduce the symbol sequence, I need to specify the initial condition exactly. It turns out, by a similar argument, that almost all numbers between zero and one are themselves random: a random sequence of decimal digits following the decimal point. So I need to know that initial condition exactly, to an infinite number of decimal places, and the vast majority of initial conditions, all but a set of measure zero, that is, with probability one, are incompressible.
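Sensitive dependence is easy to see numerically. In this sketch, two initial conditions that agree to one part in 10^12 soon produce different symbol sequences, which is why any finite-precision specification of the initial condition eventually fails to reproduce the target sequence:

```python
def symbols(x0, n):
    # Same symbolic dynamics as before: 0 if x < 1/2, else 1.
    x, out = x0, []
    for _ in range(n):
        out.append(0 if x < 0.5 else 1)
        x = 4.0 * x * (1.0 - x)
    return out

a = symbols(0.3, 100)
b = symbols(0.3 + 1e-12, 100)        # perturbed in the 12th decimal place
first_diff = next(i for i in range(100) if a[i] != b[i])
print("sequences first disagree at step", first_diff)
```

The perturbation roughly doubles each iteration (the map's Lyapunov exponent at R = 4 is ln 2), so a disagreement shows up after a few dozen steps. Matching more symbols requires specifying more digits of the initial condition, and matching all of them requires all infinitely many digits.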
So, aha, I've found a short mechanism for generating the sequence, but I need to remember an incompressible infinite sequence of decimal digits in order to run that algorithm and reproduce exactly the binary sequence that's desired. So yes, the logistic equation really is producing a random output. To summarize: the logistic equation, a deterministic dynamical system, produces random output. The sequence of zeros and ones, the symbols from the symbolic dynamics of the logistic equation, is incompressible, and that's what it means to be random. You might think that the logistic equation itself is a compressed description of the sequence it just generated. However, in order to reproduce that sequence, we need to specify the initial condition to infinite precision, and that requires remembering an infinitely long number. It's a short program, but you need to give it an infinitely long, infinitely precise description of the initial condition. So the logistic equation actually is not a compressed version of the sequence it produces. I hope this all hasn't been too confusing. To be honest, I hope it's at least a little bit confusing, but in a good way, in that I think what these examples do is enrich and complexify our notion of randomness. A chaotic dynamical system, like the Lorenz equations or the logistic equation with R = 4, is deterministic, but it's unpredictable because of the butterfly effect. And it's deterministic, but it actually produces randomness in this algorithmic sense of randomness. So randomness doesn't have to be a product of chance; randomness can be a product of determinism. And that, I think, is one of the big lessons of the study of chaos and dynamical systems. I'll talk a little bit more about that in the summary for this unit, and these are themes we'll continue to revisit throughout the course.