In biology, there is nothing more fundamental
or basic than the cell.

A Turing machine is in a sense fundamental
in computer science, but it is also different

than the role of a cell in biology, it is
more like the meter in the international metric

system, the meter is a basic unit and it is
fundamental in the practice of science and

engineering but there is also an arbitrary
aspect to it, the length of a meter could

have been smaller or larger as long as we
agreed on using it as a standard.

So a Turing machine is in some way similar
to the meter so I am going to explain in what

way it is fundamental and in what way it is
not, and both results come actually directly

from the original work of Alan Turing who
gave the name to these machines.

The Turing machine idea came actually from
a fundamental question in science asked by

famous mathematician David Hilbert at the
Sorbonne, where I studied myself but more

than a hundred years before.

Hilbert was somehow certain that there should
be a way to automatise mathematics in such

a way that every possible formula could be
proven to be either a true or false in a theory.

In other words, he was proposing, although
not explicitly, to build some sort of machine

that could basically produce all past and
future theorems.

In other words, he asked What is the Power
of Mechanical Reasoning, either Human or Machine

like.

It took some time to get some answers but
eventually they came by way of mathematicians

Kurt Gödel and Alan Turing, among other contributors.

Gödel's answer was in the negative, proving
the there was no such a way to derive all

possible truths from a theory without getting
into contradictions.

Alan Turing gave the same answer in the negative
by showing that if one assumes that one can

construct a machine to prove all mathematical
theorems then one gets into a contradiction

as we will see later.

But in providing the negative answer to Hilbert’s
question, and unlike Gödel’s answer, in

Turing’s there was a surprising secondary
positive result that had the amazing result

to start a whole new field of science and
laid the foundations of what we know today

as computer science.

So the Turing machine model has been very
successful because it has a very simple mechanical

description.

One can think of a Turing machine of some
sort of actual device consisting a head reading

and rewriting the contents of a tape and performing
different operations that we call states.

The head can move one side and the other according
to some basic rules.

But to better understand and work with Turing
machines one has to deal with more formal

details.

So another way to describe a Turing machine
is as a collection of 5 distinguished elements,

what it is known as a 5-tuple consisting of
a finite set of states Q including 2 special

states that will denote an initial state or
configuration of the machine when it starts

and one when it finishes or halts.

A set Sigma of symbols that are either written
on the tape or the head can write on the tape.

And a transition function that we call delta
that indicates how the machine will behave

depending on the symbols that it is reading
from the tape and the current state of the

Turing machine.

So let’s see how it works.

Let’s denote by Sigma star the set of all
possible binary strings, you know strings

like the one on your screen, a sequence of
0s and 1s.

We will define a formal language L that uses
the symbols in an alphabet Sigma, in this

case, binary.

Then L is simply a subset of Sigma star, is
like English, on the one hand, you can have

all possible words using all the letters in
the English alphabet but not all of them are

English words, only a subset of them.

So let’s assume that the words that we want
to recognise are all those that start with

1 followed by a bunch of 0s and ending with
a 1.

So for example, 0111 is not a word in our
language L because it is not a word that starts

with 1 followed by at least a 0 as it is required.

So we want to build a Turing machine that
recognises only words in our artificial language

L.

For that we will use a notation that resembles
something like a flow diagram that takes care

of all cases in a very visual fashion.

We can immediately recognise a few elements
of a Turing machine, we said that there were

2 distinguished states among all, the initial
one, in this case, A and the final one, in

this case, the Accept state.

Here we also have another one that I call
Reject that denotes the state of the machine

when the input word is not a word in our language
L.

Then there has to be a 1 on the tape for us
to move to State B, if more 0s come in the

tape we remain in the same state but keep
moving to the right until there is a 1 and

if nothing else comes, i.e. an empty cell
on the tape follow, then we accept the word

as part of the language L and halt the machine.

In all other cases if anything else comes
different to a 0 or 1 then it goes to the

Reject state.

And this is how we have written and constructed
a Turing machine that recognises the language

L.

Notice that this machine represented by this
state diagram is completely deterministic,

there are no loose ends of something that
could come in the tape or other cases for

which there is ambiguity, every case is properly
covered and this is why this particular Turing

machine is said to be deterministic.

Now, it is 
also interesting to see that this state diagram

is very similar to the state diagram of other
type of machines, such as finite automata,

but the key component in this diagram is that
the rules have a head movement as it is reading

the contents of a tape, and this memory is
what makes the Turing machine model so powerful.

And indeed, it is the tape that makes all
the difference with other models of computing

less powerful than the Turing machine.

If you limit the tape, or limit the head movement,
or remove the tape altogether, it gives you

machines of different power that build a hierarchy
that is better known by the name Chomsky hierarchy,

yes, the same Noam Chomsky that is a political
analyst but also one of the most important

linguists.

So, for example, even though a state diagram
of a machine with no tape would look almost

the same and can recognise our previous languages
L, only Turing machines can do certain things,

such as recognising these languages that are
slightly more difficult than our languages.

So let’s do another example of a Turing
machine but using slightly different tools.

This time, the idea is to recognise all the
words that start with a number of 0s followed

by the same number of 1s and nothing else.

So this word on the tape with 4 zeros and
4 ones would be an example of a word in our

new language L.

Now the strategy to accept this kind of words
can be described in what it is called in modern

computer science as a pseudocode, that is
a code in mostly natural language that can

be translated to any other specific computer
language.

The first pseudocode was actually written
by Turing himself in his original Turing machine

paper in 1936.

So the intuitive idea to construct a machine
that accepts these words with the same number

of 0s and 1s is to replace the first zero
with a marker, then move to the next 1 and

mark it with another symbol as well, then
we repeat the process until no more 0s and

at the end if for every new symbol for 0 matches
another symbol for 1, then the word is accepted

and if not then rejected.

We can see this in a more visual way that
it is very useful to analyse the behaviour

of Turing machines.

It is called a space-time diagram.

So we start with the input word, in this case,
the word with 4 zeros followed by 4 ones.

Then by convention the machine always start
in State 1, or State A as in the previous

example.

Then as we said, we mark every 0 with an X
and every 1 with a Y and we repeat the process

going back and forth until no more 0s remain.

You can see how the history of the computation,
which is the content of the Turing machine

tape over time looks like.

At the end, the last row represents the output
of the machine and when they are all matched

we Accept the word.

And we can use colours to see it more clearly.

So this 
is how a Turing machine as a function can

be visualised for a specific input.

So let’s also denote the head location by
this arrow.

One can track down the behaviour of the Turing
machine over time for this rule.

You see, the rule tells that if there is a
blank and the machine is in state 1, then

it leaves the blank and remains in the same
state and moves to the left by using now -1

to denote the left.

So this machine enters into an infinite computation
with no end and keeps moving the head to the

left forever without changing any contents
of the original tape.

Now you help me to hack the code of this Turing
machine by looking at its behaviour if we

assume that the number of positions of the
head is the number of states.

We can see that there are 2 states then because
we have the arrow pointing up and down.

And when the head is in state 1 with the arrow
pointing up and there is nothing on the tape,

then it leaves the blank, moves to the right
and changes to state 2.

Then on state 2 and a blank on the tape, it
prints a blue square and moves to the right

and keeps doing the same because it remains
in state 2.

So we can now know what the mapping rule and
source code of the Turing machine is.

The previous Turing machines were quite simple
but not all them are the same.

This is an example of a much more sophisticated
even when it has only 3 symbols or colours

and 2 states.

On your right you have the behaviour of this
very simple Turing machine and on your left

you have what is called a compressed version
of the same space-time diagram only keeping

the rows where the head went further to the
right.

And it actually looks like another model of
computation that you may or may not know that

is called a cellular automaton but that is
a topic of another lecture.

I encourage you to try to write your own Turing
machine.

For example, you can try to build a copy-machine
that given any input, the machine reads the

input and writes it again after the input,
that is it repeats the input twice on the

tape so it makes a copy of the input after
halting.

The best way to understand a subject is trying
to make a program by yourself.