Today, we're going to talk about what these ideas about evolution look like as computations, and how we can use computation to understand and leverage evolutionary ideas. Let's first review the three basic elements of Darwinian evolution, which you've already learned about: random variation among individuals, selection of some of those individuals based on differential fitness, and the inheritance of those variations by individuals of the next generation. We also need to think a little about how those variations are represented, and that really gets us into the field of genetics, though we don't need to know very much genetics here. We just need to know that this information, these variations, is represented in discrete units, which today are called genes, and that the information in the genes is separate from how it's expressed in the organism; we refer to this as the genotype-versus-phenotype mapping. The final thing we need to know is that these genes are organized in a linear array, which today we call a chromosome. This understanding really started with Mendel in 1865, and its culmination was Crick and Watson's discovery of the structure of DNA. So we're going to take those very simple elements and understandings and translate them into a computer algorithm. The way we're going to do that is, instead of having chromosomes with genes on them, we're going to have strings of bits. A bit is a number that is either 0 or 1; it can have only two values. Those strings of bits will be the individuals in our population. So, imagine that we start out (this is the leftmost panel of the figure) with a population of randomly generated individuals, or strings. Here only three are shown, each with only five bits, but in real genetic algorithms we would have more bits and larger populations.
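As a minimal sketch of that first step (the population size and string length here are arbitrary choices matching the figure, not part of any particular implementation), generating a random initial population might look like this:

```python
import random

def random_individual(n_bits):
    # An individual is a list of n_bits random bits (each 0 or 1).
    return [random.randint(0, 1) for _ in range(n_bits)]

def initial_population(pop_size, n_bits):
    # Generate pop_size randomly initialized bit-strings.
    return [random_individual(n_bits) for _ in range(pop_size)]

# Three individuals of five bits each, as in the leftmost panel.
population = initial_population(3, 5)
```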
We also need a way to evaluate fitness, and we do that using a fitness function. In our example fitness function, we'll assume the values range from zero to one, with one, the higher value, being better. So, the first step is to generate the initial population. Then we evaluate each individual in the population using our fitness function, and use those fitness values to select the next generation. You see that in the center panel, where we have two copies of the highest-fitness individual, no copies of the lowest-fitness individual, and a single copy of the average individual. That's not very interesting, because those individuals look exactly like their parents. So we take advantage of what are known as genetic operators to introduce new variations. One of these is mutation, shown in the top individual, where the first bit, a one, is flipped to become a zero. We don't always do this in the first position; we do it randomly at different places in the string, and randomly throughout the strings of the population. We also use a process called crossover, or recombination, where we take two individuals and exchange pieces of their DNA, or here, pieces of their bit-strings; you see that in the second two individuals. So now, in the third panel, we have the new generation, generation T plus one, and we then repeat the cycle: evaluate those individuals using the fitness function, do new selection, and introduce new mutations and crossovers. That is how the evolutionary process runs. This basic strategy of using bit-strings to represent chromosomes was introduced by John Holland in the early 1960s. And John was very interested in the population-level effects, and in the impact of crossover. OK, but now, coming back to the algorithm: what does it look like when we actually run this algorithm for multiple generations?
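The cycle just described (evaluate, select, mutate, cross over) can be sketched as one generation step. This is only an illustrative skeleton under common default choices: fitness-proportionate selection, per-bit mutation, and single-point crossover; the rates and selection scheme are assumptions, not the only options.

```python
import random

def select(population, fitnesses):
    # Fitness-proportionate (roulette-wheel) selection: higher-fitness
    # individuals are more likely to be copied into the next generation.
    return random.choices(population, weights=fitnesses, k=len(population))

def mutate(individual, rate=0.1):
    # Flip each bit independently with probability `rate`.
    return [bit ^ 1 if random.random() < rate else bit for bit in individual]

def crossover(parent_a, parent_b):
    # Single-point crossover: exchange the tails of the two bit-strings.
    point = random.randrange(1, len(parent_a))
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

def next_generation(population, fitness_fn, mutation_rate=0.1):
    fitnesses = [fitness_fn(ind) for ind in population]
    selected = select(population, fitnesses)
    # Pair up the selected individuals, cross them over, then mutate.
    children = []
    for i in range(0, len(selected) - 1, 2):
        a, b = crossover(selected[i], selected[i + 1])
        children += [mutate(a, mutation_rate), mutate(b, mutation_rate)]
    if len(selected) % 2:  # odd population size: carry the last one through
        children.append(mutate(selected[-1], mutation_rate))
    return children
```

Repeatedly calling `next_generation` on its own output runs the evolutionary loop from the figure.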
So, I just showed you one generation in the previous slide, but we typically iterate these for hundreds, or sometimes even thousands, of generations. The way we look at it is by plotting time, in terms of number of generations, on the x-axis, and fitness, typically the average fitness of the population or the best fitness in the population (those are the two values shown here), on the y-axis. This is a very typical kind of performance curve for genetic algorithms: the fitness of the population starts out very low, then very quickly climbs and improves as the really lousy individuals get deleted and the somewhat better individuals get copied, so the whole average fitness moves up. Then there's a little bit of searching around, and we get another innovation, but eventually the population gets stuck on what's known as a plateau. This pattern of long plateaus separated by rapid jumps in fitness is known as punctuated equilibrium. When that happens, the algorithm effectively has to explore the space more widely to find a high-fitness innovation, and that can take a varying amount of time. We actually see two plateaus in the figure; on the first, a new innovation is eventually found and the population jumps up in fitness. This is very typical of these genetic algorithms. Let's now talk about some applications, how they're used. Genetic algorithms have been used widely in engineering applications, and they've also been used for scientific modeling. First, we'll talk about the engineering applications. Of these, by far the most common is what's known as multi-parameter function optimization. That's shown in the figure for a two-dimensional function, that is, a function of two variables. If the function is complex, a lot of times we don't have analytical methods to find mathematically what its maximum value is.
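The performance curve just described is produced by recording, at each generation, the two values plotted on the y-axis. A small self-contained sketch of that bookkeeping (the ones-fraction fitness function here is a hypothetical stand-in for illustration):

```python
def fitness_stats(population, fitness_fn):
    # One point on the performance curve: (best fitness, average fitness).
    fitnesses = [fitness_fn(ind) for ind in population]
    return max(fitnesses), sum(fitnesses) / len(fitnesses)

# Hypothetical fitness: fraction of 1-bits, so values range from 0 to 1.
ones_fraction = lambda ind: sum(ind) / len(ind)

best, avg = fitness_stats([[1, 0, 1, 0, 1],
                           [0, 0, 0, 0, 0],
                           [1, 1, 1, 1, 1]], ones_fraction)
```

Calling `fitness_stats` once per generation and plotting the collected pairs against the generation number gives exactly the curve on the slide.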
And when that happens, we have to resort to sampling, and we can think of the genetic algorithm as a kind of biased sampling algorithm. The goal, of course, would be to find points on this multi-peak surface that are on the highest peak, up at the top. Let's go into a little bit more detail about how that works. So, here is an example of such a function, and it's not trivial to analyze it mathematically. But suppose we want to find the x and y values that produce the maximum F(x,y). To do that, we take our bit-strings and conceptually think of the first half of the string as a representation of the value of x, and the second half as a representation of the value of y. To evaluate fitness, we then take these ones and zeros and interpret them as base-two numbers, translating them into their decimal equivalents, as shown in the figure, and then plug those decimal values into our fitness function. In this case, we get the value 4 out. We would have to do that for every individual in the population. I just want to say a word about where these algorithms came from. I've mentioned John Holland, but there were several other groups interested in similar kinds of algorithms, and these main groups, the first three listed here, were all more or less independent discoveries of similar and overlapping ideas. Then, in the early 1990s, John Koza came along and really blew the field open by showing how we could use genetic algorithms to evolve computer programs. These separate streams of invention are now lumped together, and the field is called evolutionary computation. And, needless to say, there's been a lot of recombination between these different origins.
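The decoding step just described, splitting the bit-string in half and reading each half as a base-two number, can be sketched directly. The function `f` below is a hypothetical stand-in, not the function from the figure:

```python
def decode(bits):
    # Interpret the first half of the bit-string as x and the second half
    # as y, reading each half as a base-2 (binary) integer.
    half = len(bits) // 2
    x = int("".join(map(str, bits[:half])), 2)
    y = int("".join(map(str, bits[half:])), 2)
    return x, y

def evaluate(bits, f):
    # The fitness of an individual is f applied to its decoded (x, y).
    x, y = decode(bits)
    return f(x, y)

# Hypothetical fitness function, purely for illustration.
f = lambda x, y: x + y
print(decode([1, 0, 1, 0, 1, 1]))  # first half 101 -> 5, second half 011 -> 3
```

Running `evaluate` over every individual in the population yields the fitness values needed for selection.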