Today, we're going to talk about what these ideas about evolution look like as computations, and how we can use computation to understand and leverage evolutionary ideas. Let's first review the three basic elements of Darwinian evolution, which you've already learned about: random variation among individuals, selection of some of those individuals based on differential fitness, and the inheritance of those variations by individuals of the next generation. We also need to think a little about how those variations are represented, and that really gets us into the field of genetics, though we don't need to know very much genetics here. We just need to know that this information, these variations, is represented in discrete units, which today are called genes, and that the information in the genes is separate from how it's expressed in the organism; we refer to this as the genotype-versus-phenotype mapping. The final thing we need to know is that these genes are organized in a linear array, which today we call a chromosome. This understanding really started with Mendel in 1865, and its culmination was Crick and Watson's discovery of the structure of DNA. So we're going to take those very simple elements and understandings and translate them into a computer algorithm. The way we're going to do that is, instead of having chromosomes with genes on them, we're going to have strings of bits. A bit is a number that is either 0 or 1; it can have only two values. Those strings of bits will be the individuals in our population. So, imagine that we start out (this is the leftmost panel of the figure) with a population of randomly generated individuals, or strings. Here only three are shown, each with only five bits, but in real genetic algorithms we would have more bits and larger populations.
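As a minimal sketch of that first step (the population size and string length here are arbitrary choices matching the figure, not part of any particular implementation), generating a random initial population might look like this:

```python
import random

def random_individual(n_bits):
    # An individual is a list of n_bits random bits (each 0 or 1).
    return [random.randint(0, 1) for _ in range(n_bits)]

def initial_population(pop_size, n_bits):
    # Generate pop_size randomly initialized bit-strings.
    return [random_individual(n_bits) for _ in range(pop_size)]

# Three individuals of five bits each, as in the leftmost panel.
population = initial_population(3, 5)
```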
We also need a way to evaluate fitness, and we do that using a fitness function. In our example fitness function, we'll assume the values range from zero to one, with one, the higher value, being better. So, the first step is to generate the initial population. Then we evaluate each individual in the population using our fitness function, and use those fitness values to select the next generation. You see that in the center panel, where we have two copies of the highest-fitness individual, no copies of the lowest-fitness individual, and a single copy of the average individual. That's not very interesting, because those individuals look exactly like their parents. So we take advantage of what are known as genetic operators to introduce new variations. One of these is mutation, shown in the top individual, where the first bit, a one, is flipped to become a zero. We don't always do this in the first position; we do it randomly at different places in the string, and randomly throughout the strings of the population. We also use a process called crossover, or recombination, where we take two individuals and exchange pieces of their DNA, or here, pieces of their bit-strings; you see that in the second two individuals. So now, in the third panel, we have the new generation, generation T plus one, and we then repeat the cycle: evaluate those individuals using the fitness function, do new selection, and introduce new mutations and crossovers. That is how the evolutionary process runs. This basic strategy of using bit-strings to represent chromosomes was introduced by John Holland in the early 1960s. And John was very interested in the population-level effects, and in the impact of crossover. OK, but now, coming back to the algorithm: what does it look like when we actually run this algorithm for multiple generations?
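The cycle just described (evaluate, select, mutate, cross over) can be sketched as one generation step. This is only an illustrative skeleton under common default choices: fitness-proportionate selection, per-bit mutation, and single-point crossover; the rates and selection scheme are assumptions, not the only options.

```python
import random

def select(population, fitnesses):
    # Fitness-proportionate (roulette-wheel) selection: higher-fitness
    # individuals are more likely to be copied into the next generation.
    return random.choices(population, weights=fitnesses, k=len(population))

def mutate(individual, rate=0.1):
    # Flip each bit independently with probability `rate`.
    return [bit ^ 1 if random.random() < rate else bit for bit in individual]

def crossover(parent_a, parent_b):
    # Single-point crossover: exchange the tails of the two bit-strings.
    point = random.randrange(1, len(parent_a))
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

def next_generation(population, fitness_fn, mutation_rate=0.1):
    fitnesses = [fitness_fn(ind) for ind in population]
    selected = select(population, fitnesses)
    # Pair up the selected individuals, cross them over, then mutate.
    children = []
    for i in range(0, len(selected) - 1, 2):
        a, b = crossover(selected[i], selected[i + 1])
        children += [mutate(a, mutation_rate), mutate(b, mutation_rate)]
    if len(selected) % 2:  # odd population size: carry the last one through
        children.append(mutate(selected[-1], mutation_rate))
    return children
```

Repeatedly calling `next_generation` on its own output runs the evolutionary loop from the figure.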
So, I just showed you one generation in the previous slide, but we typically iterate these for hundreds, or sometimes even thousands, of generations. The way we look at it is by plotting time, in terms of number of generations, on the x-axis, and fitness, typically the average fitness of the population or the best fitness in the population (those are the two values shown here), on the y-axis. This is a very typical kind of performance curve for genetic algorithms: the fitness of the population starts out very low, then very quickly climbs and improves as the really lousy individuals get deleted and the somewhat better individuals get copied, so the whole average fitness moves up. Then there's a little bit of searching around, and we get another innovation, but eventually the population gets stuck on what's known as a plateau. This pattern of long plateaus separated by rapid jumps in fitness is known as punctuated equilibrium. When that happens, the algorithm effectively has to explore the space more widely to find a high-fitness innovation, and that can take a varying amount of time. We actually see two plateaus in the figure; on the first, a new innovation is eventually found and the population jumps up in fitness. This is very typical of these genetic algorithms. Let's now talk about some applications, how they're used. Genetic algorithms have been used widely in engineering applications, and they've also been used for scientific modeling. First, we'll talk about the engineering applications. Of these, by far the most common is what's known as multi-parameter function optimization. That's shown in the figure for a two-dimensional function, that is, a function of two variables. If the function is complex, a lot of times we don't have analytical methods to find mathematically what its maximum value is.
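The performance curve just described is produced by recording, at each generation, the two values plotted on the y-axis. A small self-contained sketch of that bookkeeping (the ones-fraction fitness function here is a hypothetical stand-in for illustration):

```python
def fitness_stats(population, fitness_fn):
    # One point on the performance curve: (best fitness, average fitness).
    fitnesses = [fitness_fn(ind) for ind in population]
    return max(fitnesses), sum(fitnesses) / len(fitnesses)

# Hypothetical fitness: fraction of 1-bits, so values range from 0 to 1.
ones_fraction = lambda ind: sum(ind) / len(ind)

best, avg = fitness_stats([[1, 0, 1, 0, 1],
                           [0, 0, 0, 0, 0],
                           [1, 1, 1, 1, 1]], ones_fraction)
```

Calling `fitness_stats` once per generation and plotting the collected pairs against the generation number gives exactly the curve on the slide.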
And when that happens, we have to resort to sampling, and we can think of the genetic algorithm as a kind of biased sampling algorithm. The goal, of course, would be to find points on this multi-peak surface that are on the highest peak, up at the top. Let's go into a little bit more detail about how that works. So, here is an example of such a function, and it's not trivial to analyze it mathematically. But suppose we want to find the x and y values that produce the maximum F(x,y). To do that, we take our bit-strings and conceptually think of the first half of the string as a representation of the value of x, and the second half as a representation of the value of y. To evaluate fitness, we then take these ones and zeros and interpret them as base-two numbers, translating them into their decimal equivalents, as shown in the figure, and then plug those decimal values into our fitness function. In this case, we get the value 4 out. We would have to do that for every individual in the population. I just want to say a word about where these algorithms came from. I've mentioned John Holland, but there were several other groups interested in similar kinds of algorithms, and these main groups, the first three listed here, were all more or less independent discoveries of similar and overlapping ideas. Then, in the early 1990s, John Koza came along and really blew the field open by showing how we could use genetic algorithms to evolve computer programs. These separate streams of invention are now lumped together, and the field is called evolutionary computation. And, needless to say, there's been a lot of recombination between these different origins.
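The decoding step just described, splitting the bit-string in half and reading each half as a base-two number, can be sketched directly. The function `f` below is a hypothetical stand-in, not the function from the figure:

```python
def decode(bits):
    # Interpret the first half of the bit-string as x and the second half
    # as y, reading each half as a base-2 (binary) integer.
    half = len(bits) // 2
    x = int("".join(map(str, bits[:half])), 2)
    y = int("".join(map(str, bits[half:])), 2)
    return x, y

def evaluate(bits, f):
    # The fitness of an individual is f applied to its decoded (x, y).
    x, y = decode(bits)
    return f(x, y)

# Hypothetical fitness function, purely for illustration.
f = lambda x, y: x + y
print(decode([1, 0, 1, 0, 1, 1]))  # first half 101 -> 5, second half 011 -> 3
```

Running `evaluate` over every individual in the population yields the fitness values needed for selection.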