In this subunit, I'm going to go through a detailed example of a simple genetic algorithm that evolves control programs for virtual robots. Well, our virtual robot is called "Robby". He's a very simple robot who lives in a simulated world. I described him first in my book, "Complexity: A Guided Tour", and his job is to go around, collecting empty soda cans. Robbie was inspired by a real robot, called "Herbert", which operated in the 1980's, in the MIT Artificial Intelligence Lab. It went around the lab, on its wheels, going into the different offices, and collecting empty soda cans, and taking them to the recycle bin. Herbert was the work of Jonathan Connell, a graduate student at MIT working with Rod Brooks and Peter Ning, and was a very impressive robot that used only very simple algorithms to control its behaviour. Our version of "Herbert" is called "Robby", and he's a virtual robot who lives this simulated world consists of a 10 by 10 grid of squares You can think of the squares as sort of "virtual offices", and each office is either empty or it contains and empty soda can. So here's an example of a soda can. And Robby's job is to move around this world, collecting the empty soda cans. So we have to give Robby a program that tells him what to do at each step. I could program that myself, but instead I'm going to use a genetic algorithm to evolve programs to do this task. So first I need to tell you what Robby can see and what he can do. He's a very simple robot, and he has very bad vision, so the only thing he can see are the contents of his own square and the squares to the north, south east and west. So his current site, north south, east and west. In his current state, his current office or room is empty. To the north there is a wall, to the west there is a wall, to the south there is an empty square and to the east there is a can. So that's all he can see. He uses what he can see to decide which action to take at each time step. And he has seven possible actions: he can move to the north, move to the south, move to the east, move to the west, that's all moving one square. He can make a random move: choose a random direction to move in. He can stay where he is and do nothing, or he can try to pick up a can. And if he tries to pick up a can and there is a can there, he succeeds. Or he can try to pick up a can mistakenly in an empty square. And he has certain rewards and penalties that he gets for doing these things. If he picks up a can, he gets 10 points, that's a good thing to do. If he tries to pick up a can on an empty site, he gets minus one point, and if he crashes into a wall, he gets minus five points. And his overall score as he moves around is the sum of the rewards and penalties that he gets. The goal is to use a genetic algorithm to evolve a control program that is a strategy for Robby. So let me talk about what a "strategy" is. The definition that I'm going to use is a set of rules that specifies an action for every possible situation. Where the possible situations are the inputs to Robby, the possible inputs that he can see. So let's make a list of possible situations that Robby can find himself in. The possible situations would be what he sees to the north, south, east, west, his current site. And a strategy would be for each possible situation an action he would take. So here's a possible situation-- maybe the simplest one-- everything is empty. He's in an empty cell and cells in all directions are empty. That's one possible situation. Here's another possible situation: everything is empty except in his current site he has a can. And we could go on and on listing possible situations. And in a second, we'll figure out how many possible situations there are, but just to make sure that you are getting all of this, let's stop for a quick quiz. The quiz has two questions. The first says look at this situation, and suppose that Robby has a score of zero so far, and he takes some actions: move east, move east, pick up can, move east, pick up can, move south. So he takes those six actions, and you have to answer what his situation is after those six actions, and what his score is. And here's to remind you what the scoring process is.