So there are some subtleties, or tricky points, that we're going to have to think through regarding this idea of randomness as incompressibility. Here is the first one, which isn't too serious an issue: we need to choose some standard way of representing these compressions. Let's say the task we give ourselves is to compress a certain string, a sequence of symbols. We're going to write a computer program to do that, and we want to write the shortest computer program possible. I like programming in Python, so I might try to write a Python program; somebody else might write a program in C, somebody else might use C++, somebody might use Lisp, and who knows what else. We would all be working in different languages, and the algorithm would have different representations in these different languages. So we need to agree on some standard or reference device or language to use when we compare these algorithms. The standard thing that's done in the formal development of these ideas is to use something called a universal Turing machine. A universal Turing machine is a device that can perform any computation, any discrete-time, discrete-state computation, and is capable of emulating any other computer or device that can do such a computation. It's an all-encompassing computing device, a standard reference model of computation that's as powerful as anything else. In the formal development of these ideas, which is not part of this course, one works in a framework where compressions are compared in terms of a universal Turing machine. So that's one subtlety, and we've addressed it: we'll use universal Turing machines. The second subtlety is more subtle, and more serious.
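To make "compression as a short program" concrete, here's a minimal Python sketch of my own (an illustration, not part of the formal Turing-machine framework): a highly patterned string can be reproduced by a description far shorter than the string itself.

```python
# A 10,000-symbol string with an obvious pattern...
patterned = "01" * 5000

# ...can be reproduced by a description far shorter than the string:
description = '"01" * 5000'          # 11 characters
assert eval(description) == patterned

print("string length:", len(patterned))       # 10000
print("description length:", len(description))  # 11
```

A string with no pattern offers no such shortcut: the shortest program that prints it is essentially a print statement containing the string itself, roughly as long as the string. That is the sense in which "random" will mean "incompressible."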
And that is: how do we know if we've found the shortest algorithm that can reproduce a sequence? Maybe a sequence really is compressible, but I just haven't been able to figure out how. Maybe I'm not smart enough, or I haven't given myself enough time. That could have been the case for pi, which certainly looks random. And by the way, in some ways of viewing it, pi really is random: in its binary expansion the digits look the same as a sequence of fair coin flips, and all possible blocks of zeros and ones appear to occur equally often. (Strictly speaking, this property, called normality, is conjectured for pi and strongly supported numerically, though not proven.) So that's weird; in a certain sense, pi is random. But in the algorithmic sense, where we compare programs on a universal Turing machine, pi is not random, because there's a short algorithm that produces it. OK, so maybe every possible sequence actually has some short, almost-impossible-to-find algorithm that reproduces it, and we just haven't been smart enough to find it yet. It turns out we can eliminate that possibility. Let me present the argument as follows. We're wondering whether we can figure out ways to compress sequences that don't look compressible. One idea is: maybe we could come up with an algorithm that determines the shortest algorithm, a compression procedure that's optimal all the time. This would be an algorithm that writes an algorithm, a program that writes a program. Moreover, it would automatically write the best program, in the sense of the shortest one, the one that compresses our symbol sequence as much as possible. This would be an amazing thing, but one can show that such a program doesn't exist. You can't write a program that automatically writes the shortest program. So we can't automatically determine the shortest algorithm for a given sequence.
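To see that pi really is compressible in this algorithmic sense, here's a short Python program using Gibbons' well-known "unbounded spigot" algorithm (any short pi-digit generator would make the same point) that produces as many decimal digits of pi as you like:

```python
def pi_digits(count):
    """Yield the first `count` decimal digits of pi (3, 1, 4, 1, 5, ...)
    using Gibbons' unbounded spigot algorithm."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    produced = 0
    while produced < count:
        if 4 * q + r - t < n * t:
            yield n                  # next digit is confirmed; emit it
            produced += 1
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:                        # not enough information yet; refine state
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

print(list(pi_digits(15)))  # [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]
```

The program is a fixed, finite size, yet it generates unboundedly many digits. So the digit sequence of pi is compressible, and hence not algorithmically random.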
But maybe there is a short algorithm out there, even if we can't find it. It turns out we can eliminate that possibility as well, and here is how one would argue it. First, think about the set, or collection, of all possible algorithms: all possible schemes for compressing a sequence of zeros and ones. That's an infinite set: each program is finite, but programs can get arbitrarily long, so there are infinitely many of them. We might also think about the collection of all possible sequences we want to compress. This is an infinite set too: we're thinking about all possible infinite sequences of zeros and ones, and that's clearly infinite as well. So we've got two infinite sets: all the algorithms, and all the sequences. If these two sets were the same size, then it might be that every sequence has an algorithm that goes with it. Maybe we couldn't find which algorithm goes with which sequence, but we'd know the algorithm might be out there. However, it's not the case that there are as many algorithms as sequences. In fact, there are vastly more sequences than algorithms. Thus, there are certainly some sequences for which no algorithm can compress them, and we'll call those sequences random. So randomness exists, by this argument. Furthermore, there are so many more sequences than algorithms that if I choose an arbitrary sequence, if I just dip my hand into this infinite bag of sequences, I can be almost certain, in the sense of probability one, that the sequence will be random. It will be incompressible.
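A finite version of this counting argument can be sketched in Python. Treating "programs" as binary strings (a simplifying assumption), there are only 2^n - 1 programs strictly shorter than n bits, but 2^n strings of length n to compress:

```python
def programs_shorter_than(n):
    # Binary strings of length 0, 1, ..., n-1: 2^0 + 2^1 + ... + 2^(n-1)
    return sum(2 ** k for k in range(n))   # = 2^n - 1

for n in (8, 16, 32):
    print(f"{n:2d} bits: {programs_shorter_than(n)} short programs "
          f"vs {2 ** n} strings")
```

Even if every short program compressed a different string, at least one n-bit string is left over with no description shorter than itself, i.e., it is incompressible. In the infinite-sequence case the gap widens from "one short" to countable versus uncountable.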
The technical idea, which some of you may be familiar with, is that the set of all algorithms is a countable infinity, while the set of all infinite sequences of zeros and ones is uncountable. So they are very different infinite sets. Again, if I choose an arbitrary symbol sequence, I can be certain, with probability one, that the sequence is random. OK, now let's finally go back to the logistic equation, back to dynamical systems. If I choose an arbitrary initial condition for the logistic equation, it will produce a sequence of zeros and ones. And we saw earlier, or I argued earlier, that the logistic equation can produce every possible sequence of zeros and ones, and all subsequences occur with equal probability. So if I choose an arbitrary initial condition for the logistic equation with R = 4, I can be almost certain that the outcome will be random. In this way of thinking about randomness, the process producing the symbols is definitely deterministic, an unchanging deterministic rule, but the outcome is random: with probability one, there is no algorithm that can compress the sequence the logistic equation produced. So the logistic equation is producing randomness. Or is it? Wait a minute. The logistic equation is itself an algorithm that's producing this sequence. So how can I say the logistic equation produces a random outcome, meaning there's no algorithm that can compress the sequence, when iterating the logistic equation is itself just such an algorithm? This is a good objection, but we have to think carefully about the logistic equation and chaos, and in particular about sensitive dependence on initial conditions.
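The symbol sequence in question can be generated like this: iterate the logistic map with R = 4 and write down 0 when the state is below 1/2 and 1 otherwise. A minimal sketch (keeping in mind that floating-point iteration only approximates a true orbit):

```python
def logistic_symbols(x0, n):
    """Symbolic dynamics of the logistic map x -> 4 x (1 - x):
    emit 0 if x < 1/2, else 1, at each step."""
    x, symbols = x0, []
    for _ in range(n):
        symbols.append(0 if x < 0.5 else 1)
        x = 4.0 * x * (1.0 - x)      # one iteration of the map, R = 4
    return symbols

print(logistic_symbols(0.3, 20))
```

For a typical initial condition the printed zeros and ones look just like fair coin flips, which is exactly the point of the argument above.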
So suppose you hand me a random sequence with no obvious pattern to it, and my task is to figure out whether there is a way to use the logistic equation to generate that sequence of zeros and ones. At first I would say: sure, of course there has to be. The logistic equation produces all possible sequences of zeros and ones, so for this sequence you just gave me, no problem, I can figure it out. Or even if I can't figure it out, even if there are too many initial conditions to try, I know an answer is out there, and then there would be a short algorithm. So I would have compressed the sequence you thought was incompressible. But wait a minute. The key is that in order to produce that particular symbol sequence, I need to specify the initial condition exactly, with full precision. The mechanics of my algorithm are very simple: iterate the logistic equation with R = 4.0. That's a very simple program; many of you have written your own versions of it, and spreadsheets can do it too. But in order to produce that particular outcome, the one you gave me, so that I can say I've compressed it, I also need the initial condition. And almost all initial conditions are irrational numbers, numbers whose digits continue on forever. So for this algorithm to reproduce the symbol sequence, I need to specify the initial condition exactly. It turns out, by a similar argument, that almost all numbers between zero and one are themselves random: a random sequence of decimal digits following the decimal point. So I need to know that initial condition exactly, to an infinite number of decimal places, and the vast majority of initial conditions, all but a set of measure zero, that is, with probability one, are incompressible.
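Sensitive dependence is easy to see numerically. In this sketch, two initial conditions that agree to one part in 10^12 soon produce different symbol sequences, which is why any finite-precision specification of the initial condition eventually fails to reproduce the target sequence:

```python
def symbols(x0, n):
    # Same symbolic dynamics as before: 0 if x < 1/2, else 1.
    x, out = x0, []
    for _ in range(n):
        out.append(0 if x < 0.5 else 1)
        x = 4.0 * x * (1.0 - x)
    return out

a = symbols(0.3, 100)
b = symbols(0.3 + 1e-12, 100)        # perturbed in the 12th decimal place
first_diff = next(i for i in range(100) if a[i] != b[i])
print("sequences first disagree at step", first_diff)
```

The perturbation roughly doubles each iteration (the map's Lyapunov exponent at R = 4 is ln 2), so a disagreement shows up after a few dozen steps. Matching more symbols requires specifying more digits of the initial condition, and matching all of them requires all infinitely many digits.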
So, aha, I've found a short mechanism for generating the sequence, but I need to remember an incompressible infinite sequence of decimal digits in order to run that algorithm and reproduce exactly the binary sequence that's desired. So yes, the logistic equation really is producing a random output. To summarize: the logistic equation, a deterministic dynamical system, produces random output. The sequence of zeros and ones, the symbols from the symbolic dynamics of the logistic equation, is incompressible, and that's what it means to be random. You might think that the logistic equation itself is a compressed description of the sequence it just generated. However, in order to reproduce that sequence, we need to specify the initial condition to infinite precision, and that requires remembering an infinitely long number. It's a short program, but you need to give it an infinitely long, infinitely precise description of the initial condition. So the logistic equation actually is not a compressed version of the sequence it produces. I hope this all hasn't been too confusing. To be honest, I hope it's at least a little bit confusing, but in a good way, in that I think what these examples do is enrich and complexify our notion of randomness. A chaotic dynamical system, like the Lorenz equations or the logistic equation with R = 4, is deterministic, but it's unpredictable because of the butterfly effect. And it's deterministic, but it actually produces randomness in this algorithmic sense of randomness. So randomness doesn't have to be a product of chance; randomness can be a product of determinism. And that, I think, is one of the big lessons of the study of chaos and dynamical systems. I'll talk a little bit more about that in the summary for this unit, and these are themes we'll continue to revisit throughout the course.