[Elizabeth Bradley:] Okay... So, I would like to introduce Prof. Dr. Holger Kantz, who will be our guide on this particular field trip. My colleague Holger works at the Max-Planck-Institut für Physik komplexer Systeme in Dresden--

[Holger Kantz:] That's right.

[Bradley:] And he is going to be telling us a little bit about a time series analysis application for the prediction of extreme events. So, thank you for joining us, Holger.

[Kantz:] It's my pleasure. Yeah, extreme events are an omnipresent phenomenon in our world, and we are speaking here about extreme deviations from normal behavior in systems which exhibit natural fluctuations - as we see in many atmospheric phenomena, in the stock market, and in other man-made systems. And, of course, dynamical systems - the object of this class - are a source of fluctuations, of endogenous fluctuations, and from time to time there might be some extraordinarily large fluctuation, deviating from, say, the mean value, which we can observe when we study the system for a long time. There are many issues about extreme events. For instance, there are magnitude-distribution issues: how often do we encounter, say, an earthquake of magnitude bigger than eight, or a strong storm with high wind speeds? And there are also the prediction issues - like, can we predict the occurrence of the next extreme in a particular setting? This is where one can try to use methods from time series analysis. The main problem with predicting extremes in all of these systems is that the systems are way too complex to show deterministic behavior in an embedding space. For instance, we are used to modeling the atmosphere by deterministic computer programs - so-called "general circulation models" - which typically are deterministic and exhibit all the phenomena which you know from the real atmosphere. But, of course, when we record a time series from such a system, or even from the model system, and we attempt a time-delay embedding, we would never see the deterministic properties.

[Bradley:] What would you see instead? You'd just see a big ball of hair?

[Kantz:] Yeah, when you just plot the trajectory, say, in a two-dimensional representation, you would see a blob of curves which intersect each other in a very complicated way. You would never see a nice attractor. So, is there any hope, nonetheless, to make use of these methods? There is something which rescues us, and that's the fact that most of these systems have only short-range interactions. If I want to understand the dynamics of a little air packet, then what is in the first place responsible for its changes is this little air packet itself, and then the neighbors immediately around it - though in a less intense way. It depends much less on faraway quantities, like the air pressure, say, in China - to come to this standard example of the butterfly effect, where people say that some little event in China might cause a thunderstorm, or a tornado, or whatever, in another part of the world. Of course, this is a nice example, but it's not perfectly true - because the interactions, first of all, are short-range, and only after a long while will such a little perturbation at some other place in the world propagate to the place I am looking at.
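To make the object under discussion concrete, here is a minimal time-delay-embedding sketch in Python; the embedding dimension and delay below are illustrative choices, not values from the conversation:

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Map a scalar series x into delay vectors
    (x[t], x[t + tau], ..., x[t + (dim - 1) * tau]), one per row."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[j * tau : j * tau + n] for j in range(dim)])

# For a low-dimensional deterministic system, plotting the rows of a
# two-dimensional embedding traces out a clean attractor; for atmospheric
# wind data, the same plot gives only the tangled "ball of hair"
# described above.
# embedded = delay_embed(wind_speed, dim=2, tau=1)  # wind_speed: 1-D array
```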
[Kantz:] And so, if we now understand the surrounding degrees of freedom as those which are essentially driving the degree of freedom under investigation - and we assume that the influence of degrees of freedom farther away is less and less relevant the farther away they are - then we can imagine gaining a kind of approximate embedding, so that we can assume there is a mixture of determinism plus some stochastic noise, which represents all the degrees of freedom we are not able to resolve in our local embedding. And this is the idea which we applied to the prediction of wind gusts; it is illustrated in the first figure. This is wind speed data, measured with a little anemometer some 20 to 30 meters above ground, with a sampling rate of 8 Hz - so eight data points cover one second. It's a pretty high time resolution, and these are just atmospheric wind speeds. I have to explain, first of all, the three axes. The one which points into the background is the time axis, labeled "n": 0 means the present, the negative times are the past, and the positive times are the future. The blue fence-like structure is the time series, which, from negative times up to 0, represents our present signal - and its future is what we want to predict. The x-axis is the wind speed. In the bottom plane is a bunch of other time series - the red ones - which are close to the blue one in this time window. And this is exactly what one would pick out as a neighborhood in a time-delay embedding space covering these ten delay values, because each of the red curves' individual coordinates is close to the corresponding blue coordinate at the corresponding time index. It is like a tube around the blue curve, and every red curve which is shown stays within this tube over the whole time interval from -10 to 0. So, these are the neighbors in the time-delay embedding space, and if these were trajectories of a low-dimensional deterministic system, then, of course, this red bunch of trajectories should follow the blue line closely - apart from the fact that they are allowed to diverge from it exponentially fast, due to the positive Lyapunov exponents which are assumed to be present in a chaotic system. But this divergence would be quite smooth and regular - the trajectories would not intersect each other as much as is shown here in this red bunch. Indeed, the best model for trajectories behaving like these would be a stochastic model.

Now, what is the green histogram in the background? We can count how many of the red trajectories reach which maximum value in the future window, bin those maxima, and the histogram gives us the probability distribution for trajectories which have essentially the same profile as the blue one in the past time window to exhibit a wind gust in the future. So, if we put a threshold at, say, plus 10 m/s, it identifies the very right part of this green histogram - and that is maybe something like 10 out of 50. So, the probability for this blue curve to exhibit a wind gust would be 10 over 50 - that is, 20%. Of course, because it's a probabilistic approach, the blue trajectory is allowed to do something completely different. Indeed, in this example, the blue trajectory goes to lower wind speeds - so predicting that there would be a gust would be a wrong prediction in this particular case.
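A minimal sketch of this tube-neighborhood-and-histogram procedure, under illustrative assumptions (a window of m = 10 samples, a max-norm tube radius eps, a horizon of 16 samples = 2 seconds at 8 Hz, and a gust threshold; none of these parameter values are fixed by the transcript):

```python
import numpy as np

def gust_probability(series, m=10, eps=0.5, horizon=16, threshold=10.0):
    """Estimate the probability of a gust within `horizon` samples,
    given the last `m` samples of `series` (the "blue profile").

    A historical segment counts as a neighbor if it stays within `eps`
    of the present profile at every one of the m delay coordinates --
    i.e., inside the tube around the blue curve.
    """
    series = np.asarray(series, dtype=float)
    present = series[-m:]          # the blue profile, time indices -m+1 .. 0
    history = series[:-m]          # the historical record to search

    future_maxima = []
    for t in range(m, len(history) - horizon):
        candidate = history[t - m:t]
        if np.max(np.abs(candidate - present)) < eps:    # inside the tube?
            future_maxima.append(history[t:t + horizon].max())

    if not future_maxima:
        return None                # no neighbors found: no prediction possible
    # Fraction of neighbors whose future maximum exceeds the threshold --
    # the area of the green histogram to the right of the threshold.
    return np.mean(np.asarray(future_maxima) > threshold)
```

With 50 neighbors, 10 of which exceed the threshold, this returns 0.2 - the 20% of the example above.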
[Kantz:] So, the really interesting question now is this: if the blue trajectory had a different profile in this time window from -10 to 0, would the bunch of neighboring trajectories then generate a grossly different histogram for the future? Or will the histogram for the future always be the same, so that there is always this 20% chance of a wind gust within the next two seconds? This is shown in Figure 2, where we have four histograms in a single figure - two red ones and two blue ones. There are two little arrows underneath - a red one and a blue one - and these two arrows indicate the wind speed at time 0, the current wind speed, so to speak - and it's almost the same for both. But the profile over the last ten sampling intervals is completely different for the red arrow and the blue arrow. That means the two delay vectors in embedding space have almost the same value in their most recent component, but quite different values for the components in the past, so that the neighborhoods of these two delay vectors sample quite different regions of phase space. And the future values - the future wind speeds of the neighbors - are represented in these histograms one and two seconds ahead: the lower histograms are one second ahead, and the upper two, shifted for visibility, are two seconds ahead. We see that, for the red arrow, the histograms are distributed around a mean value close to the red arrow itself. So, in that case, the wind speed has the same chance to grow and to decrease in the near future, whereas, for the blue arrow, the histograms are shifted to the right - that means to higher wind speeds. So we have identified that, regardless of the current value of the wind speed, the past profile - taken into account through the embedding - determines the probability of a coming wind gust. One can then make use of these probabilities of having a wind gust in the next one to two seconds in order to design a prediction scheme.

I should add that the bigger the event's magnitude - in this case, the gust value - the fewer events we have in the data sets, because big events are rare, and really extreme events are very rare. So, even with a hit rate of 80%: if an event is so rare that it occurs only about twice a month - a really strong gust, for instance, which could damage a wind turbine - then we have the chance to have 80% of these two events per month correctly predicted. But it means that almost everything else in our recordings consists of non-events. And even a false alarm rate of less than one percent, applied to almost everything, still yields a much bigger number than these two hits. This shows that such a prediction scheme might sometimes be worthless, because we also have to consider the costs of the false alarms - and the costs of false alarms can be quite different. For instance, if we have a fire alarm here in the office building, and the fire bell rings too often, nobody takes it seriously anymore - so the fire alarm becomes worthless, because it has too many false alarms. Of course, if it really were burning every day, we would still obey it.
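The base-rate effect described here can be checked with a few lines of arithmetic; the numbers below (one month of 8 Hz data, two genuine events, an 80% hit rate, a 1% false alarm rate) are purely illustrative:

```python
# Base-rate arithmetic for rare-event prediction (illustrative numbers only).
samples_per_month = 8 * 3600 * 24 * 30   # one month of 8 Hz data
true_events = 2                          # genuinely damaging gusts per month
non_events = samples_per_month - true_events

hit_rate = 0.80                          # fraction of true events predicted
false_alarm_rate = 0.01                  # fraction of non-events flagged anyway

hits = hit_rate * true_events                   # ~1.6 correct warnings
false_alarms = false_alarm_rate * non_events    # ~200,000 spurious warnings

# Of all alarms raised, the fraction that corresponds to a real gust:
precision = hits / (hits + false_alarms)
print(f"hits ~ {hits:.1f}, false alarms ~ {false_alarms:.0f}, "
      f"precision ~ {precision:.2e}")
```

Even with an excellent hit rate, virtually every alarm is false - which is exactly why the scheme's value depends on the cost of a false alarm.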
[Kantz:] This is also a remark on how one can tune statistical results so that they look very favorable, although the result in reality is not as good as presented. So, this was a short outline of how to predict extreme events in a probabilistic way, making use of embedding techniques.

[Bradley:] Great. I really appreciate you telling us about this. And I wanted to remind the class that Professor Kantz is the author of one of the textbooks for this course - what I consider the Bible of nonlinear time series analysis. So, you all are already familiar with his words - and now you're familiar with his face. Thank you very much, Holger.

[Kantz:] Thank you.

[Bradley:] Okay...