[Elizabeth Bradley:] Okay...
So, I would like to introduce
Prof. Dr. Holger Kantz,
who will be our guide
on this particular field trip.
And... my colleague Holger works at
the Max-Planck-Institut
für Physik komplexer Systeme in Dresden--
[Holger Kantz:] That's right.
[Bradley:] And, he is going to be
telling us a little bit about
a time series analysis application
for the prediction of extreme events.
So, thank you for joining us, Holger.
[Kantz:] It's my pleasure.
Yeah, extreme events are an omnipresent
phenomenon in our world,
and we are speaking here about extreme
deviations from normal behavior
and systems which exhibit
natural fluctuations.
So - like we see
in many atmospheric phenomena,
as we see also in the stock market,
as we see in other man-made systems.
And, of course, dynamical systems -
which are the object of this class -
are a source of fluctuations,
for endogenous fluctuations -
and from time to time,
there might be some
extraordinarily large fluctuation,
deviating from - say - the mean value,
which we can observe
when we study the system for a long time.
And, there are many issues
about extreme events.
For instance, there are magnitude
distribution [issues].
How often do we encounter - say -
an earthquake of magnitude
bigger than eight,
or some strong storm
with high wind speeds?
And, there are also
the prediction issues - like -
can we predict the occurrence of
the next extreme in a particular setting?
And, this is where one can try to use
methods from time series analysis.
The main problem with predicting extremes
in all of these systems
is that the systems are way too complex
to show deterministic behavior
in any kind of embedding space.
So, for instance, in the atmosphere,
we are used to modeling the atmosphere
by deterministic computer programs -
with so-called
"general circulation models,"
which typically are deterministic
and exhibit all the phenomena
which you know from the real atmosphere.
But, of course, when we record
the time series from such systems,
or even from the model system,
and we try an embedding -
in the sense of a time-delay embedding -
we would never see
the deterministic properties.
[Bradley:] What would you see instead?
You'd just see a big ball of hair?
[Kantz:] Yeah, when you just plot
the trajectory - say -
in a two-dimensional representation,
you would see a tangle of curves,
which intersect each other
in a very complicated way.
You would never see a nice attractor.
And so, is there any hope - nonetheless -
to make use of these methods?
And, there is something
which somehow rescues us,
and that's the fact
that most of these systems
have only short-range interactions.
So, if I want to understand
the dynamics
of the change of the behavior
of a little air packet,
then what is, in the first place,
responsible for the changes
is this little air packet itself,
and then the nearest neighbors,
which are somehow around it -
though in a less intense way.
It depends much less on global quantities,
like the air pressure in -
say - China or something like that...
to come to this standard example
of the butterfly effect,
where people say -
if there is some little event in China,
it might cause a thunderstorm,
or a tornado or whatever,
in another part of the world.
Of course, this is a nice example,
but it's not perfectly true -
because the interactions, first of all,
are short-range,
and only after a long while,
such a little perturbation
at some other place in the world
will propagate to the place
I am looking at.
And so, if we now understand
the surrounding degrees of freedom
as those which are essentially driving
our degree of freedom
under investigation -
and we can safely assume that the
influence of farther-away degrees of freedom
is less and less relevant
the farther away they are -
then we can imagine that we can gain
a kind of approximate embedding
so that we can assume that there is
a mixture of a kind of determinism
plus some stochastic noises,
which represent all the degrees of freedom
which we are not able to resolve
in our local embedding.
And, this is the idea which we applied to
the prediction of wind gusts,
and it is illustrated in the first figure.
This is wind speed data,
measured with some little anemometer
at some 20 to 30 meters above ground,
with a sampling rate of 8 Hz -
so eight data points cover one second.
It's pretty high time resolution,
and these are just
atmospheric wind speeds.
I have to explain, first of all,
the three axes.
So, the one which points
into the background
is a time axis, labeled by "n."
And, "0" means the present -
and the negative times are the past,
and the positive times are the future.
And, the blue fence-like structure
is the time series,
which until 0, from negative to 0,
is representing our present signal -
and its future is what we want to predict.
The x-axis is the wind speed.
What is in the bottom plane
is a bunch of other time series -
the red ones, which are close
to the blue one in this time window.
And, this is exactly what one would
pick out as a neighborhood
in a time-delay embedding space
covering these ten delay values,
because each of the red curves'
individual coordinates
is close to the corresponding
blue coordinate
at the corresponding time index.
It is like a tube around the blue curve,
and every red curve which is shown
is within this tube
in the whole time interval from -10 to 0.
So, these are the neighbors
in the time-delay embedding space,
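[Editor's note: the neighbor search described here - picking out all past delay vectors that stay within a tube around the current profile - can be sketched as follows. This is a minimal illustration, assuming a univariate series and a max-norm tube; the names `m` and `eps` are hypothetical parameters, not values from the lecture.]

```python
import numpy as np

def delay_vectors(x, m):
    # Row i is the delay vector (x[i], x[i+1], ..., x[i+m-1]).
    return np.lib.stride_tricks.sliding_window_view(x, m)

def find_neighbors(x, m, eps):
    # Indices of past m-sample windows that stay within an eps-tube
    # (max-norm) around the most recent profile -- the 'red curves'
    # around the 'blue' one in Figure 1.
    vecs = delay_vectors(x, m)
    query = vecs[-1]                          # profile ending at the present
    dist = np.abs(vecs[:-1] - query).max(axis=1)
    return np.flatnonzero(dist < eps)
```

[Each returned index marks the start of a historical window whose entire profile lies inside the tube, so its continuation can serve as one sample of the possible future.]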
and if these were trajectories
representing a low-dimensional
deterministic system,
then, of course,
this red bunch of trajectories
should follow closely the blue line,
apart from the fact that it is allowed
to diverge from it exponentially fast,
due to the positive Lyapunov exponents
which are assumed to be present
in a chaotic system.
But, this divergence
would be quite smooth,
would be regular - the trajectories
would not intersect each other
as much as is shown here
in this red bunch.
Indeed, given the way
these trajectories behave,
the best model for them
would be a stochastic model.
Now, what is the green histogram
in the background?
We can count how many
of the red trajectories
reach which maximum value,
in some binning, and from the histogram
we get the probability distribution
for those trajectories,
which have exactly the same profile
as the blue one in the past time window,
to exhibit the wind gust in the future.
So, if we put a threshold at -
say - plus 10 m/s,
it identifies the rightmost part
of this green histogram,
and that is maybe something
like 10 out of 50 trajectories.
So, the probability for this blue curve
to exhibit the wind gusts
would be 10 over 50 -
so would be 20%.
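[Editor's note: counting which fraction of the neighbors' futures exceeds the threshold - the "10 out of 50" just described - gives the gust probability. The following is a hedged sketch; all parameter names are illustrative, and the actual scheme may differ in details such as the norm or the precise definition of a gust.]

```python
import numpy as np

def gust_probability(x, m, horizon, eps, threshold):
    # Estimate P(max wind speed in the next `horizon` samples exceeds
    # `threshold`), given the last m samples, by counting over the
    # futures of all close historical profiles.
    vecs = np.lib.stride_tricks.sliding_window_view(x, m)
    query = vecs[-1]                          # current m-sample profile
    # keep only windows with a full `horizon` of future data
    past = vecs[: len(vecs) - horizon - 1]
    dist = np.abs(past - query).max(axis=1)
    idx = np.flatnonzero(dist < eps)          # neighbor window starts
    if idx.size == 0:
        return None                           # no neighbors, no estimate
    future_max = np.array([x[i + m : i + m + horizon].max() for i in idx])
    return float(np.mean(future_max > threshold))
```

[A return value of 0.2 would correspond to the 10-out-of-50 situation in Figure 1.]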
Of course, in this case,
because it's a probabilistic approach,
the blue trajectory is allowed to do
something completely different.
Indeed, in this example,
the blue trajectory
goes to lower wind speeds -
so predicting that there would be a gust
would be a wrong prediction
in this particular example.
So, the question now is -
the really interesting question is -
if the blue trajectory had a different
profile in this time window from -10 to 0,
would then the bunch of neighboring
trajectories generate
a grossly different histogram
for the future?
Or will the histogram for the future
always be the same -
so that there's always this 20% chance
to have a wind gust
within the next two seconds?
This is shown in Figure 2,
where we have four histograms
in this single figure -
two red ones and two blue ones.
And, there are two little arrows
underneath - a red one and a blue one -
and these two arrows indicate
the wind speed at time 0,
so to say, the current wind speed -
and it's almost the same.
But, in this case, the past -
the last ten sampling intervals -
is completely different
for the red arrow and the blue arrow.
So, that means the two delay vectors
in embedding space have almost
the same value for their
most recent component,
but quite different values
for the components in the past,
so that the neighborhood
of these two delay vectors
will sample a quite different region
of phase space.
And, the future values - the
future wind speeds of the neighbors -
are represented in these histograms
one and two seconds ahead.
So, the lower two histograms
are one second ahead,
and the upper two histograms, which
are shifted, are two seconds ahead.
We see that, for the red arrow,
the histograms are distributed
more or less symmetrically,
with a mean value close to the red arrow.
So, in this case,
the wind speed has the same chance to grow
and to decrease in the near future,
whereas, in the case of the blue arrow,
the histograms are shifted to the right -
that means to higher wind speeds.
So, it means we have identified that,
in this case, indeed -
not just the current value
of the wind speed, but,
by taking into account an embedding,
the past profile -
determines the probability to expect
wind gusts or not to expect them.
So, one can make use
of these probabilities
for a wind gust
in the next one to two seconds
in order to design or to construct
a prediction scheme from this.
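[Editor's note: such a prediction scheme could simply raise an alarm whenever the estimated probability crosses a chosen decision threshold; hit rate and false alarm rate then follow from comparing alarms with what actually happened. The function below is an illustrative sketch, not the scheme from the lecture.]

```python
def evaluate_alarms(alarms, events):
    # alarms, events: equal-length sequences of booleans per time step.
    # Hit rate: fraction of actual events that were alarmed.
    # False alarm rate: fraction of nonevents that were alarmed.
    hits = sum(a and e for a, e in zip(alarms, events))
    false_alarms = sum(a and not e for a, e in zip(alarms, events))
    n_events = sum(events)
    n_nonevents = len(events) - n_events
    hit_rate = hits / n_events if n_events else 0.0
    fa_rate = false_alarms / n_nonevents if n_nonevents else 0.0
    return hit_rate, fa_rate
```

[Tuning the decision threshold trades one rate against the other, which is exactly the tension discussed next.]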
I said that the bigger
the event's magnitude -
in this case, the gust value -
the fewer events we have in the data sets,
because the big events are rare.
The really extreme events
are very rare.
So, even if we have a hit rate of 80%:
if we have an event which is so rare
that it occurs only once or twice a month -
which would be a really strong gust,
for instance, one
which could damage a wind farm
or wind turbines -
then we have the chance to have 80%
of these one or two events per month
correctly predicted.
But, it means that almost all the other
time points in our recordings
are nonevents.
And, even if we had a false alarm rate
of less than one percent,
one percent of almost everything
is still a much bigger number
than these one or two hits.
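[Editor's note: this base-rate problem can be made concrete with a back-of-the-envelope calculation. The numbers below - two gusts per month, an 80% hit rate, a 0.5% false alarm rate - are illustrative assumptions, not values from the lecture.]

```python
# Illustrative numbers, not measurements:
samples_per_month = 8 * 60 * 60 * 24 * 30  # 8 Hz for 30 days, ~20.7 million
true_events = 2                            # damaging gusts per month (assumed)
hit_rate = 0.80                            # fraction of true gusts predicted
false_alarm_rate = 0.005                   # alarms on 0.5% of nonevents

hits = hit_rate * true_events                                   # ~1.6 per month
false_alarms = false_alarm_rate * (samples_per_month - true_events)
precision = hits / (hits + false_alarms)   # fraction of alarms that are real
print(f"{false_alarms:.0f} false alarms vs {hits:.1f} hits; "
      f"precision = {precision:.2e}")
```

[Even a sub-percent false alarm rate swamps the handful of hits - which is why such an alarm can become worthless.]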
This shows that sometimes such
a prediction scheme might be worthless,
because we also have to consider
the costs of the false alarms -
and the cost of a false alarm
can be quite different
from application to application.
But, for instance,
if we have a fire alarm in our office -
here in the office building -
and if the fire bell is ringing too often,
nobody takes it seriously anymore.
Therefore, the fire alarm is worthless
if it has too many false alarms.
(If it were really burning every day,
of course, we would still obey it.)
So, this also is a remark on how
one can tune statistical results
so that they look very favorable,
although the result in reality
is not as good as one has presented.
So, this was a short outline
of how to predict extreme events
in a probabilistic way,
making use of embedding techniques.
[Bradley:] Great.
So, I really appreciate
you telling us about this.
And, I wanted to remind the class
that Professor Kantz is the author
of one of the textbooks for this course -
what I consider the Bible
of nonlinear time series analysis.
So, you all are already familiar
with his words -
and now you're familiar with his face.
So, thank you very much, Holger.
[Kantz:] Thank you.
[Bradley:] Okay...