1
00:00:03,000 --> 00:00:07,000
The first question asks which paper originally proposed delay-coordinate embedding
2
00:00:07,000 --> 00:00:12,000
And it was the Packard et al. paper entitled "Geometry from a Time Series"
3
00:00:12,000 --> 00:00:15,000
Technically, this paper didn't formally prove anything about delay-coordinate embedding
4
00:00:15,000 --> 00:00:21,000
It actually just mentioned delay-coordinate embedding as an aside, as something people might want to look into that might be useful
5
00:00:21,000 --> 00:00:23,000
But they proposed a slightly different method
6
00:00:23,000 --> 00:00:29,000
However, this was the first paper to bring up that the delay-coordinate embedding process could be useful in reconstructing an attractor
7
00:00:29,000 --> 00:00:35,000
The Sauer et al. 1991 paper entitled "Embedology" really extends and improves on Takens' theorem
8
00:00:35,000 --> 00:00:42,000
This is the paper that relaxed the embedding restriction from m greater than two times the dimension of the underlying dynamics to m greater than two times the capacity dimension
9
00:00:42,000 --> 00:00:47,000
It also provided several other theoretical proofs and discussions that were necessary
10
00:00:47,000 --> 00:00:54,000
The Kennel et al. paper entitled "Determining embedding dimension for phase-space reconstruction using a geometrical construction" is the original false-nearest-neighbor paper
11
00:00:54,000 --> 00:01:01,000
And Kantz and Schreiber's 1997 book entitled "Nonlinear Time Series Analysis" is a very general textbook on nonlinear time-series analysis
12
00:01:01,000 --> 00:01:10,000
Question 2 states, "The delay-coordinate embedding machinery only works, in both theory and practice, if you have an infinite amount of noise-free data," and fortunately, this is false
13
00:01:10,000 --> 00:01:17,000
If this were true, as we can never have an infinite amount of data, let alone noise-free data, we could never use delay-coordinate embedding in practice
14
00:01:17,000 --> 00:01:27,000
Question 3 states that the delay-coordinate embedding machinery only works, in theory, if you have an infinite amount of noise-free data, but can still be useful in practice even if those conditions are not met
15
00:01:27,000 --> 00:01:29,000
And this is thankfully true
16
00:01:29,000 --> 00:01:32,000
If this were not true, we could not use delay-coordinate embedding in practice
17
00:01:32,000 --> 00:01:42,000
Luckily, even though this requirement of an infinite amount of noise-free data is never met in practice, delay-coordinate embedding is still a very useful piece of machinery in the field of nonlinear time-series analysis
18
00:01:42,000 --> 00:01:46,000
For Problem 4, we're assuming that our time series has only a thousand data points
19
00:01:46,000 --> 00:01:53,000
Part a states, "It's fine to embed this time series in as many dimensions as the false-nearest-neighbor method suggests are appropriate," and this is false
20
00:01:53,000 --> 00:01:57,000
The rule of thumb I generally use is that you should have at least ten to the m data points to embed in m dimensions
21
00:01:57,000 --> 00:02:01,000
So in this case, you really shouldn't choose an m any higher than two or three
22
00:02:01,000 --> 00:02:08,000
If you chose to embed in four or five dimensions, for example, you would need ten- or a hundred-thousand data points, respectively
23
00:02:08,000 --> 00:02:11,000
This doesn't mean, however, that the false-nearest-neighbor method will never work for this time series
24
00:02:11,000 --> 00:02:18,000
For example, the time series may be just fine to embed in two or three dimensions, and in this case, this method would work just fine
25
00:02:18,000 --> 00:02:19,000
So b is false
26
00:02:19,000 --> 00:02:24,000
And part c states, "You really should not embed it in more than two or maybe three dimensions," and this is true
27
00:02:24,000 --> 00:02:29,000
Since 1,000 is ten cubed, I would feel just fine embedding this in two or three dimensions
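The arithmetic behind this rule of thumb can be sketched in a few lines of Python. This is only an illustration of the heuristic described here (roughly ten to the m points for m dimensions); the function names `max_embedding_dimension` and `points_needed` are hypothetical, and the base of ten is a rule of thumb, not a theorem.

```python
def points_needed(m: int, base: int = 10) -> int:
    """Data points suggested by the heuristic for an m-dimensional embedding."""
    return base ** m


def max_embedding_dimension(n_points: int, base: int = 10) -> int:
    """Largest m with base**m <= n_points, using integer arithmetic only."""
    if n_points < 1:
        raise ValueError("n_points must be positive")
    m = 0
    # Keep raising m while the next dimension is still covered by the data.
    while base ** (m + 1) <= n_points:
        m += 1
    return m


# The thousand-point series from Problem 4.
print(max_embedding_dimension(1000))
# Points the heuristic asks for in four and five dimensions.
print(points_needed(4), points_needed(5))
```

For the thousand-point series the heuristic caps m at three, and four or five dimensions would call for ten thousand or a hundred thousand points.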
28
00:02:29,000 --> 00:02:38,000
For Question 5, repeating the same nonlinear time-series analysis procedures separately on chunks of your data set (for example, the first half and then the second half) can tell you one of two things
29
00:02:38,000 --> 00:02:42,000
It can tell you that you have inadequate data, or it can tell you that you have a nonstationary system
30
00:02:42,000 --> 00:02:44,000
But it can't really tell you which one of these is occurring
31
00:02:44,000 --> 00:02:54,000
For example, if you did an analysis on a full time series, and then you did an analysis on the first and second half, and you got the same answer, then you can probably assume that you had a stationary time series and you had an adequate amount of data
32
00:02:54,000 --> 00:03:04,000
However, if you got a different answer on the first and second half as compared to the full time series, you couldn't tell if the time series was nonstationary, and a transition had occurred between the first and second half
33
00:03:04,000 --> 00:03:07,000
Or if you simply had inadequate data once you went down to half the time series
34
00:03:07,000 --> 00:03:13,000
Repeating the same nonlinear time-series analysis procedures on half the time series or small chunks of the time series is a really great idea
35
00:03:13,000 --> 00:03:18,000
But if you get different answers, you need to be very careful about how you interpret these
36
00:03:18,000 --> 00:03:21,000
It could just be inadequate data, but it could also be nonstationarity in the system
37
00:03:21,000 --> 00:03:24,000
In either case, more exploration needs to occur
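The split-and-compare check described for Question 5 can be sketched as follows. This is a minimal sketch under stated assumptions: `delay_embed`, `compare_halves`, and `mean_step` are hypothetical names, the logistic map is only a convenient test signal, and `mean_step` is a placeholder statistic standing in for whatever analysis you actually run.

```python
import numpy as np


def delay_embed(x, m, tau):
    """Return the m-dimensional delay-coordinate vectors of x with delay tau."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau
    if n <= 0:
        raise ValueError("time series too short for this m and tau")
    # Column i holds x shifted forward by i * tau samples.
    return np.column_stack([x[i * tau : i * tau + n] for i in range(m)])


def compare_halves(x, analysis):
    """Run the same analysis on the full series and on each half."""
    x = np.asarray(x, dtype=float)
    half = len(x) // 2
    return {
        "full": analysis(x),
        "first_half": analysis(x[:half]),
        "second_half": analysis(x[half:]),
    }


def mean_step(x):
    """Placeholder statistic: mean distance between consecutive embedded points."""
    pts = delay_embed(x, m=2, tau=1)
    return float(np.mean(np.linalg.norm(np.diff(pts, axis=0), axis=1)))


# Test signal: 1,000 points of the logistic map (an assumption for illustration).
series = np.empty(1000)
series[0] = 0.4
for k in range(1, len(series)):
    series[k] = 3.9 * series[k - 1] * (1.0 - series[k - 1])

print(compare_halves(series, mean_step))
```

If the three values disagree, the code cannot say whether the cause is nonstationarity or too little data in each half; that still needs further exploration.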