Saturday, October 22, 2011

Inference of dynamics for complex systems part 2

Phase space portraits

As we left our last installment we had the problem of a series of observations from some interesting system, and we were seeking a means of understanding it. First of all, however, we had some doubts as to whether the measurements we have made will tell us anything about the system, or whether there will be other information needed in order to make any useful inferences.


Approaches to studying dynamic systems include both qualitative studies of the general trends of a system and quantitative studies in which invariant properties of the system are evaluated [Abarbanel, 1996]. System dynamics are evaluated by reconstructing the system’s phase space, which is a geometrical representation of the system projected in a “space” created of different variables [Packard et al., 1980; Abarbanel, 1996]. The climate system can be described by a phase space with coordinates x1, x2, x3, . . . xn, and the functions x1(t), x2(t), x3(t), . . ., xn(t) (the outputs of the system). As time (t) varies, the sequential plot of points of coordinate {x1(t), x2(t), x3(t), . . ., xn(t)} describes the time evolution of the system in phase space.

The number of output functions (n) is called the embedding dimension [Sauer et al., 1991]. The evolution of the system is marked by the trajectory traced out by sequential plots of individual states with coordinates defined by the values of the n functions at each observed time. Describing the trajectory of the system as it flows through phase space is a qualitative means of characterizing the dynamics of the system. The system may also be characterized quantitatively in terms of its invariant properties, such as the Lyapunov exponents and the correlation dimension of the system, which can be calculated from the phase space portrait [Abarbanel, 1996].

Phase space from multiple time series

How do we select the coordinates? One method is to create a phase space by plotting scatterplots of several different records which have been sampled at the same time intervals. For instance, Saltzman and Verbitsky [1994] created a phase space using, as variables, ice mass, ocean temperature, and atmospheric CO2. The state of the system is defined by its location in phase space at a particular time. The plot of successive states through time traces out the trajectory of the system. Traditionally the trajectory is constructed by drawing a curved line, rather than straight line segments through the states in sequence.


The drawback with the Saltzman and Verbitsky approach in paleoclimate is that is difficult to find many records that have been sampled at the same intervals. You are restricted to the portion of the geologic record covered by the shortest record. Additionally, there are errors in both magnitude and time.

Let's not worry about interpretation yet. Today is only about basic methodologies.

Economic systems can quite profitably be studied using this approach, mainly because there are so many of them, the errors tend to be small (except see here), and the timing is usually well constrained as well. So we can compare US unemployment rate to interest rates, for instance.


Data from BLS site.

Commonly we might look at observations like the one above, and not draw the trajectory (the curve that runs sequentially through the data). Instead, a traditional approach might have been to draw a line of best fit in the hopes of defining a correlation. In looking at the above figure, we see two clusters of observations. Past experience tells us it is risky to define a line of best fit using the traditional methods in this way, as the result is heavily weighted by the line between the centres of each cluster.

Similarly we can look at the average duration of unemployment vs unemployment rate.


Data from BLS site.

Or unemployment rate (vertical axis) vs monetary measures.


Data from BLS and St. Louis Fed site.

Or house prices vs real interest rates.


Data from Shiller [2005].

Defining a phase space from multiple variables requires multiple records. The state space can only be characterized over the duration of the shortest record. Dating errors will lead to various forms of distortion in the projected phase space. The economic time series tend to lend themselves well to this form of projection, because many of them exist to any arbitrary level of precision. If you choose month-end or year-end prices, there are normally no dating errors.

Phase space from a single time series

It is pretty uncommon to have more than one geological time series of sufficient length with good dating control. So geologists will normally have to work with a single time series. The method below can similarly be used in other types of time series as well.

When you have one time series, you may wonder how much dynamic information it contains. Fortunately, ergodic theory suggests that dynamic information about the entire system is contained in each time series output from the system [Abarbanel, 1996]. Therefore, a phase space portrait reflecting the dynamics of the entire system may be reconstructed from a single time series.

Time-derivative method

Packard et al. [1980] propose a method in which the function is plotted along one axis, and its various time derivatives are plotted on the other axes. If we use the simplest two-dimensional case, the graph would consist of a scatterplot of the function against its first time derivative. (i.e. y vs. dy/dt). An example of such a plot appears on the masthead of the blog.


In the above figure, we see the ice volume proxy plotted on the horizontal axis (ice volume increases towards the right) plotted against its first derivative over an interval of time lasting about 120,000 years. The numbers on the graph represent the time in thousands of years before present (ka BP). The rate of change of ice volume is plotted with +ve on top, so that as global ice volume grows (near A, for example), the system will move towards the right through phase space.

Any equilibria in this type of figure must necessarily occur along the zero rate of change axis.

Note the error bars presented on some of the states. Similar error bars would be found at all other states in the figure as well. The error in estimating the rate of change is a consequence of the error in measurement being similar in size to the difference between successive measurements. The size of the error bars is large compared to the variability of some parts of the trajectory--consequently our confidence in this trajectory is not as great as it otherwise might be.

Time-delay method

We reduce these errors by reconstructing the phase space by the time delay method [Packard et al., 1980], in which the elements of a time series are plotted against n-1 lagged observations from the same series (figure 2B). Identifying the lags and the embedding dimension (n) are key decisions in the reconstruction. To simplify things in the following discussion we shall only use two dimensions. Thus we reconstruct our phase space portrait by a scatterplot of the data against a lagged copy of itself. The optimum lag is defined by the first minimum of the average mutual information function [Fraser and Swinney, 1986]; however for quasiperiodic data we find that this tends to be the first minimum of the autocorrelation function (about ¼ of the period of the dominant waveform).

Thus for ice volume:


Here we are looking at a two-dimensional phase space reconstructed from ice volume proxy data covering about 200,000 years. In this projection, lower glacial ice volume is at the lower left corner of the plot, with greater ice volume towards the upper right corner. We'll interpret these later. Moving on


Case-Shiller index



Official unemployment rate


Detour Gold Corp.


CNTY busted trades (1 s of trading activity each figure)


Gold-silver ratio in phase space

Dynamic systems, like climate, have historically been analyzed using power spectral methods, such as the Fourier transform and wavelet analysis [Hays et al., 1976; Imbrie et al., 1992]. This has been a reflection of the predominantly linear assumptions underlying early analytical methods.

The power spectrum is not an invariant property of a nonlinear time series [Abarbanel, 1996], meaning that significant changes may appear in the power spectrum despite the lack of changes in the dynamics of the system. Therefore, changes in power spectrum are insufficient evidence to infer changes in dynamics.

In our next installment we'll talk a bit about equilibrium and what any of the above plots have to say about it.

References

Abarbanel, H. D. I. (1996), Analysis of Observed Chaotic Data, Springer-Verlag, New York.

Fraser, A. M., and H. L. Swinney (1986), Independent coordinates for strange attractors from mutual information, Phys. Rev. A, 33, 1134-1140.

Hays, J. D., J. Imbrie, and N. J. Shackleton (1976), Variations in the Earth’s orbit: Pacemaker of the ice ages. Science, 194, 1121-1132.

Imbrie, J., et al. (1992), On the structure and origin of major glaciation cycles, 1, Linear responses to Milankovitch forcing, Paleoceanography, 7, 701-738, 1992.

Packard, N. H., J. P. Crutchfield, J. D. Farmer and R. S. Shaw (1980), Geometry from a time series, Phys. Rev. Lett., 45, 712-716.

Saltzman, B., and M. Verbitsky (1994), Late Pleistocene climatic trajectory in the phase space of global ice, ocean state, and CO2: observations and theory, Paleoceanography, 9, 767-779.

Sauer, T., J. A. Yorke and M. Casdagli (1991), Embedology. Journal of Statistical Physics, 65, 579-616.

Shiller, R. J. (2005), Irrational Exuberance, 2nd ed., Princeton University Press.

No comments:

Post a Comment