Dust flux, Vostok ice core

Two dimensional phase space reconstruction of dust flux from the Vostok core over the period 186-4 ka using the time derivative method. Dust flux on the x-axis, rate of change is on the y-axis. From Gipp (2001).
Showing posts with label Century Casinos. Show all posts

Friday, October 4, 2013

One more time--the distinction between human- and algo-trading

The markets do not act like they once did. The trading in certain stocks is operating on time-scales so small that it cannot be a response to human thought. Not only are certain individuals able to access key information before others and so respond to news releases faster than the speed of light, but certain entities have free range to post and cancel orders on a microsecond basis, and queue-jump by shaving off (or adding on) tiny fractions of a penny from their orders.

Stocks traded by humans tend to make significant moves on a timescale of minutes to days. Even when there is a news event that radically changes the apparent value of a company, if there are only humans in the market, the move takes time to occur. Below are a couple of charts for Detour Gold (I currently have no position in this stock).


Normally, when looked at on a ms timescale, the graph is not really distinguishable from a straight line.


The little squares occur because all the price-changes I saw in the course of the day were a penny. On this scale it scarcely matters which axis is the current price and which is the lagged-price.
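For anyone who wants to reproduce this kind of plot, here is a minimal sketch in Python of how the points are formed. The tick values are hypothetical (prices in integer cents, not the Detour Gold data), and `lag_pairs` is a name I have made up for illustration.

```python
def lag_pairs(prices, lag=1):
    """Return (lagged price, current price) pairs for a 2-D time-delay plot."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [(prices[i - lag], prices[i]) for i in range(lag, len(prices))]

# Hypothetical tick prices in cents; every change is zero or one cent.
ticks = [1000, 1001, 1001, 1000, 1001, 1002]
pairs = lag_pairs(ticks, lag=1)

# Swapping the axes only reflects the plot about the diagonal.
swapped = [(current, lagged) for lagged, current in pairs]
```

When every price change is at most a penny, each point sits on or next to the diagonal of a one-cent grid, which is why the trajectory traces little squares and why the choice of axis hardly matters.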

Once the algos get involved, the millisecond phase space plots get a lot more interesting. Some of them are works of art! Below, some plots for Century Casinos (I have no position in this one, either). Data here.



Algos playing tug-o-war.

Nice to look at, but maybe not so nice to trade against.

Remember the adage about playing poker: If you don't know who the sucker is . . .

Friday, December 23, 2011

Innovation in complex systems

Innovation has been on my mind a lot lately. Unfortunately, not the kind that results in iPhones and the like.

We normally think of innovation as a good thing. But not all innovations are good ones. As counterexamples, let's consider recent political innovations in the US that allow indefinite detention without trial of anyone accused of terror-related activities; or the use of Predator drones to target American citizens.

My interest has been innovation in the Earth system--particularly in the behaviour of the climate system over the past two million years. The problem with recognizing innovation is that we tend to interpret any activities in light of what we already know--consequently it is difficult to discover anything new. Our first tendency would be to explain our new observations as a special case of what we already know. We resist the idea that something new is occurring.

The Earth system is driven by a few global parameters which interact with myriads of local agents; yet, contrary to expectations, the system does not dissolve into noise, and highly ordered global-scale structure arises. We may call such structures emergent properties, and the means by which they arise is termed emergence.

The problem of how these global structures arise from multitudes of interacting local agents is, shall we say, a non-trivial problem. They are in no way predictable from our knowledge of the local interactions; nevertheless we agree that emergence is in accordance with physical laws.

In earth systems, such emergent properties include plate tectonics, glaciations, superplume events, and some mass extinction events.

The emergent properties of a system may change. These changes may or may not be related to specific change(s) on the local level. For the purpose of this essay, I am referring to such changes as innovation.

Possible examples of innovation in Earth systems include the (somewhat controversial) proposed change in mode of tectonics in Archaean time; (very controversial) Neoproterozoic glaciation (i.e., "snowball Earth"); and magnetic pole reversals.

I have been considering change in operation of the climate system during the Mid-Pleistocene (from about 1 million years ago to about 500 thousand years ago).

I present the following probability density plots of the 2-d phase space reconstructions of the ice volume proxy, produced using the time delay method with a delay of 6 thousand years. Each of the figures below is calculated from 150 thousand years of data.
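As a rough sketch of the procedure (not the code used for the figures), the time delay reconstruction and the probability density can be written in a few lines of Python. The series here is a synthetic sine wave standing in for the proxy, the 1 ka sample spacing is an assumption, and `delay_embed` and `density_2d` are hypothetical helper names.

```python
import math
from collections import Counter

def delay_embed(series, lag):
    """Pair each value x(t) with the earlier value x(t - lag)."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [(series[i], series[i - lag]) for i in range(lag, len(series))]

def density_2d(points, lo, hi, bins):
    """Fraction of points in each cell of a bins-by-bins grid spanning [lo, hi]."""
    width = (hi - lo) / bins
    counts = Counter()
    for x, y in points:
        # Clamp so that values on or beyond the edges fall in the outer cells.
        ix = min(max(int((x - lo) / width), 0), bins - 1)
        iy = min(max(int((y - lo) / width), 0), bins - 1)
        counts[(ix, iy)] += 1
    return {cell: n / len(points) for cell, n in counts.items()}

# Synthetic placeholder, NOT the ice volume proxy: 150 samples at an assumed
# 1 ka spacing, so a lag of 6 samples corresponds to 6 ka.
proxy = [math.sin(2 * math.pi * t / 41) for t in range(150)]
points = delay_embed(proxy, lag=6)
density = density_2d(points, lo=-1.0, hi=1.0, bins=20)
```

Cells with a high fraction of points correspond to regions of phase space where the system lingers; a smooth closed loop of roughly uniform density is what a limit cycle would look like in this kind of plot.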

Starting from the Early Pleistocene . . .



Limit cycles (green dashed ellipses) are common in the Early Pleistocene, less so later.

Areas of Lyapunov stability, labelled A1 and A2, represent relatively ice-free conditions. Current global ice volume is comparable to A2, and A1 represents even less ice than at present. Limit cycles in the Early Pleistocene (representing slow, steady growth and decay of ice sheets) start from either the A1 or A2 condition.





The Late Pleistocene is characterized by discrete areas of high probability, suggesting rapid transitions between longer periods of stability. A2 represents an interglacial condition, and A3 to A6 represent separate metastable ice configurations of successively greater volume. A6 represents a glacial maximum condition, as we experienced about 18,000 years ago.

Climate dynamics as inferred from global ice volume seems to have changed during the Pleistocene epoch. Was it innovation?

Opinions about what happened during the Mid-Pleistocene include changes in atmospheric CO2 leading to greater glaciations, cumulative cooling in the deep ocean changing the nature of the glacial-interglacial transition, erosive uncovering of crystalline bedrock leading to greater thickness of ice sheets, and spontaneous (chaotic) change. There is general agreement that there is no obvious external forcing or any fundamental change in the low-level dynamics leading to the change in climate behaviour, so it is at least possible to argue that the climate system began to act in an "innovative" fashion (provided we state that we do not view this innovation as having been directed in any way).

Let's look at another system instead--one represented by the share price of Century Casinos.


The chart of the daily closing price looks a little like my portfolio--up to a high in April, and all downhill from there.

The two-dimensional reconstructed phase space doesn't look much different from those of other stocks I've looked at in the past.


Actually, this has been smoothed a little, using a 3-point moving average.
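For reference, a 3-point moving average can be sketched as follows. Whether the smoothing was centred or trailing is not stated here, so this version simply returns one value per full window, and the sample prices are made up.

```python
def moving_average(values, window=3):
    """Average each run of `window` consecutive values (full windows only)."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# Hypothetical closing prices, not the CNTY data.
smoothed = moving_average([3, 6, 9, 12], window=3)
```

The smoothed series is shorter than the input by `window - 1` points, which matters when lining it up against dates.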

There appears to be nothing interesting in the share price activity over the past year--unless we look at daily high prices instead of closing prices.


And here we see something unexpected--a singular spike in share price on June 21, where the share price bounced between about $3 and $8 several times over the day, on first a one-minute timescale, and around mid-day at a one-second timescale.

To investigate dynamics on this timescale, we have to construct our time-delay phase space with a small lag.


In two seconds of trading we have numerous fluctuations between $3 and $7. Lots of money to be made here! (or there would have been had the exchanges not cancelled all the trades).

A few minutes later we get this over one second.


This is orders of magnitude different from what we see in the annual behaviour of the stock, and even considerably different from the bowl of spaghetti above. This figure actually represents a phase space portrait of a random walk. Yes, you can trade randomly if you are quick enough.

So what is the difference between the trading in CNTY on June 21 and every other day this year? Another innovation--high-frequency trading, but in a form which creates the illusion of liquidity by placing lots of orders and then cancelling them as they begin to be filled. The resulting moves in a stock can be dramatic.

Suppose an institutional investor needs to buy a million shares of CNTY (perhaps part of some proprietary arbitrage position). The buyer looks at the depth chart and sees that there are a million shares being offered at $3, so the buyer attempts to fill the order--only to discover that he gets perhaps a thousand shares, the rest of the offer is cancelled, and there are now a million shares offered at $3.05. The tug-of-war may continue, but if the buyer is motivated, the share price may rise considerably in a remarkably short period of time.

Remember that the original intent of having bid and ask prices was that the various offerings were actually meant to be traded. The idea that these offerings would be used only as bait and not represent real liquidity is indeed innovative, but unhelpful.

Unlike the change in climate dynamics in the mid-Pleistocene, the change in dynamics in share price of CNTY is symptomatic of a fundamental change in the operation of the market, and this change is detrimental to the majority of its participants.

Wednesday, October 19, 2011

Inference of dynamics for complex systems, part 1

Today I will start over with the analysis of dynamic systems, describing a methodology and some of the rationale behind the interpretations from previous postings, as it occurs to me that all of this stuff, though discussed before, is buried in the archives and is not easy to pull together.

This will also be good for me as I have to put together some kind of paper on the topic for one or more conferences in the first half of next year. GAC, in St. John's next year, will be a given as it is my old alma mater, but I am giving thought to presenting at the upcoming 3rd Multiconference on Complexity, Informatics and Cybernetics.

You are studying an interesting system, with many components. You know that many of the components interact, but you don't know the details of their interaction. If the interactions vary with changing conditions within the system (feedback), it may be described as a complex adaptive system. Examples of such systems include, but are not limited to, ecosystems and other biological systems, the stock market and other economic systems, the climate system, and, some would argue, the entire earth system.

The behaviour of such systems is typically nonlinear, and typically characterized by self-organization and emergent phenomena. The presence of negative feedback gives the system a form of resilience, allowing it to resist perturbations; and the presence of positive feedback causes the system to experience episodes of rapid change, usually resulting in a shift from one equilibrium condition to another. Multistability (the presence of more than one equilibrium condition) is a common feature of such systems.
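A toy illustration of multistability, which is my own choice of example and not a model of any of the systems discussed here, is the one-variable equation dx/dt = x - x^3. Near x = 0 the linear term pushes the state away (positive feedback); near x = +1 and x = -1 the cubic term pulls it back (negative feedback). A minimal Python sketch, using forward Euler integration with an assumed step size:

```python
def settle(x0, dt=0.01, steps=5000):
    """Integrate dx/dt = x - x**3 with forward Euler and return the final state."""
    x = x0
    for _ in range(steps):
        x += dt * (x - x ** 3)
    return x

# Two nearly identical starting points on either side of the unstable
# equilibrium at zero end up in different stable equilibria.
upper = settle(0.1)
lower = settle(-0.1)
```

A small perturbation near either stable state decays, but one large enough to carry the state across zero produces a rapid shift to the other equilibrium.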

The system has input signals, which may be time-dependent; however, it may be that you are only able to observe some of these signals, and there may be input signals of which you are unaware. There are output signals, which you observe and compile into one or more time series; however, there is no way to know if your output signal is important in terms of developing a global understanding of the system of interest.

There are conditions within the system which influence the manner in which the input signals feed through to the output signals. You may have an inkling of some of these rules (commonly expressed as differential equations) but normally your understanding of these rules is incomplete. You hope to understand your system by deducing these equations on the basis of your observations.

Here are some examples of systems we may wish to study.











Daily closing prices for Detour Gold Corp. (DGC-T), from late November 2009 to October 2011.


Gold-silver ratio.


Case-Shiller index. Data from Robert Shiller data page.



Unemployment rate (from US BLS site).


Trading activity in Century Casinos, June 21, 2011. From Nanex.


Paleoclimate proxy records over the past two million years. Magnetic susceptibility of loess (proxy for Himalayan monsoon strength) at top. Deep water 18-O record (proxy for global glacial ice volume) at bottom.

At first glance, the problem seems insurmountable. How do you study a system when you can't even be sure that your observations are meaningful? What if you have failed to observe the most important observable parameters?

It is especially bad for the geological time series, for in addition to the above problem, there are both errors in measurement and errors in the date (or time) of each observation.

In future installments, we will work through the data sets shown above; but we will start with some thoughts on equilibrium.

Friday, June 24, 2011

Deconstructing algos, part 1

The third part of the series on information theoretic methods of analysis for dynamic systems is taking longer than anticipated. Crunching the numbers is killing me. So I'll take a break from it and look a little farther forward--how we can use the methods I have been describing so far to forensically examine the algorithms used in various high-frequency trading events of the recent past.

As seen on Nanex and Zero Hedge, there has recently been a lot of strange, algorithmically driven behaviour in the pricing of natural gas and individual stock prices on very short time frames. In an earlier article I pointed out that the apparent simple chaos we observe in the natural gas price appeared to be an emergent property of at least two duelling algorithms.

In this series of articles we will begin analysis of the algorithms involved. Today's discussion will mostly focus on framing the issues that must be addressed in order to study unknown algorithms on the basis of their time-varying outputs. Future articles will present results from the various analyses.

We begin by looking at the activity in the natural gas price on June 8, 2011:


Let us also consider the pricing action in CNTY on June 21, 2011:


In both of these examples (many more such examples exist) there are three time series of interest to us--the bid price, the ask price, and the prices of trades. Additional information which may also be of use includes volume, size of bids, size of asks, and so on. In principle, both the bid and ask prices form continuous series which are prone to instantaneous changes. The actual trades form a discontinuous time series with observations at irregular intervals.

We don't have access to the code involved in these algorithms--nevertheless, we can learn something about the computational processes involved, within certain limitations. Unfortunately, just as is the case in studying time series recorded in rocks, we have to make some assumptions, and the validity of our assumptions goes a long way towards predicting the success of our endeavours.

Our first assumption is that the bid price and the ask price are being set by competing interests. This assumption is extremely important. It is possible that the bid and the ask are both being set by a single entity, or by two closely related entities who are using them to manipulate the natural gas price. We will go through in some detail, below, the reasoning behind our assumption that there are competing interests involved.

Secondly, we are approaching this problem assuming that prices are set and changed discontinuously in time rather than continuously in time. Subtleties of this assumption are discussed in the introduction of Bosi and Ragot (2010).

The methodologies we will explore are as follows:

Cross-correlation of the bid and ask series over selected windows. We choose limited time intervals rather than the entire record because we expect that each series will sometimes lead and sometimes follow. Peaks here will show whether one of the series leads or trails the other consistently or whether each one leads intermittently, which would support the idea that these are distinct dueling algorithms. It seems likely that the bid price will lead as both are declining, and the ask will lead as both are climbing. We should test this hypothesis.

One goal of this analysis will be to see if we can detect trigger points, where one stops following and begins leading. We will locate the times and see if the trigger can be identified, which is only likely if the trigger is some change in either price series, the price of a trade, or the volume of a trade. Unfortunately, many other triggers are possible, and it may not be possible to identify them if they are, for instance, a random number generator seeded by, say, the thousandths-of-a-second digit at the instant of some distant event like the first pitch of a Yankees game or when the secretary in the front office misspells 'the'.
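The windowed cross-correlation described above might be sketched as follows. This is a minimal Python version under assumptions of my own: the bid and ask series have already been resampled onto a common, evenly spaced clock and have equal length, the windows do not overlap, and the function names are hypothetical. A positive lag means the bid leads the ask.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def best_lag(bid, ask, max_lag):
    """Lag k maximizing corr(bid[t], ask[t + k]); k > 0 means the bid leads."""
    best_k, best_r = 0, -2.0
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            x, y = bid[:len(bid) - k], ask[k:]
        else:
            x, y = bid[-k:], ask[:len(ask) + k]
        r = pearson(x, y)
        if r > best_r:
            best_k, best_r = k, r
    return best_k, best_r

def windowed_lags(bid, ask, window, max_lag):
    """Best lag in each consecutive, non-overlapping window: (start, lag, r)."""
    results = []
    for start in range(0, len(bid) - window + 1, window):
        k, r = best_lag(bid[start:start + window],
                        ask[start:start + window], max_lag)
        results.append((start, k, r))
    return results

# Synthetic check: the ask copies the bid two samples later, one unit higher.
bid = [5, 3, 8, 1, 9, 2, 7, 4, 6, 0, 5, 8, 2, 9, 1, 3]
ask = [b + 1 for b in [bid[0], bid[0]] + bid[:-2]]
lag, r = best_lag(bid, ask, max_lag=4)
```

On real data, a window in which the sign of the best lag flips relative to its neighbours would be a candidate trigger point worth inspecting by eye.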

Phase space reconstruction--the relevant time series (bid prices, ask prices, trade prices) each represent one-dimensional data sets. If the algorithms used can be visualized in higher-dimensional phase space, we may be able to reconstruct the overall architecture.

The advantage of this approach is that in principle the dynamics of the system will be contained no matter which output of the model we use. We only have measurements of the bid price, but have no idea what other outputs are generated by the same algorithm, even if these unknown outputs are critical to the decision-making module of the algo. The reconstructed phase space

The difficulties here are that 1) the function may change from leader to follower so quickly that the resulting trajectory through phase space is too short to interpret; 2) there may be multiple players on both the bid and ask, meaning the reconstructed trajectory through phase space is an amalgamation of two or more different functions, the instant of joining of which may be impossible to determine; and 3) it may prove impossible to properly define windows for the data, again creating an amalgamation in phase space of two or more different functions.

Epsilon machine reconstruction--We will need to try to identify the actual "work" done by these programs. How do they decide on a price? How do they "decide" to drop or raise their offer? Do they change? How are we to recognize when an algorithm changes its behaviour when all we have to deal with is the output? Can we recognize when the structure of the computation involved in the decision-making part of the algorithm changes, given our extremely limited knowledge of that structure?

These questions may be addressed using the ε-machine reconstruction approach suggested by Crutchfield (1994). The objective of this approach is to use an open-ended modeling scheme to describe the computational structure objectively, so that different practitioners working on the same data will come up with similar (hopefully identical) constructs. By encouraging a hierarchical architecture of undefined complexity, the method allows investigators to identify changes in behaviour of the system.

This particular approach is built around discrete computation, so is amenable to data which are discrete rather than continuous in time. We assume that the discrete outputs (the time series, or stream of values) are the result of a computational process which is knowable. The data have to be organized, and (this is the key) repeated states are identified. It is possible that these states will be identified from the reconstructed phase space portraits above; alternatively they may be defined by particular observations. These states may be identified as key strings of data, or may be recognized in complex functions by reconstructing the state space in a higher dimension. The ordering of the states is significant, as the state that appears first before another particular state is referred to as the predictive state, and the following state is the successor state.

The ε-machine is constructed by identifying all the predictive and successor states and calculating the probabilities of all of their observed relationships. If more than one ε-machine is inferred, the sequence of these first-order ε-machines can be used to build a higher-order ε-machine. Given sufficient data, you may construct ε-machines of arbitrary order.
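The bookkeeping step, counting how often each predictive state is followed by each successor state, can be sketched in Python. To be clear, this is only a first-order transition table, far simpler than the full reconstruction in Crutchfield (1994); the state labels ("U" for an up-tick, "D" for a down-tick) are a hypothetical symbolization chosen for illustration.

```python
from collections import Counter, defaultdict

def transition_probabilities(states):
    """Estimate P(successor | predictive state) from an ordered state sequence."""
    counts = defaultdict(Counter)
    for predictive, successor in zip(states, states[1:]):
        counts[predictive][successor] += 1
    return {
        state: {nxt: n / sum(followers.values())
                for nxt, n in followers.items()}
        for state, followers in counts.items()
    }

# Hypothetical symbol stream, not derived from the nat gas or CNTY data.
sequence = list("UUDUDDUU")
probs = transition_probabilities(sequence)
```

Here `probs["U"]` gives the observed odds of what follows an up-tick. A change in these tables from one window of data to the next would be one crude sign that the underlying computation has changed.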


Information theory--as seen in recent articles, information theory may be used to characterize the complexity of the ε-machine reconstruction and the probability density. The yet-to-be-completed third part of that series concerns methods of using information theory to find the optimum window length for creating a probability density plot of the reconstructed phase space. The subsequent parts of this series will concern themselves with the analyses described above on the nat gas and CNTY algos, as well as others as they are found.
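One basic quantity that such an analysis can start from is the Shannon entropy of the probability density, computed over the occupied cells of the phase space grid. The sketch below, in Python, shows only that calculation; how it would be used to select a window length is left to the later article, and the example probabilities are made up.

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution (zero cells skipped)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Four equally likely cells carry two bits; a single occupied cell carries none.
spread_out = shannon_entropy([0.25, 0.25, 0.25, 0.25])
concentrated = shannon_entropy([1.0])
```

A density concentrated in a few cells has low entropy, while one smeared evenly across the grid has high entropy.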

Given the limitations of time and computing resources, I can't guarantee a timeline. I regret that my speed of analysis is six or seven orders of magnitude slower than the incidents in real time.