Thursday, June 30, 2011

Deconstructing algos, part 2: Leveraging chaos into high-frequency arbitrage opportunities

Nanex's recent, elegant explanation of what the HFT algos are doing is probably better than the analysis that follows, because it answers the all-important question--why? The analysis below looks a little at how, but most of our interpretation of the results is coloured by the Nanex explanation. It explains why so many trades happen outside the bid-ask spread, particularly when the bid and ask prices are moving rapidly: the algos are scalping fractions of pennies from some poor fool whose data is more than a few ms old.

As this is the reason, the method of choosing bid and ask prices pales in significance next to the methodology of stuffing the orders. This methodology I know nothing about and will not address. This article will address how to use this stuffing to create endless opportunities for arbitrage.

The principal advantage discussed in the Nanex report is stuffing the market with so many orders that competitors have trouble seeing the present state of the market. Whenever such an inefficiency is created, an arbitrage opportunity may be created along with it.

One method of creating arbitrage opportunities is through manipulation of time. We are accustomed to thinking of information flow as instantaneous, but it is limited by the speed of light. How might HFT take advantage of this?

Imagine for a moment that transatlantic communications were somehow extremely limited, so that a trader in New York could not see the present price of a stock in Paris, but would only see it after a two-hour delay. Any market participant who could somehow overcome this limitation would find a myriad of arbitrage opportunities.

Now look at the present. Let's suppose International Face Sucker (IFS) starts stuffing 100,000 bids per second into the pipe in New York. Let us say that those bids are x1, x2, x3, . . .

A market participant in California, Hedge Fool LLP (HF), is in the market and starts looking at the stream of bids coming down the pipe.

At 100,000 bids per second, the electronic signal travels only about 3 km between one bid and the next. So by the time HF sees the first bid (x1) and prepares its response (say, y1), IFS is actually sending quote x1500 into the dataverse. Where is the market? What is the current price?
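The arithmetic behind those two figures can be checked in a few lines. This is a rough sketch only: it assumes the signal moves at the vacuum speed of light (real fibre is slower), and the 4,500 km path is not a measured distance but simply 1500 bids × 3 km, the distance implied by the x1500 figure above. The names `km_per_bid` and `bids_in_flight` are mine.

```python
# Back-of-envelope check of the bid spacing and the cross-country lag.
SPEED_OF_LIGHT_KM_S = 299_792.458  # vacuum value; signals in fibre are slower
BIDS_PER_SECOND = 100_000          # stuffing rate from the example
PATH_KM = 4_500                    # ASSUMED path length (1500 bids x 3 km)


def km_per_bid(bids_per_second, speed_km_s=SPEED_OF_LIGHT_KM_S):
    """Distance the signal covers between two consecutive bids."""
    return speed_km_s / bids_per_second


def bids_in_flight(path_km, bids_per_second, speed_km_s=SPEED_OF_LIGHT_KM_S):
    """Number of bids sent while the first one is still travelling."""
    return int(path_km * bids_per_second / speed_km_s)


spacing_km = km_per_bid(BIDS_PER_SECOND)
lag_bids = bids_in_flight(PATH_KM, BIDS_PER_SECOND)
delay_ms = 1000 * PATH_KM / SPEED_OF_LIGHT_KM_S

print(f"{spacing_km:.2f} km between bids")
print(f"{lag_bids} bids in flight, one-way delay {delay_ms:.1f} ms")
```

A slower signal speed or a longer route only makes the lag worse for HF, since more bids are in flight before the first one arrives.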

Now suppose IFS has a branch in California. They have the same algo as IFS New York and are running it, so locally they observe x1 and HF's response y1--but they already know what x1500 is (or is that "is going to be"?), not to mention all of x2 through x1499. Might there be an arbitrage opportunity? Might there be 100,000 such opportunities every second?

A fraction of a penny 100,000 times a second--it isn't long before you're into real money.
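To put a number on "real money", here is a small sketch. Both inputs are my assumptions, not figures from Nanex: a tenth of a cent per trade and a 6.5-hour trading session. It also assumes every one of the 100,000 opportunities per second is actually captured, so treat the result as a ceiling rather than an estimate.

```python
BIDS_PER_SECOND = 100_000
PROFIT_PER_TRADE_USD = 0.001        # ASSUMED: a tenth of a cent per trade
SESSION_SECONDS = int(6.5 * 3600)   # ASSUMED: one 6.5-hour trading session


def gross_take(profit_per_trade, trades_per_second, seconds):
    """Upper bound on the take if every opportunity is captured."""
    return profit_per_trade * trades_per_second * seconds


per_second = gross_take(PROFIT_PER_TRADE_USD, BIDS_PER_SECOND, 1)
per_session = gross_take(PROFIT_PER_TRADE_USD, BIDS_PER_SECOND, SESSION_SECONDS)

print(f"${per_second:,.0f} per second")
print(f"${per_session:,.0f} per session")
```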

Now IFS has branches in London, Paris, Sydney, Tokyo, Shanghai, Moscow--they are all running these arbitrage trades, and who knows--maybe they are also stuffing orders into their local bourses, using an algo known to all the other branches, and arbitraging those as well.

The role of chaos

Not that it matters much, but what sort of algos are they using? I think they are mostly pretty simple.

The algo behind the bizarre spreads seen here is straightforward, but it is hard to see how it profits.



As I've written elsewhere, the nat gas trading algo looked very similar to a simple chaotic function--the first such function ever identified.

 

Natural gas over a brief interval on June 8, 2011. Graph from Nanex.


Nat gas price from above graph plotted against linear time.


Plot of first 5500 values for x using the Lorenz equation with parameters σ = 10, ρ = 24.7, β = 8/3.

You might think that using such a simple, well known function would mean that anyone could tag along for the ride. But you would be mistaken. Chaotic functions have a property called sensitivity to initial conditions, which makes them very useful in this particular application. Even in the unlikely event that some disgruntled former employee steals the software, its use will be extremely limited.

Note that the Lorenz equations have three flow parameters, σ, ρ, and β, and an unlimited number of choices for them produce chaotic behaviour in the overall function. In addition, we may choose any starting location, and we can also vary the time step (basically x2 = x1 + time-step * dx/dt). An arbitrarily small difference in any of these parameters or variables leads to a dramatically different future evolution of the function. For instance, the two plots below (blue and red) are identical in all respects except that σ = 10.01 for blue and σ = 10.00 for red (where only the red appears, the two curves are essentially identical).
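The comparison can be reproduced with a few lines of code. What follows is a minimal sketch, not the code behind the plots: it uses the standard Lorenz system, dx/dt = σ(y − x), dy/dt = x(ρ − z) − y, dz/dt = xy − βz, stepped with the simple update rule described above. The starting point (1, 1, 1) and the time step of 0.01 are my assumptions, as the values actually used are not recorded here, so where the two runs part company will differ from the plots.

```python
def lorenz_x_series(sigma, rho, beta, start=(1.0, 1.0, 1.0), dt=0.01, steps=5500):
    """Forward-Euler integration of the Lorenz system; returns the x values.

    start and dt are ASSUMED values chosen for illustration.
    """
    x, y, z = start
    xs = []
    for _ in range(steps):
        dx = sigma * (y - x)
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        # x2 = x1 + time-step * dx/dt, and likewise for y and z
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        xs.append(x)
    return xs


red = lorenz_x_series(sigma=10.00, rho=24.7, beta=8 / 3)
blue = lorenz_x_series(sigma=10.01, rho=24.7, beta=8 / 3)
gaps = [abs(r - b) for r, b in zip(red, blue)]

# First step at which the two runs differ by more than 1.0, if they ever do.
first_big_gap = next((i for i, g in enumerate(gaps) if g > 1.0), None)

print(f"gap after 10 steps: {gaps[9]:.2e}")
print(f"largest gap in the run: {max(gaps):.2e}")
print(f"first step with gap > 1.0: {first_big_gap}")
```

The two runs start out indistinguishable, and the gap between them grows from there. Changing `start` or `dt` by a hair in place of σ gives the same kind of comparison.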

The plot above represents about 16000 intervals, which could probably be squeezed into fewer than 5000 quotes, which IFS could blast out in maybe a twentieth of a second. If HF had stolen the program and entered every parameter correctly except for a typo ("10.01" instead of "10.00"), then their estimate of the IFS bids would be accurate for only about 25 ms. After that, HF might as well guess.

We could imagine IFS deciding on the next day's choice of parameters late in the evening, sending the numbers in an encrypted message to all their offices worldwide, and the next day they are all happily arbitraging away 100,000 times a second. They could change the parameters on an hourly basis--or every minute--it requires only a small amount of information to unspool an unlimited number of bids.
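As a sketch of how little needs to be sent: the generator below turns five small inputs into an endless stream of values. Everything named here is hypothetical (`quote_stream`, `todays_key`, the branch names as variables), the start point and time step are assumed, and the stream is raw Lorenz x values, since how an algo would map them onto actual bid prices is unknown to me. Both "branches" run on the same machine in this example; across different hardware, agreement to the last bit would also depend on identical floating-point behaviour.

```python
from itertools import islice


def quote_stream(sigma, rho, beta, start, dt):
    """Endless stream of Lorenz x values (forward Euler) from five small inputs."""
    x, y, z = start
    while True:
        # The right-hand side uses the old x, y, z for all three updates.
        x, y, z = (
            x + dt * sigma * (y - x),
            y + dt * (x * (rho - z) - y),
            z + dt * (x * y - beta * z),
        )
        yield x


# Hypothetical "parameters of the day": the only thing that has to be sent.
todays_key = dict(sigma=10.0, rho=24.7, beta=8 / 3, start=(1.0, 1.0, 1.0), dt=0.01)

new_york = list(islice(quote_stream(**todays_key), 1000))
california = list(islice(quote_stream(**todays_key), 1000))
outsider = list(islice(quote_stream(**{**todays_key, "sigma": 10.01}), 1000))

print("branches agree:", new_york == california)
print("outsider agrees:", new_york == outsider)
```

Because the stream is a generator, a branch can keep pulling values for as long as it likes; changing the parameters hourly or every minute just means sending a new `todays_key`.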

The only practical use for this software, if stolen, would be to use the same quote-stuffing method so your international subs could arbitrage the hell out of the market. But that would be manipulation, if it fell into the wrong (read "your") hands. In the hands of IFS, of course, it is proper and judicious market management.