Thursday, June 30, 2011

Deconstructing algos, part 2: Leveraging chaos into high-frequency arbitrage opportunities

The recent elegant explanation for the activities of the HFT algos by Nanex seems likely to be a better one than the analysis that follows, as it answers the all-important question--why? In the following analysis we will look a little bit at how, but most of our interpretation of the results is coloured by the Nanex explanation. It explains why so many trades happen outside the bid-ask spread, particularly as the bid and ask prices are moving rapidly. The algos are scalping fractions of pennies from some poor fool who has data more than a few ms old.

As this is the reason, the method of choosing bid and ask prices pales in significance next to the methodology of stuffing the orders. This methodology I know nothing about and will not address. This article will address how to use this stuffing to create endless opportunities for arbitrage.

The principal advantage discussed in the Nanex report is stuffing the market with so many orders that competitors have trouble seeing the present state of the market. Whenever such inefficiencies are created, an arbitrage opportunity may also be created.

One method of creating arbitrage opportunities is through manipulation of time. We are accustomed to thinking of information flow as instantaneous, but it is limited by the speed of light. How might HFT take advantage of this?

Imagine for a moment that transatlantic communications were somehow extremely limited, so that a trader in New York could not see the present price of a stock in Paris, but would only see it after a two-hour delay. Any market participant who could somehow overcome this limitation would find a myriad of arbitrage opportunities.

Now look at the present. Let's suppose International Face Sucker (IFS) starts stuffing 100,000 bids per second into the pipe in New York. Let us say that those bids are x1, x2, x3, . . .

A market participant in California, Hedge Fool LLP (HF), is in the market and starts looking at the stream of bids coming down the pipe.

At 100,000 bids per second, the electronic signal travels only about 3 km between consecutive bids, even at the speed of light. With roughly 4,500 km between New York and California, by the time HF sees the first bid (x1) and prepares its response (say, y1), IFS is actually sending quote x1500 into the dataverse. Where is the market? What is the current price?
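The arithmetic behind those figures can be checked with a short sketch. The quote rate comes from the text; the vacuum speed of light and the 4,500 km path length are assumptions chosen to match the "3 km" and "x1500" figures, and real signals in fibre travel more slowly.

```python
# Back-of-the-envelope check of the latency figures in the text.
SIGNAL_SPEED_KM_PER_S = 300_000  # assumption: vacuum speed of light
QUOTES_PER_SECOND = 100_000      # IFS's stuffing rate, from the text
DISTANCE_KM = 4_500              # assumption: New York to California path length


def km_between_quotes(speed_km_s, quotes_per_s):
    """Distance the signal covers in the gap between two consecutive quotes."""
    return speed_km_s / quotes_per_s


def quotes_in_flight(distance_km, speed_km_s, quotes_per_s):
    """Quotes already sent by the time the first one reaches the far end."""
    return distance_km / km_between_quotes(speed_km_s, quotes_per_s)


def one_way_delay_ms(distance_km, speed_km_s):
    """One-way propagation delay in milliseconds."""
    return 1000 * distance_km / speed_km_s


if __name__ == "__main__":
    print("km between quotes:", km_between_quotes(SIGNAL_SPEED_KM_PER_S, QUOTES_PER_SECOND))
    print("quotes in flight:", quotes_in_flight(DISTANCE_KM, SIGNAL_SPEED_KM_PER_S, QUOTES_PER_SECOND))
    print("one-way delay (ms):", one_way_delay_ms(DISTANCE_KM, SIGNAL_SPEED_KM_PER_S))
```

Under these assumptions HF's picture of the market is always one full propagation delay stale, and any slower medium only widens the gap.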

Now suppose IFS has a branch in California. They have the same algo as IFS New York, and are running it so locally they observe x1 and HF's response y1--but they already know what x1500 is (or is that "is going to be"?), not to mention all of x2 through x1499. Might there be an arbitrage opportunity? Might there be 100,000 such opportunities every second?

A fraction of a penny 100,000 times a second--it isn't long before you're into real money.

Now IFS has branches in London, Paris, Sydney, Tokyo, Shanghai, Moscow--they are all running these arbitrage trades and who knows--maybe they are stuffing orders into their local bourses, using an algo known by all other branches and are arbitraging them as well.

The role of chaos

Not that it matters much, but what sort of algos are they using? I think they are mostly pretty simple.

The algo on the bizarre spreads seen here is straightforward, but it is hard to see how it profits.



As I've written elsewhere, the nat gas trading algo looked very similar to a simple chaotic function--the first such function ever identified.

 

Natural gas over a brief interval on June 8, 2011. Graph from Nanex.


Nat gas price from above graph plotted against linear time.


Plot of first 5500 values for x using the Lorenz equation with parameters σ = 10, ρ = 24.7, β = 8/3.
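For reference, the standard form of the Lorenz system, with the three parameters named in the caption:

```latex
\begin{aligned}
\frac{dx}{dt} &= \sigma\,(y - x) \\
\frac{dy}{dt} &= x\,(\rho - z) - y \\
\frac{dz}{dt} &= x\,y - \beta\,z
\end{aligned}
```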

You might think that using such a simple, well-known function would mean that anyone could tag along for the ride. But you would be mistaken. Chaotic functions have a property called sensitivity to initial conditions, which makes them very useful in this particular application. Even in the unlikely event that some disgruntled former employee steals the software, its use will be extremely limited.

Note that the Lorenz equations have three flow parameters, σ, ρ, and β, for which there are an unlimited number of choices resulting in chaotic behaviour of the overall function. In addition, we may choose any starting location, and we can also vary the time steps (basically x2 = x1 + time-step * dx/dt). Any arbitrarily small difference in any of the above parameters/variables leads to a dramatically different future evolution of the function. For instance, the two plots below (blue and red) are identical in all respects except that σ = 10.01 for blue and σ = 10.00 for red (where only the red appears, the two curves are essentially identical).

The plot above represents about 16,000 intervals, which could probably be squeezed into fewer than 5,000 quotes, which IFS could blast out in maybe a twentieth of a second. If HF had stolen the program and entered every parameter correctly except for a typo ("10.01" instead of "10.00"), their estimate of the IFS bids would be accurate for only about 25 ms. After that, HF might as well guess.
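The red-versus-blue comparison can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the plots: the starting point (1, 1, 1), the time step of 0.005 and the function names are all assumptions. The update rule is the simple one described above (new value = old value + time-step * derivative). Because ρ = 24.7 sits close to the edge of the chaotic regime, where and how strongly the two curves separate depends on those assumed choices.

```python
def lorenz_step(state, sigma, rho, beta, dt):
    """Advance (x, y, z) by one forward-Euler step of the Lorenz system."""
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return (x + dt * dx, y + dt * dy, z + dt * dz)


def lorenz_x_series(sigma, rho=24.7, beta=8.0 / 3.0,
                    start=(1.0, 1.0, 1.0), dt=0.005, steps=16000):
    """Return the x values of one trajectory, starting value included.

    start and dt are assumed values; the text does not give them.
    """
    state = start
    xs = [state[0]]
    for _ in range(steps):
        state = lorenz_step(state, sigma, rho, beta, dt)
        xs.append(state[0])
    return xs


def first_divergence(a, b, tolerance):
    """Index of the first step where the series differ by more than tolerance, else None."""
    for i, (p, q) in enumerate(zip(a, b)):
        if abs(p - q) > tolerance:
            return i
    return None


if __name__ == "__main__":
    red = lorenz_x_series(sigma=10.00)
    blue = lorenz_x_series(sigma=10.01)  # the "typo" version
    print("first step with |blue - red| > 1.0:",
          first_divergence(red, blue, tolerance=1.0))
```

Running the same parameters twice gives the same series, which is what lets every branch with the right numbers regenerate the stream, while a copy with σ off by 0.01 drifts away from it.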

We could imagine IFS deciding on the next day's choice of parameters late in the evening, sending the numbers in an encrypted message to all their offices worldwide, and the next day they are all happily arbitraging away 100,000 times a second. They could change the parameters on an hourly basis--or every minute--it requires only a small amount of information to unspool an unlimited number of bids.

The only practical use for this software, if stolen, would be to use the same quote-stuffing method so your international subs could arbitrage the hell out of the market. But that would be manipulation, if it falls into the wrong (read "your") hands. In the hands of IFS, of course, it is proper and judicious market management.

1 comment:

  1. Hi Mickeyman

    Your blog is great!

    This is how the gimmick with trades executing outside of NBBO works;
    Reg NMS protects only "top of the book", meaning if a part of your order has to be routed to another exchange due to inferior pricing on the current one, only the part matching the best pricing will be routed to take out the NBBO, while the rest will be filled locally no matter what is displayed. Concrete example ($price (size)): BATS is displaying offers $100.00 (100), $100.01 (100) etc. while NYSE is displaying $100.05 (100), $100.06 (200) etc. You want to take out 200 shares of the mentioned security by sending a marketable order to NYSE; there's (obviously) inferior pricing, so your order has to be routed to the exchange with the national best offer. The thing is, only 100 shares of your order will be routed to BATS, taking out the $100.00 (100) offer, while the rest of the order will be matched to NYSE's $100.05 (100) and you end up paying $4 more just because you don't have your machines co-located and/or access to the exchanges' direct feeds.
    You might say "well, I'm not a dumb ass, so I'll use limit orders" (at a given marketable offer), but you still get shafted. OK, this time a limit order for 200 shares at $100.00 is sent to NYSE; again 100 shares are routed to BATS to take out the NBO, and you think a limit order of $100.00 (100) should be posted on NYSE. Not happening. UQDF, used for compliance with Reg NMS, is slow and will still show the old order at BATS, not allowing you to post anything at $100.00 at NYSE since it would "lock" the markets. Any high frequency trader with direct feeds will see that the old order at BATS is gone but still displayed on UQDF, and if he's trading through a broker-dealer (or is a broker-dealer) he will circumvent the "order protection rule" (i.e. "locking" the market) by sending an inter-market sweep order at $100.00. Your order at $100.00 will be hidden and/or repriced to $99.99, and you will be left behind the HF trader even though your order came first to NYSE!

    Quote stuffing is as much about slowing the competition as about masking your trades, i.e. if you lived in a pure environment where you acted only once a signal was confirmed, your strategies could easily be reverse-engineered and used against you. Also, you can use it to see how the competition is reacting. Interestingly, quote stuffing was noted in academia quite a while ago, but nobody seemed to pursue it.

    In the case of NatGas I think it's a stop-loss buster algo at work.

    As long as we have a market where 99.99% of the people involved don't really understand how it works, one will see this kind of crap running on a daily basis.

    Again, nice blog.
