MLNeurons Artificial Intelligence 🤖, Machine Learning ⚙

Finance
Limit order book †
Bird's eye view

Understanding limit order books (LOBs) gives clearer insight into how best to act in a given market situation, and helps to optimize order execution strategies, minimize market impact, design better electronic trading strategies and assess market stability. This article gives a broad overview (with details in subsequent articles): it states formal definitions, covers practical aspects of trading via LOBs, summarizes empirical studies and a selection of LOB models, and discusses unresolved problems related to LOBs.

Preliminaries

An order \( x = (p_x, \omega_x , t_x ) \) submitted at time \( t_x \) with price \( p_x \) and size \( \omega_x > 0 \) (respectively, \( \omega_x < 0 \)) is a commitment to sell (respectively, buy) up to \( |\omega_x | \) units of the traded asset at a price no less than (respectively, no greater than) \( p_x \).

The lot size \( \sigma \) of an LOB is the smallest amount of the asset that can be traded within it. All orders must arrive with a size \( \omega_x \in \{ \pm k\sigma \mid k = 1, 2, \dots \} \).

The tick size \( \pi \) of an LOB is the smallest permissible price interval between different orders within it. All orders must arrive with a price that is specified to the accuracy of \( \pi \).

The lot size \( \sigma \) and tick size \( \pi \) of an LOB are collectively called its resolution parameters.
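The constraints imposed by the resolution parameters can be sketched as a small validity check. This is only an illustration: the `Order` type, the `is_valid` helper and the values chosen for \( \sigma \) and \( \pi \) are assumptions made for the example, not part of any exchange's API. Exact decimals are used to avoid floating-point rounding on the tick grid.

```python
from decimal import Decimal
from typing import NamedTuple


class Order(NamedTuple):
    price: Decimal   # p_x
    size: Decimal    # omega_x: positive = sell, negative = buy
    time: float      # t_x


def is_valid(order: Order, lot: Decimal, tick: Decimal) -> bool:
    """Check an order against the resolution parameters (sigma, pi)."""
    # Size must be a non-zero integer multiple of the lot size sigma.
    size_ok = order.size != 0 and order.size % lot == 0
    # Price must lie on the tick grid of width pi.
    price_ok = order.price % tick == 0
    return size_ok and price_ok


# Hypothetical resolution parameters, chosen only for illustration.
SIGMA = Decimal("100")   # lot size
PI = Decimal("0.01")     # tick size

sell = Order(Decimal("10.05"), Decimal("300"), 0.0)    # sell 300 units at >= 10.05
buy = Order(Decimal("10.005"), Decimal("-100"), 1.0)   # price is off the tick grid
odd = Order(Decimal("10.04"), Decimal("-150"), 2.0)    # size is not a lot multiple
```

Here only `sell` passes the check; `buy` fails on the tick size and `odd` fails on the lot size.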

When a buy (respectively, sell) order \( x \) is submitted, an LOB's trade-matching algorithm checks whether it is possible to match \( x \) to some other previously submitted sell (respectively, buy) order. If so, the matching occurs immediately. If not, \( x \) becomes active, and it remains active until either it becomes matched to an incoming sell (respectively, buy) order or it is cancelled.
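The matching step described above can be sketched in a few lines. The paragraph does not say which resting order is matched first or at what price the trade happens, so this sketch assumes price-time priority and trades at the resting order's price; both are assumptions, as are the names `Order` and `submit`. Prices are in ticks and sizes in lots.

```python
from typing import List, NamedTuple, Tuple


class Order(NamedTuple):
    price: int   # p_x in ticks
    size: int    # omega_x in lots: positive = sell, negative = buy
    time: int    # t_x


def submit(book: List[Order], x: Order) -> List[Tuple[int, int]]:
    """Match x against the active orders in book, then rest any remainder.

    Returns the trades as (price, quantity) pairs and mutates book in place.
    Assumes price-time priority, with trades at the resting order's price.
    """
    is_buy = x.size < 0
    remaining = abs(x.size)
    trades = []
    # Opposite-side orders whose price is acceptable to x.
    if is_buy:
        candidates = [o for o in book if o.size > 0 and o.price <= x.price]
        candidates.sort(key=lambda o: (o.price, o.time))    # cheapest sells first
    else:
        candidates = [o for o in book if o.size < 0 and o.price >= x.price]
        candidates.sort(key=lambda o: (-o.price, o.time))   # highest bids first
    for o in candidates:
        if remaining == 0:
            break
        qty = min(remaining, abs(o.size))
        trades.append((o.price, qty))
        remaining -= qty
        book.remove(o)
        left = abs(o.size) - qty
        if left:
            # A partially filled resting order stays active with reduced size.
            book.append(o._replace(size=left if o.size > 0 else -left))
    if remaining:
        # The unmatched remainder of x becomes an active order.
        book.append(x._replace(size=-remaining if is_buy else remaining))
    return trades


book = [Order(101, 2, 1), Order(100, 1, 2), Order(99, -3, 3)]
trades = submit(book, Order(100, -2, 4))   # buy 2 lots at a price no greater than 100
```

In this example one lot trades against the resting sell order at 100, and the unmatched lot becomes an active buy order at 100.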

An LOB \( \mathcal{L}(t) \) is the set of all active orders in a market at time \( t \). The active orders in an LOB \( \mathcal{L}(t) \) can be partitioned into the set of active buy orders \( \mathcal{B}(t) \), for which \( \omega_x < 0 \), and the set of active sell orders \( \mathcal{A}(t) \), for which \( \omega_x > 0 \). An LOB can then be considered as a set of queues, each of which consists of active buy or sell orders at a specified price.

The bid-side depth available at price \( p \) and at time \( t \) is \[ n^b(p, t) := \sum_{\footnotesize{ \{ x \in \mathcal{B}(t) \mid p_x = p \} }} \omega_x. \] The ask-side depth available at price \( p \) and at time \( t \), denoted \( n^a(p, t) \), is defined similarly using \( \mathcal{A}(t) \).

The bid-side depth profile at time \( t \) is the set of all ordered pairs \( (p, n^b(p, t)) \). The ask-side depth profile at time \( t \) is the set of all ordered pairs \( (p, n^a(p, t)) \).
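As a minimal sketch of these definitions, the depth profiles can be computed by summing signed sizes at each price. The function name and the toy book are illustrative assumptions. Note that under the sign convention above, bid-side depths come out negative.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Each active order is (price, size): size > 0 is a sell, size < 0 is a buy.
ActiveOrder = Tuple[int, int]


def depth_profiles(orders: Iterable[ActiveOrder]) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Return (bid_profile, ask_profile), each mapping price -> depth."""
    bid = defaultdict(int)
    ask = defaultdict(int)
    for price, size in orders:
        if size < 0:
            bid[price] += size   # n^b(p, t): sum of (negative) buy sizes
        else:
            ask[price] += size   # n^a(p, t): sum of (positive) sell sizes
    return dict(bid), dict(ask)


# Toy book, prices in ticks and sizes in lots.
book = [(99, -3), (99, -2), (98, -4), (101, 5), (102, 1), (101, 2)]
bid_profile, ask_profile = depth_profiles(book)
```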

The mean bid-side depth available at price \( p \) between times \( t_1 \) and \( t_2 \) is \[ \overline{n}^b(p, t_1, t_2) := \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} n^b(p, t) dt. \] The mean ask-side depth available at price \( p \) between times \( t_1 \) and \( t_2 \), denoted \( \overline{n}^a(p, t_1, t_2) \), is defined similarly using the ask-side depth available.
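Because the depth at a fixed price only changes when an order arrives, is matched or is cancelled, it is piecewise constant in time, and the integral reduces to a weighted sum. The sketch below assumes the depth history is available as a sorted list of change points; that input format and the name `mean_depth` are assumptions made for illustration.

```python
from typing import List, Tuple


def mean_depth(changes: List[Tuple[float, float]], t1: float, t2: float) -> float:
    """Time-average of a piecewise-constant depth over [t1, t2].

    changes holds (time, depth) pairs sorted by time; the depth takes the given
    value from that time until the next change, and is 0 before the first entry.
    """
    if t2 <= t1:
        raise ValueError("need t1 < t2")
    total = 0.0
    level = 0.0   # depth currently in force
    prev = t1     # start of the current constant segment
    for t, depth in changes:
        if t <= t1:
            level = depth          # change happened before the window opened
            continue
        if t >= t2:
            break                  # change happens after the window closed
        total += level * (t - prev)
        level, prev = depth, t
    total += level * (t2 - prev)   # last segment up to t2
    return total / (t2 - t1)


# Bid-side depth at one price: -4 from t=0, -2 from t=2, -6 from t=3.
history = [(0.0, -4.0), (2.0, -2.0), (3.0, -6.0)]
avg = mean_depth(history, 1.0, 5.0)   # (-4*1 - 2*1 - 6*2) / 4
```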

The bid price at time \( t \) is the highest stated price among active buy orders at time \( t \), \[ b(t) := \max_{\footnotesize{\{x \in \mathcal{B}(t)\}}} p_x. \] The ask price at time \( t \) is the lowest stated price among active sell orders at time \( t \), \[ a(t) := \min_{\footnotesize{\{x \in \mathcal{A}(t)\}}} p_x. \] The bid-ask spread at time \( t \) is \( s(t) := a(t) - b(t) \).

The mid price at time \( t \) is \( m(t) := [a(t) + b(t)] / 2 \).
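The bid price, ask price, spread and mid price follow directly from the set of active orders. A short sketch, with an illustrative function name and a toy book that are assumptions for the example:

```python
from typing import Iterable, Optional, Tuple

ActiveOrder = Tuple[float, int]  # (price p_x, size omega_x); omega_x < 0 is a buy


def best_quotes(orders: Iterable[ActiveOrder]) -> Tuple[Optional[float], Optional[float]]:
    """Return (b(t), a(t)); a side with no active orders yields None."""
    orders = list(orders)
    bids = [p for p, w in orders if w < 0]
    asks = [p for p, w in orders if w > 0]
    return (max(bids) if bids else None, min(asks) if asks else None)


book = [(99.0, -3), (98.5, -4), (101.0, 5), (102.0, 1)]
b, a = best_quotes(book)
spread = a - b          # s(t) = a(t) - b(t)
mid = (a + b) / 2       # m(t) = [a(t) + b(t)] / 2
```

The spread and mid price are only defined when both sides of the book are non-empty, which is why the helper returns `None` for an empty side rather than a number.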

For a given price \( p \), the bid-relative price is \( \delta^b(p) := b(t) - p \) and the ask-relative price is \( \delta^a(p) :=p - a(t) \).

For a given order \( x = (p_x, \omega_x, t_x) \), the relative price of the order is \[ \delta^x := \begin{cases} \delta^b(p_x), & \text{for buy order,} \\ \delta^a(p_x), & \text{for sell order.} \end{cases} \] The bid-side depth available at relative price \( p \) and at time \( t \) is \[ N^b(p, t) := \sum_{\footnotesize{ \{ x \in \mathcal{B}(t) \mid \delta^x = p \} }} \omega_x. \] The ask-side depth available at relative price \( p \) and at time \( t \), denoted \( N^a(p, t) \), is defined similarly using \( \mathcal{A}(t) \).

The bid-side relative depth profile at time \( t \) is the set of all ordered pairs \( (p, N^b(p, t)) \). The ask-side relative depth profile at time \( t \) is the set of all ordered pairs \( (p, N^a(p, t)) \).

The mean bid-side depth available at relative price \( p \) between times \( t_1 \) and \( t_2 \) is \[ \overline{N}^b(p, t_1, t_2) := \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} N^b(p, t) dt. \] The mean ask-side depth available at relative price \( p \) between times \( t_1 \) and \( t_2 \), denoted \( \overline{N}^a(p, t_1, t_2) \), is defined similarly using the ask-side relative depth available.
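The relative depth profile re-indexes each side of the book by distance from its own best quote. A minimal sketch of \( N^b \) and \( N^a \) follows; the function name and the toy book are assumptions for illustration, with prices in ticks and sizes in lots.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def relative_depth_profiles(orders: List[Tuple[int, int]]) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Return (N^b, N^a), each mapping relative price -> depth."""
    bids = [(p, w) for p, w in orders if w < 0]
    asks = [(p, w) for p, w in orders if w > 0]
    if not bids or not asks:
        raise ValueError("both sides of the book must be non-empty")
    b = max(p for p, _ in bids)   # bid price b(t)
    a = min(p for p, _ in asks)   # ask price a(t)
    n_bid = defaultdict(int)
    n_ask = defaultdict(int)
    for p, w in bids:
        n_bid[b - p] += w         # bid-relative price: b(t) - p
    for p, w in asks:
        n_ask[p - a] += w         # ask-relative price: p - a(t)
    return dict(n_bid), dict(n_ask)


book = [(99, -3), (98, -4), (101, 5), (103, 1)]
rel_bid, rel_ask = relative_depth_profiles(book)

# Shifting every price by the same amount leaves the relative profile unchanged.
shifted = [(p + 10, w) for p, w in book]
```

The shifted book has a different depth profile in actual prices but the same relative depth profile.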

Most traders assess the state of \( \mathcal{L}(t) \) via the relative depth profile, and several studies have concluded that order arrival rates depend on relative prices rather than actual prices.

The actions of traders in an LOB can be expressed solely in terms of submissions or cancellations of orders of the lot size. Orders that result in an immediate matching upon submission are known as market orders. Orders that do not, instead becoming active orders, are known as limit orders. In LOBs, the rules that govern matchings dictate how prices evolve through time.
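Under the view that every order is of the lot size, whether an incoming order is a market order or a limit order depends only on its price relative to the opposite best quote. A hedged sketch, where `classify` is an illustrative name and `None` is assumed to stand for an empty side of the book:

```python
from typing import Optional


def classify(price: int, size: int, bid: Optional[int], ask: Optional[int]) -> str:
    """Label an incoming order as 'market' or 'limit' given b(t) and a(t).

    size < 0 is a buy, size > 0 is a sell. A buy matches immediately if its
    price is at least the ask; a sell matches if its price is at most the bid.
    """
    if size < 0:
        crosses = ask is not None and price >= ask
    else:
        crosses = bid is not None and price <= bid
    return "market" if crosses else "limit"


# With b(t) = 99 and a(t) = 101:
labels = [
    classify(101, -1, 99, 101),   # buy at the ask      -> market
    classify(100, -1, 99, 101),   # buy inside spread   -> limit
    classify(99, 1, 99, 101),     # sell at the bid     -> market
    classify(100, 1, 99, 101),    # sell inside spread  -> limit
]
```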

Challenges of studying LOBs

(1) Both prominent schools of thought on modelling LOBs, namely 'perfect rationality' and 'zero intelligence', make assumptions that are inconsistent with empirical observations.

(2) It is a well-established empirical fact that current order flows depend on both \( \mathcal{L}(t) \) and recent order flows. The difficulty in studying such conditional behaviour is that the state space of an LOB is huge: if there are \( P \) different choices for price in a given LOB, then the state space of the current depth profile alone, expressed in units of the lot size \( \sigma \), is \( \mathbb{Z}^P \). A key modelling task is therefore to find a way to simplify the evolving, high-dimensional state space while retaining an LOB's important features.

(3) Order flow creates a strong coupling between \( a(t) \) and \( b(t) \) which makes LOB modelling difficult.

(4) LOB models need to account for priority mechanisms (price-time, pro-rata, price-size, etc.).

(5) An LOB \( \mathcal{L}(t) \) reflects only the subset of trading intentions that traders have announced up to time \( t \). However, the fact that no traders have submitted a limit order at a given price does not imply that none of them want to trade at this price, because they could be keeping their intentions private by submitting orders only when absolutely necessary. This hidden liquidity creates an incomplete sampling problem.

(6) Volatility is a measure of the variability of returns of a traded asset. For traders who wish to manage their risk exposure, it is an important consideration when choosing the assets in which to invest, and therefore often forms the basis of optimal portfolio construction. No estimate of volatility gleaned from a model takes into account the full state of \( \mathcal{L}(t) \).

(7) An LOB's resolution parameters \( \sigma \) and \( \pi \) greatly affect trade in it and could make it difficult for traders to monitor the state of the market in real time. Moreover, the values of \( \sigma \) and \( \pi \) vary greatly between different trading platforms.

(8) The existence of bilateral trade agreements, which determine whether two traders will or will not trade with each other, makes modelling specific matchings and the evolution of \( \mathcal{L}(t) \) very difficult.

(9) Many exchanges suspend standard limit order trading at the beginning and end of the trading day and instead use an auction system to match orders.

(10) Statistical analysis is fraught with difficulties because assumptions such as independence and stationarity, which are often required to ensure consistency of estimation, are rarely satisfied by LOB data.
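Item (4) above mentions priority mechanisms. As a rough illustration of why they matter to a model, the sketch below allocates the same incoming market order across one price level under time priority and under pro-rata. The function names are illustrative, and the pro-rata rounding rule (round down, then hand any leftover lots out in queue order) is an assumption; real exchanges use a variety of rounding and tie-break rules.

```python
from typing import List


def allocate_time_priority(queue: List[int], incoming: int) -> List[int]:
    """Fill resting orders at one price in arrival order (oldest first)."""
    fills = []
    for size in queue:
        fill = min(size, incoming)
        fills.append(fill)
        incoming -= fill
    return fills


def allocate_pro_rata(queue: List[int], incoming: int) -> List[int]:
    """Fill resting orders at one price in proportion to their sizes."""
    total = sum(queue)
    if total == 0:
        return [0] * len(queue)
    incoming = min(incoming, total)
    # Proportional share, rounded down to whole lots.
    fills = [incoming * size // total for size in queue]
    leftover = incoming - sum(fills)
    # Assumed tie-break: give rounding leftovers to the oldest orders first.
    for i, size in enumerate(queue):
        if leftover == 0:
            break
        extra = min(size - fills[i], leftover)
        fills[i] += extra
        leftover -= extra
    return fills


queue = [5, 3, 2]   # resting sizes in lots, oldest first
by_time = allocate_time_priority(queue, 5)
by_share = allocate_pro_rata(queue, 5)
```

Under time priority the oldest order takes the whole fill, whereas under pro-rata every resting order receives a share, so the incentive to queue early differs between the two mechanisms.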

Empirical observations in LOBs

The empirical literature on LOBs is very large, yet different studies often present conflicting conclusions. Reasons for this include different trade-matching algorithms operating differently, different asset classes being traded on different exchanges, differing levels of liquidity in different markets, and different researchers having access to data of differing quality. Furthermore, as traders' strategies have evolved over time, so too have the statistical properties of the order flow they generate.

Empirical studies have examined, among other things, order size, relative price, order cancellations, the mean relative depth profile, conditional frequencies of events, market (price) impact and stylized facts.

Modelling LOBs

LOB models have come mainly from two communities: economics and physics. Economists have tended to be trader-centric, using perfect-rationality frameworks to derive optimal trading strategies given certain market conditions, treating order flow as static. By contrast, models from physicists have tended to be conceptual toy models of the evolution of \( \mathcal{L}(t) \). By relating changes in order flow to properties of \( \mathcal{L}(t) \), these models treat order flow as dynamic.

The two approaches have different strengths: an understanding of trading strategies is crucial for traders and regulators; an understanding of the state of \( \mathcal{L}(t) \) and order flow provides insight into the origins of statistical regularities, including whether they are a consequence of market microstructure or of traders' strategic behaviour.

LOB models need to be assessed in terms of their ability to accurately mimic the trading mechanism and to reproduce empirical facts.

Key unresolved problems

(1) No single model has yet been capable of simultaneously reproducing all of the statistical regularities, and there is no clear picture about how the stylized facts emerge as a consequence of the actions of many heterogeneous traders.

(2) Many LOB models are built upon statistical regularities observed in old data, but traders' strategies and the rules governing trade change over time, so empirical observations from more than a decade ago may not accurately describe current LOB activity.

(3) Almost all LOB models focus on some form of equilibrium. However, empirical evidence strongly suggests that LOBs are subject to frequent shocks in order flow that cause them to display non-stationary behaviour, so they may never settle into equilibrium.

(4) Volatility estimates need to incorporate the entire state of \( \mathcal{L}(t) \), but it is not even agreed whether the number and size of market orders or the fluctuation in liquidity plays the dominant role in determining market volatility.

(5) From a regulatory standpoint, it is crucial to understand how electronic trading algorithms affect market stability, yet almost nothing concrete is currently known.

(6) Because the same asset is traded on several different electronic trading platforms simultaneously, the study of any individual LOB in isolation no longer provides a snapshot of the whole market for an asset. Furthermore, differences between trading platforms' matching rules and transaction costs complicate comparisons between different LOBs for the same asset.

Conclusion

LOBs are a rich and exciting testing ground for theories, and have provided new insight into longstanding economic questions regarding market efficiency, price formation, and the rationality of traders. However, despite the progress made on specific aspects of limit order trading, it remains unclear how the various pieces of the puzzle fit together. For example, models that capture the dynamics of event-by-event price changes poorly reproduce price dynamics on longer timescales. Similarly, models that explain price dynamics on inter-day timescales offer little understanding of how they emerge as the aggregate effect of individual trades.

LOBs have revolutionized trading by providing traders the freedom to evaluate their own need for immediate liquidity. Their study has hitherto been hampered by their inherent complexity, with all the associated technical difficulties, and above all by wholly inadequate empirical data.

Hopefully, with the advent of AI LLMs, there will be more access to high-quality recent data, along with the compute power to go with it.

Sudhir Shetty, Mar 03 2025.
† Significant portions adapted from arXiv:1012.0349.