Analytics

Format-agnostic post-processing analytics. These functions work with the output of any format's pipeline run (Bitstamp, LOBSTER, or custom).

Trade Analysis

order_aggressiveness

order_aggressiveness(
    events: DataFrame, depth_summary: DataFrame
) -> pd.DataFrame

Calculate each order's aggressiveness relative to the best bid or ask, expressed in basis points (bps).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| events | DataFrame | The events DataFrame (must contain direction, action, type, timestamp, event_id, price columns). | required |
| depth_summary | DataFrame | The order book summary statistics DataFrame (must contain timestamp and event_id columns). | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | The events DataFrame with an added aggressiveness_bps column. |
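
To make the unit concrete, here is a minimal sketch of a basis-point distance from the same-side best quote. The helper name `aggressiveness_bps_sketch`, the `"bid"`/`"ask"` direction labels, and the sign convention (positive when the order improves on the current best) are assumptions made for illustration; the documentation above does not specify how order_aggressiveness computes its column.

```python
def aggressiveness_bps_sketch(price: float, best_price: float, direction: str) -> float:
    """Signed distance of a limit price from the same-side best quote, in bps.

    Assumed convention (not taken from the library): positive means the order
    improves on the current best (a higher bid or a lower ask); negative means
    it rests behind the best.
    """
    if best_price <= 0:
        raise ValueError("best_price must be positive")
    if direction == "bid":
        diff = price - best_price
    elif direction == "ask":
        diff = best_price - price
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    # One basis point is 1/10,000 of the reference price.
    return 10_000 * diff / best_price


# A bid priced above the best bid improves the quote (positive value).
print(aggressiveness_bps_sketch(100.05, 100.00, "bid"))
# An ask priced above the best ask rests behind it (negative value).
print(aggressiveness_bps_sketch(100.20, 100.10, "ask"))
```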

trade_impacts

trade_impacts(trades: DataFrame) -> pd.DataFrame

Generate a DataFrame containing order book impact summaries.

Aggregates trade records by taker order ID to summarise how each aggressive order swept through the book (price range, number of fills, total volume, VWAP, duration).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| trades | DataFrame | The trades DataFrame (must contain taker, price, volume, timestamp, direction columns). | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | A DataFrame summarising market order impacts with columns: id, min_price, max_price, vwap, hits, vol, start_time, end_time, dir. |
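
The aggregation described above can be approximated with a plain pandas group-by. This is a sketch under stated assumptions rather than the library's implementation: the toy data, the `"buy"`/`"sell"` direction labels, and the choice to take the first fill's direction per taker are all illustrative.

```python
import pandas as pd

# Toy trades: taker order 7 sweeps three price levels, taker order 9 fills once.
trades = pd.DataFrame(
    {
        "taker": [7, 7, 7, 9],
        "price": [100.0, 100.5, 101.0, 99.0],
        "volume": [2.0, 1.0, 1.0, 4.0],
        "timestamp": pd.to_datetime(
            [
                "2024-01-01 10:00:00.000",
                "2024-01-01 10:00:00.010",
                "2024-01-01 10:00:00.020",
                "2024-01-01 10:05:00.000",
            ]
        ),
        "direction": ["buy", "buy", "buy", "sell"],  # labels are assumed
    }
)

# Notional per fill, so that VWAP = sum(price * volume) / sum(volume) per taker.
trades = trades.assign(notional=trades["price"] * trades["volume"])

impacts = (
    trades.groupby("taker")
    .agg(
        min_price=("price", "min"),
        max_price=("price", "max"),
        notional=("notional", "sum"),
        hits=("price", "count"),
        vol=("volume", "sum"),
        start_time=("timestamp", "min"),
        end_time=("timestamp", "max"),
        dir=("direction", "first"),
    )
    .reset_index()
    .rename(columns={"taker": "id"})
)
impacts["vwap"] = impacts["notional"] / impacts["vol"]

# Reorder to match the documented output columns.
impacts = impacts[
    ["id", "min_price", "max_price", "vwap", "hits", "vol", "start_time", "end_time", "dir"]
]
print(impacts)
```

The duration of each sweep can then be derived as `impacts["end_time"] - impacts["start_time"]`.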

Order Type Classification

set_order_types

set_order_types(
    events: DataFrame, trades: DataFrame
) -> pd.DataFrame

Determine limit order types.

Classifies each order as one of: market, resting-limit, flashed-limit, or market-limit, based on how the order interacts with the book over its lifetime.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| events | DataFrame | The limit order events DataFrame. | required |
| trades | DataFrame | The executions DataFrame. | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | The events DataFrame with an updated 'type' column indicating order types. |

Order Book Reconstruction

order_book

order_book(
    events: DataFrame,
    tp: datetime | None = None,
    max_levels: int | None = None,
    bps_range: int = 0,
    min_bid: float = 0,
    max_ask: float = np.inf,
) -> dict[str, datetime | pd.Timestamp | pd.DataFrame]

Reconstruct the order book at a specific point in time.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| events | DataFrame | DataFrame containing order events. | required |
| tp | datetime or Timestamp | The point in time at which to evaluate the order book. If None, uses the latest event timestamp in the data. | None |
| max_levels | int | The maximum number of price levels to include for bids and asks. | None |
| bps_range | int | Basis points range to filter the bids and asks. Default is 0. | 0 |
| min_bid | float | Minimum bid price. Default is 0. | 0 |
| max_ask | float | Maximum ask price. Default is infinity. | inf |

Returns:

| Type | Description |
| --- | --- |
| dict[str, datetime or DataFrame] | A dictionary containing 'timestamp' (the evaluation timestamp), 'asks' (DataFrame of active ask orders), and 'bids' (DataFrame of active bid orders). |
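
The general idea of a point-in-time reconstruction can be sketched as follows: keep only events up to the evaluation time, take the latest event per order, and drop orders that were deleted or have no remaining volume. The helper name `book_at`, the `id` and `volume` columns, and the `"created"`/`"changed"`/`"deleted"` action values are assumptions for this sketch; order_book itself may use a different schema and additional filters (max_levels, bps_range, min_bid, max_ask).

```python
import pandas as pd

# Toy event log. Column names and action values are assumed for illustration.
events = pd.DataFrame(
    {
        "id": [1, 2, 3, 1, 2, 4],
        "timestamp": pd.to_datetime(
            [
                "2024-01-01 10:00:00",
                "2024-01-01 10:00:01",
                "2024-01-01 10:00:02",
                "2024-01-01 10:00:03",
                "2024-01-01 10:00:04",
                "2024-01-01 10:00:05",
            ]
        ),
        "action": ["created", "created", "created", "changed", "deleted", "created"],
        "direction": ["bid", "ask", "ask", "bid", "ask", "bid"],
        "price": [99.0, 101.0, 102.0, 99.0, 101.0, 98.5],
        "volume": [5.0, 3.0, 2.0, 1.0, 0.0, 4.0],
    }
)


def book_at(events: pd.DataFrame, tp: pd.Timestamp) -> dict:
    """Return the orders still live at `tp`, split into bids and asks."""
    past = events[events["timestamp"] <= tp].sort_values("timestamp")
    # The most recent event per order id is that order's state at `tp`.
    latest = past.groupby("id").tail(1)
    live = latest[(latest["action"] != "deleted") & (latest["volume"] > 0)]
    # Best prices first: highest bid, lowest ask.
    bids = live[live["direction"] == "bid"].sort_values("price", ascending=False)
    asks = live[live["direction"] == "ask"].sort_values("price", ascending=True)
    return {"timestamp": tp, "bids": bids, "asks": asks}


book = book_at(events, pd.Timestamp("2024-01-01 10:00:04"))
print(book["bids"][["id", "price", "volume"]])
print(book["asks"][["id", "price", "volume"]])
```

At the chosen time, order 2 has already been deleted and order 4 has not yet arrived, so only order 1 (at its reduced volume) and order 3 remain in the sketch's book.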