LOBSTER Format ¶
Support for LOBSTER message files (event types 1–7), orderbook-backed depth, and round-trip writers.
Use via `Pipeline(format=LobsterFormat(...))` or `Pipeline.from_format("lobster", ...)`.
Key differences from Bitstamp:
- Executions in the message file: `LobsterTradeReader` builds trades directly from type 4/5 rows in the events DataFrame (no separate trades file).
- Orderbook-backed depth: `LobsterFormat.compute_depth` reads the official orderbook file for ground-truth depth instead of reconstructing it from events.
- Integer prices: raw prices are in ten-thousandths of a dollar (`price_divisor=10000`).
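The integer price convention in the last bullet can be illustrated with a minimal sketch. It uses plain Python rather than the library itself; the helper name `to_dollars` and the sample values are assumptions made for illustration only.

```python
PRICE_DIVISOR = 10_000  # raw LOBSTER prices are ten-thousandths of a dollar


def to_dollars(raw_price: int, price_divisor: int = PRICE_DIVISOR) -> float:
    """Convert a raw integer price to dollars (hypothetical helper)."""
    return raw_price / price_divisor


raw_prices = [5853300, 5853400, 1000]  # made-up sample values
dollar_prices = [to_dollars(p) for p in raw_prices]
print(dollar_prices)
```

Keeping prices as integers until the last moment avoids accumulating floating-point error when prices are compared or grouped by level.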
LobsterLoader ¶
Load raw limit-order events from LOBSTER message files.
Satisfies the `ob_analytics.protocols.EventLoader` protocol.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `PipelineConfig` | Pipeline configuration. | `None` |
| `trading_date` | `str` or `Timestamp` | The calendar date of the trading session (LOBSTER timestamps are seconds after midnight and need a date anchor). | required |
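Why `trading_date` is needed can be shown with a small standard-library sketch: a LOBSTER timestamp only becomes an absolute time once it is added to midnight of the session date. The function name `anchor_timestamp` is hypothetical, and `datetime` keeps only microsecond precision, so this is an illustration of the idea rather than the loader's actual conversion.

```python
from datetime import datetime, timedelta


def anchor_timestamp(trading_date: str, seconds_after_midnight: float) -> datetime:
    """Turn a seconds-after-midnight value into an absolute timestamp."""
    midnight = datetime.strptime(trading_date, "%Y-%m-%d")
    return midnight + timedelta(seconds=seconds_after_midnight)


# 34200 seconds after midnight is 09:30:00; the date and value are made up
ts = anchor_timestamp("2012-06-21", 34200.5)
print(ts.isoformat())
```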
load ¶
Load LOBSTER message data and return a cleaned events DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `source` | `str`, `Path`, or directory | Path to a LOBSTER message CSV, or a directory containing message/orderbook file pairs. When a directory is given, the loader auto-discovers files by the LOBSTER naming convention. | required |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | Cleaned events DataFrame. |
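Directory auto-discovery could work roughly as sketched below. The sketch assumes the commonly used LOBSTER file naming `TICKER_YYYY-MM-DD_START_END_message_LEVELS.csv` (with `orderbook` in place of `message` for the companion file); the function `discover_pairs` and the regular expression are illustrative assumptions, not the loader's actual implementation.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming: TICKER_YYYY-MM-DD_START_END_{message|orderbook}_LEVELS.csv
_LOBSTER_NAME = re.compile(
    r"^(?P<ticker>[^_]+)_(?P<date>\d{4}-\d{2}-\d{2})_(?P<start>\d+)_(?P<end>\d+)"
    r"_(?P<kind>message|orderbook)_(?P<levels>\d+)\.csv$"
)


def discover_pairs(directory: Path) -> list[tuple[Path, Path]]:
    """Return (message, orderbook) path pairs that share ticker, date, window and levels."""
    found: dict[tuple[str, ...], dict[str, Path]] = {}
    for path in sorted(directory.iterdir()):
        match = _LOBSTER_NAME.match(path.name)
        if match is None:
            continue  # not a LOBSTER-style file name
        key = (match["ticker"], match["date"], match["start"], match["end"], match["levels"])
        found.setdefault(key, {})[match["kind"]] = path
    # Keep only complete pairs; a message file without its orderbook is skipped
    return [
        (files["message"], files["orderbook"])
        for files in found.values()
        if "message" in files and "orderbook" in files
    ]


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in (
        "AMZN_2012-06-21_34200000_57600000_message_10.csv",
        "AMZN_2012-06-21_34200000_57600000_orderbook_10.csv",
        "MSFT_2012-06-21_34200000_57600000_message_10.csv",  # unpaired
        "notes.txt",
    ):
        (folder / name).touch()
    pairs = [(m.name, o.name) for m, o in discover_pairs(folder)]

print(pairs)
```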
LobsterTradeReader ¶
Build trades directly from LOBSTER execution events.
In LOBSTER, each execution event (type 4 or 5) represents the resting (maker) side of a trade. This reader builds trade records directly from those rows in the events frame; no matching is needed because the data already pairs maker rows with executions.
Satisfies the `ob_analytics.protocols.TradeSource` protocol.
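The idea of building trades from execution rows can be sketched without the library. The tuple layout below follows the six LOBSTER message columns (time, type, order ID, size, price, direction); the `Trade` record, its field names, and `trades_from_events` are hypothetical and do not reflect the reader's real output schema.

```python
from dataclasses import dataclass

EXECUTION_TYPES = {4, 5}  # 4 = execution of a visible order, 5 = of a hidden order


@dataclass(frozen=True)
class Trade:
    timestamp: float      # seconds after midnight
    price: float          # dollars
    volume: int
    maker_order_id: int
    taker_direction: str  # "buy" or "sell"


def trades_from_events(rows, price_divisor=10_000):
    """rows: iterable of (time, type, order_id, size, raw_price, direction)."""
    trades = []
    for time, event_type, order_id, size, raw_price, direction in rows:
        if event_type not in EXECUTION_TYPES:
            continue  # submissions, cancellations, deletions, etc. are not trades
        # direction describes the resting (maker) order: 1 = buy, -1 = sell,
        # so the aggressor (taker) is on the opposite side
        taker = "sell" if direction == 1 else "buy"
        trades.append(Trade(time, raw_price / price_divisor, size, order_id, taker))
    return trades


sample_rows = [  # made-up events
    (34200.01, 1, 1001, 100, 5853300, -1),  # new sell limit order
    (34200.02, 4, 1001, 40, 5853300, -1),   # partial execution of that order
    (34200.03, 3, 1002, 50, 5850000, 1),    # deletion of a buy order
    (34200.04, 5, 0, 25, 5852000, 1),       # hidden execution (illustrative ID)
]
trades = trades_from_events(sample_rows)
print(trades)
```

Because each execution row already identifies the maker order, the sketch is a single filter and projection; no bid/ask matching step is involved.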
load ¶
Build a trades DataFrame from LOBSTER execution events.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `events` | `DataFrame` | Events with | required |
| `source` | `Any` | Unused; trade information is embedded in events. | required |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | Trades with |
LobsterWriter ¶
LobsterWriter(
    config: PipelineConfig | None = None,
    *,
    trading_date: str | Timestamp,
    price_divisor: int | None = None,
)
Write pipeline events back to LOBSTER dual-file format.
Satisfies the `ob_analytics.protocols.DataWriter` protocol.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `trading_date` | `str` or `Timestamp` | Calendar date of the session. | required |
| `price_divisor` | `int` | Multiplier to convert decimal prices back to LOBSTER integers. | `None` |
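Converting decimal prices back to LOBSTER integers is a multiplication followed by rounding. The following is a minimal sketch under the assumption of a divisor of 10000; `to_lobster_price` is a hypothetical helper, not the writer's actual code.

```python
def to_lobster_price(price: float, price_divisor: int = 10_000) -> int:
    """Scale a decimal dollar price to a LOBSTER integer price."""
    # round() rather than int(): int() truncates, so a float product that lands
    # just below a whole number would come out one tick too low
    return round(price * price_divisor)


# Round trip: integer -> dollars -> integer should reproduce the input
raw_prices = [5853300, 5853400, 1, 9999999]  # made-up sample values
round_tripped = [to_lobster_price(raw / 10_000) for raw in raw_prices]
print(round_tripped)
```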
write ¶
write(
    data: dict[str, DataFrame],
    dest: str | Path,
    *,
    ticker: str = "DATA",
    num_levels: int = 10,
    **kwargs: Any,
) -> tuple[Path, Path]
Write events to LOBSTER message + orderbook files.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `dict` | Must contain | required |
| `dest` | `str` or `Path` | Output directory. | required |
| `ticker` | `str` | Ticker symbol for filename. | `'DATA'` |
| `num_levels` | `int` | Number of orderbook levels to write. | `10` |
Returns:
| Type | Description |
|---|---|
| tuple of `Path` | Paths to the written message and orderbook files. |
LobsterFormat dataclass ¶
Format descriptor for LOBSTER limit-order-book data.
`trading_date` is taken from the per-run `ob_analytics.protocols.RunContext`, not the format constructor, so the same `LobsterFormat()` instance can be reused across runs with different sessions.
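The design choice described above, keeping session-specific values out of the format object, can be sketched with stand-in classes. `SketchFormat`, `SketchRunContext`, and `describe_run` are hypothetical names invented for this illustration; they are not the library's `LobsterFormat` or `RunContext` and do not mirror their fields.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SketchFormat:
    """Stand-in format descriptor: holds only session-independent settings."""
    price_divisor: int = 10_000


@dataclass(frozen=True)
class SketchRunContext:
    """Stand-in per-run context: holds what changes between runs."""
    trading_date: date


def describe_run(fmt: SketchFormat, ctx: SketchRunContext) -> str:
    """Combine the shared format with one run's context."""
    return f"{ctx.trading_date.isoformat()} @ 1/{fmt.price_divisor}"


fmt = SketchFormat()  # created once
runs = [describe_run(fmt, SketchRunContext(date(2012, 6, day))) for day in (21, 22)]
print(runs)
```

Since the format object carries no date, nothing has to be rebuilt or mutated between sessions; only the small context object changes.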
lobster_depth_from_orderbook ¶
lobster_depth_from_orderbook(
    events: DataFrame,
    orderbook_path: Path,
    config: PipelineConfig,
) -> tuple[pd.DataFrame, pd.DataFrame]
Compute depth and depth summary from the LOBSTER orderbook file.
The LOBSTER orderbook file is ground truth: it records the complete
visible book state after every message event. This function converts
it into the (depth, depth_summary) pair the pipeline expects,
avoiding the need to reconstruct depth from message events (which
fails when events reference pre-market orders absent from the
message file).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `events` | `DataFrame` | Events DataFrame (used only for timestamps and event IDs). | required |
| `orderbook_path` | `Path` | Path to the LOBSTER orderbook CSV. | required |
| `config` | `PipelineConfig` | Pipeline configuration. | required |
Returns:
| Type | Description |
|---|---|
| tuple of (depth DataFrame, depth_summary DataFrame) | The (depth, depth_summary) pair the pipeline expects. |
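To make the orderbook-backed approach concrete, here is a minimal sketch of reading one orderbook row. It assumes the usual LOBSTER column layout (ask price 1, ask size 1, bid price 1, bid size 1, ask price 2, and so on) and that unoccupied levels carry a size of 0 with a placeholder price; `parse_orderbook_row` and its return shape are hypothetical and simpler than the depth frames this function produces.

```python
def parse_orderbook_row(row, price_divisor=10_000):
    """Split one flat orderbook row into ask and bid levels.

    row: sequence of 4 * num_levels integers laid out as
    ask_price_1, ask_size_1, bid_price_1, bid_size_1, ask_price_2, ...
    Returns (asks, bids), each a list of (price_in_dollars, size), best level first.
    """
    if len(row) % 4 != 0:
        raise ValueError("expected 4 values per level")
    asks, bids = [], []
    for i in range(0, len(row), 4):
        ask_price, ask_size, bid_price, bid_size = row[i:i + 4]
        # Assumption: a size of 0 marks an unoccupied level, so it is skipped
        if ask_size > 0:
            asks.append((ask_price / price_divisor, ask_size))
        if bid_size > 0:
            bids.append((bid_price / price_divisor, bid_size))
    return asks, bids


# Made-up two-level row; the second bid level is empty
row = [5853300, 100, 5852000, 50, 5853400, 200, -9999999999, 0]
asks, bids = parse_orderbook_row(row)
print(asks, bids)
```

Each orderbook row corresponds to the book state after the message on the same line of the message file, which is why the events frame is needed only for timestamps and event IDs.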