Protocol Interfaces¶
Contracts that pluggable pipeline components must satisfy. Uses structural (duck) typing — implement the right method signature and it works, no inheritance required.
| Protocol | Method | Purpose |
|---|---|---|
EventLoader |
load(source) → DataFrame |
Parse raw data into events |
TradeSource |
load(events, source) → DataFrame |
Build the canonical trades DataFrame |
DataWriter |
write(data, dest) |
Serialize pipeline outputs |
Format |
factory methods | Bundle loader, trade source, and writer for a venue |
EventLoader ¶
Bases: Protocol
Loads raw order-book events from a data source.
The returned DataFrame must contain at least the columns required by
ob_analytics.schemas.validate_events_df.
load ¶
Load events from source and return a DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source
|
Any
|
Data source identifier. The canonical type is |
required |
Returns:
| Type | Description |
|---|---|
DataFrame
|
Events with at least the columns required by
|
TradeSource ¶
Bases: Protocol
Builds the trades DataFrame for a given run.
Implementations read explicit trade records (a separate
trades.csv, LOBSTER execution rows embedded in the events
frame, etc.) and project them into the canonical trades schema.
Returned DataFrame columns:
timestamp— pandas datetime64[ns]price— floatvolume— floatdirection— categoricalbuy/sell(taker side)maker_event_id— integer event id of the resting ordertaker_event_id— integer event id of the aggressing ordermaker— order id of the resting ordertaker— order id of the aggressing ordermaker_og— original_number of the maker eventtaker_og— original_number of the taker event
load ¶
Build the trades DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
events
|
DataFrame
|
The processed events frame (post-loader). |
required |
source
|
Any
|
The same |
required |
Returns:
| Type | Description |
|---|---|
DataFrame
|
|
DataWriter ¶
Bases: Protocol
Writes pipeline results to a format-specific output.
write ¶
Write pipeline DataFrames to dest.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data
|
dict of str to DataFrame
|
Pipeline output keyed by name (e.g. |
required |
dest
|
str or Path
|
Output path (file or directory, format-dependent). |
required |
Format ¶
Bases: Protocol
Structural contract for a data-format descriptor.
A Format bundles the per-format factories the pipeline needs: how to
load events, how to acquire trades, and (optionally) how to write
results or compute depth directly. Pass instances to
Pipeline(format=...).
There is no base class to inherit — any object providing these
members satisfies the contract (structural typing). name is a
short lowercase identifier (e.g. "bitstamp").
create_loader ¶
Return a loader for this format.
create_trade_source ¶
Return a trade source for this format.
create_writer ¶
Return a writer for this format, or None if unsupported.
compute_depth ¶
compute_depth(
events: DataFrame,
config: Any,
source: Any,
ctx: RunContext,
) -> tuple[pd.DataFrame, pd.DataFrame] | None
Return (depth, depth_summary) to override the standard
depth pipeline, or None to use it.
config_defaults ¶
Return default :class:PipelineConfig overrides for this format.
required_context ¶
:class:RunContext field names this format requires.
E.g. LOBSTER returns ["trading_date"] because its filenames carry
no date; Bitstamp returns []. Lets the CLI/pipeline validate
required context generically instead of special-casing format names.
Callers should treat a missing method as [] (structural default).