Data I/O
Parquet serialization and writer registry.
load_data
Load pre-processed pipeline data from a Parquet directory or pickle file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str or Path | If path is a directory, each | required |
Returns:
| Type | Description |
|---|---|
| dict of str to pandas.DataFrame | |
save_data
save_data(
    lob_data: dict[str, DataFrame],
    path: str | Path,
    *,
    fmt: str = "parquet",
    writer: DataWriter | None = None,
    config: Any = None,
    ctx: Any = None,
    **write_kwargs: Any,
) -> None
Save pipeline data to disk.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lob_data | dict of str to pandas.DataFrame | The DataFrames to save (keys become file stems). | required |
| path | str or Path | Destination directory (Parquet) or file (pickle). | required |
| fmt | str | Serialisation format. Built-in values are | 'parquet' |
| writer | DataWriter | A pre-constructed writer instance. When provided, fmt is ignored and the writer is used directly. This is the preferred path when saving from a :class: | None |
| config | Any | Forwarded to a registered writer factory when | None |
| ctx | Any | Forwarded to a registered writer factory when | None |
| **write_kwargs | Any | Extra keyword arguments forwarded to | {} |
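The destination layout described in the table can be sketched without touching the disk. This is only an illustration of "keys become file stems": the helper name `planned_targets`, the `.parquet` suffix, the `"pickle"` format name, and the example keys are assumptions, not part of the documented API.

```python
from pathlib import Path


def planned_targets(keys, path, fmt="parquet"):
    """Map each key to the file it would be written to (no I/O is performed)."""
    root = Path(path)
    if fmt == "parquet":
        # Directory destination: one file per key, with the key as the file stem.
        # The ".parquet" suffix is an assumption made for this sketch.
        return {key: root / f"{key}.parquet" for key in keys}
    # Single-file destination (e.g. pickle): every key lands in the same file.
    return {key: root for key in keys}


# Hypothetical keys, chosen only for illustration.
targets = planned_targets(["events", "trades"], "out")
print(sorted(p.stem for p in targets.values()))
```

With a directory destination each DataFrame gets its own file, so a single table can later be read back without loading the others.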
register_writer
Register a writer factory under name for use with save_data(fmt=name, ctx=...).
The factory is called as factory(config, ctx) and must return a DataWriter. This is what lets format-specific writers (e.g. ob_analytics.lobster.LobsterWriter, which needs trading_date) participate in the registry: they pull their required parameters from the ob_analytics.protocols.RunContext.
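The registry pattern described above can be sketched in a few lines. This is a toy model under stated assumptions, not the library's implementation: `register_writer_sketch`, `resolve_writer`, `DatedWriter`, the `write` method name, and the single-field `RunContext` stand-in are all hypothetical. Only the `factory(config, ctx)` call shape and the rule that an explicit writer overrides fmt come from the text.

```python
from dataclasses import dataclass
from typing import Any, Callable, Protocol


class DataWriter(Protocol):
    # Hypothetical method name; the real DataWriter interface is not shown here.
    def write(self, data: dict[str, Any], path: str, **kwargs: Any) -> None: ...


@dataclass
class RunContext:
    # Stand-in for the real RunContext; only trading_date is modelled.
    trading_date: str


WriterFactory = Callable[[Any, Any], DataWriter]
_WRITERS: dict[str, WriterFactory] = {}


def register_writer_sketch(name: str, factory: WriterFactory) -> None:
    """Store a factory under a format name."""
    _WRITERS[name] = factory


class DatedWriter:
    """Toy format-specific writer that cannot be built without a trading date."""

    def __init__(self, trading_date: str) -> None:
        self.trading_date = trading_date
        self.written: list[str] = []

    def write(self, data: dict[str, Any], path: str, **kwargs: Any) -> None:
        # Record target names instead of touching the disk.
        for key in data:
            self.written.append(f"{path}/{key}_{self.trading_date}")


def dated_factory(config: Any, ctx: RunContext) -> DatedWriter:
    # The factory pulls the parameter the writer needs from the context.
    return DatedWriter(trading_date=ctx.trading_date)


def resolve_writer(fmt: str, writer: Any = None, config: Any = None, ctx: Any = None):
    """An explicit writer wins; otherwise build one from the registry."""
    if writer is not None:
        return writer
    return _WRITERS[fmt](config, ctx)


register_writer_sketch("dated", dated_factory)
w = resolve_writer("dated", ctx=RunContext(trading_date="2024-01-02"))
w.write({"events": None, "trades": None}, "out")
print(w.written)
```

Because the factory receives the context at save time, the caller never has to know which constructor arguments a given format needs.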