Shared schema and artifact contracts for the ML4T library ecosystem.
ml4t-specs provides the small set of shared types that multiple ML4T libraries use to
describe:
- market data column mappings and feed semantics
- artifact metadata and storage conventions
- lightweight YAML/JSON spec payloads
It exists so the higher-level libraries can exchange consistent contracts without re-defining the same dataclasses in multiple repos.
Today it is used by:
- ml4t-backtest for FeedSpec and market-data execution semantics
- ml4t-engineer for artifact metadata
- ml4t-diagnostic for artifact and backtest-result integration
- ml4t-models as an optional integration bridge when ml4t-specs is installed

To install:

    pip install ml4t-specs

FeedSpec defines how downstream libraries should interpret a tradable price table:
- timestamp column
- entity column
- price / OHLCV columns
- quote columns
- calendar and timezone
- data frequency and timestamp semantics

For example:

    from ml4t.specs import FeedSpec

    feed = FeedSpec(
        timestamp_col="date",
        entity_col="ticker",
        close_col="settle",
        price_col="settle",
        calendar="NYSE",
        timezone="America/New_York",
        data_frequency="daily",
    )

MarketDataSpec bundles schema, semantics, and artifact metadata into one serializable object:

    from ml4t.specs import ArtifactStorage, MarketDataSchema, MarketDataSemantics, MarketDataSpec

    spec = MarketDataSpec(
        artifact_id="us_equities_daily",
        schema=MarketDataSchema(timestamp_col="date", entity_col="ticker", close_col="close"),
        semantics=MarketDataSemantics(calendar="NYSE", data_frequency="daily"),
        storage=ArtifactStorage(path="data/us_equities_daily.parquet"),
    )

The base artifact layer gives ML4T libraries a shared way to talk about persisted outputs:
- ArtifactKind
- ArtifactStorage
- ArtifactProvenance
- ArtifactSpec
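
As a rough illustration of what a shared artifact contract provides, the sketch below models a persisted output with plain standard-library dataclasses and round-trips it through JSON. It does not use ml4t-specs at all: `SketchKind`, `SketchStorage`, `SketchArtifact`, the enum members, and the `to_payload` / `from_payload` helpers are hypothetical stand-ins invented for this example. Only the `artifact_id` and `path` values are borrowed from the MarketDataSpec example above.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class SketchKind(str, Enum):
    """Hypothetical artifact categories (not the real ArtifactKind members)."""

    MARKET_DATA = "market_data"
    BACKTEST_RESULT = "backtest_result"


@dataclass(frozen=True)
class SketchStorage:
    """Hypothetical stand-in: where the artifact lives on disk."""

    path: str


@dataclass(frozen=True)
class SketchArtifact:
    """Hypothetical stand-in for an artifact description."""

    artifact_id: str
    kind: SketchKind
    storage: SketchStorage

    def to_payload(self) -> dict:
        # asdict() converts nested dataclasses to dicts; the enum is
        # replaced by its plain string value so the payload is JSON-friendly.
        payload = asdict(self)
        payload["kind"] = self.kind.value
        return payload

    @classmethod
    def from_payload(cls, payload: dict) -> "SketchArtifact":
        # Rebuild the typed objects from the plain payload.
        return cls(
            artifact_id=payload["artifact_id"],
            kind=SketchKind(payload["kind"]),
            storage=SketchStorage(**payload["storage"]),
        )


artifact = SketchArtifact(
    artifact_id="us_equities_daily",
    kind=SketchKind.MARKET_DATA,
    storage=SketchStorage(path="data/us_equities_daily.parquet"),
)

# Serialize to text, then rebuild an equal object from that text.
text = json.dumps(artifact.to_payload(), indent=2)
restored = SketchArtifact.from_payload(json.loads(text))
```

The point of the sketch is the round trip: two libraries that agree on one payload shape can hand artifacts to each other without either importing the other's internal classes.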

Spec payloads can be written to and read back from YAML files:

    from ml4t.specs import read_spec_payload, write_spec_payload

    write_spec_payload(spec, "market_data.yaml")
    loaded = read_spec_payload("market_data.yaml")

The public ML4T libraries share a few contract types at their boundaries. Keeping them here:
- reduces duplication
- keeps cross-library serialization consistent
- gives backtest, modeling, engineering, and diagnostics code one shared contract vocabulary
This package is intentionally small. It is a support layer, not a full end-user workflow library.
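
To make the "shared contract vocabulary" idea concrete, here is a minimal sketch of how a consuming library could use a column mapping to normalize a price table. It is standard-library only and does not reflect how any ML4T library is implemented: `ColumnMapping` and `normalize_rows` are hypothetical names, the canonical output names (`timestamp`, `entity`, `close`) are an assumption made for this example, and the rows are made-up toy values. The three field names mirror the `timestamp_col`, `entity_col`, and `close_col` arguments used in the FeedSpec example earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMapping:
    """Hypothetical stand-in for a shared column contract (not the real FeedSpec)."""

    timestamp_col: str
    entity_col: str
    close_col: str


def normalize_rows(rows: list[dict], mapping: ColumnMapping) -> list[dict]:
    """Rename mapped source columns to the canonical names chosen for this sketch."""
    renames = {
        mapping.timestamp_col: "timestamp",
        mapping.entity_col: "entity",
        mapping.close_col: "close",
    }
    # Fail early if any row lacks a column the contract says must exist.
    missing = [col for col in renames if any(col not in row for row in rows)]
    if missing:
        raise KeyError(f"rows are missing mapped columns: {missing}")
    # Unmapped columns pass through under their original names.
    return [
        {renames.get(key, key): value for key, value in row.items()}
        for row in rows
    ]


mapping = ColumnMapping(timestamp_col="date", entity_col="ticker", close_col="settle")

# Toy rows with made-up values, using the source column names.
rows = [
    {"date": "2024-01-02", "ticker": "AAA", "settle": 100.0, "volume": 10},
    {"date": "2024-01-03", "ticker": "AAA", "settle": 101.5, "volume": 12},
]

normalized = normalize_rows(rows, mapping)
```

Because the mapping travels with the data, each consumer can apply the same renaming rather than hard-coding its own guesses about which column holds the price.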

To set up a development environment and run the checks:

    git clone https://github.com/ml4t/specs.git
    cd specs
    uv sync --dev
    uv run ruff check src/ tests/
    uv run ty check
    uv run pytest tests/ -q
    uv build

License: MIT