Quant Lab demo #13

@rzimmerdev

Issue 1: Redesign Storage Layer for Structured Data

  • Replace JSON-per-key in HDF5 with structured tables (compound datasets).
  • Ensure datasets are appendable (with h5py: maxshape=(None,), chunks=True; note that maxshape=None leaves the dataset fixed-size).
  • Make columns explicit (date, symbol, open, close, volume, factors).

Example: HDF5 compound dataset or Parquet + Polars.
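
A minimal sketch of the HDF5 option, assuming h5py and NumPy are the libraries in use. The dtypes, the single "momentum" factor column, and the helper names (BAR_DTYPE, append_rows, day) are illustrative assumptions, not part of the existing code base.

```python
import h5py
import numpy as np

# Columns follow the issue text; dtypes and the single factor column are assumptions.
BAR_DTYPE = np.dtype([
    ("date", "i8"),       # days since 1970-01-01
    ("symbol", "S12"),    # fixed-width byte string
    ("open", "f8"),
    ("close", "f8"),
    ("volume", "i8"),
    ("momentum", "f8"),   # stand-in for "factors"
])


def day(iso):
    """Convert an ISO date string to days since the Unix epoch."""
    return int(np.datetime64(iso, "D").astype("int64"))


def append_rows(dataset, rows):
    """Grow the dataset along axis 0 and write `rows` at the end."""
    start = dataset.shape[0]
    dataset.resize((start + len(rows),))
    dataset[start:] = rows


# In-memory file (core driver, no backing store) so the sketch writes nothing to disk.
with h5py.File("bars_demo.h5", "w", driver="core", backing_store=False) as f:
    bars = f.create_dataset(
        "bars", shape=(0,), maxshape=(None,), chunks=True, dtype=BAR_DTYPE
    )
    day1 = np.array(
        [(day("2024-01-02"), b"AAA", 10.0, 10.5, 1000, 0.3),
         (day("2024-01-02"), b"BBB", 20.0, 19.5, 500, -0.1)],
        dtype=BAR_DTYPE,
    )
    day2 = np.array(
        [(day("2024-01-03"), b"AAA", 10.5, 10.8, 1200, 0.4)],
        dtype=BAR_DTYPE,
    )
    append_rows(bars, day1)
    append_rows(bars, day2)

    data = bars[:]  # read everything back as a NumPy structured array
    total_rows = data.shape[0]
    aaa_closes = [float(c) for c in data[data["symbol"] == b"AAA"]["close"]]
```

Because the dataset is created with maxshape=(None,), each append is a resize plus a slice write, with no rewrite of existing rows.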

Issue 2: Implement Asset Metadata System

  • Store asset metadata separately: symbol, class, region, sector, market cap, liquidity, currency, dividend yield, factor scores.
  • Allow fast filtering and universe selection without scanning time-series.
  • Maintain mapping between metadata and corresponding data storage.
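
One possible shape for the metadata layer, sketched with the standard library only. The AssetMeta fields are a subset of those listed above; storage_key, avg_daily_volume as the liquidity measure, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetMeta:
    symbol: str
    asset_class: str
    region: str
    sector: str
    market_cap: float        # unit assumed to be USD
    avg_daily_volume: float  # liquidity proxy (assumption)
    currency: str
    storage_key: str         # hypothetical pointer to the time-series dataset


# Metadata lives in its own small table, keyed by symbol.
METADATA = {
    m.symbol: m
    for m in [
        AssetMeta("AAA", "equity", "US", "tech", 5e10, 2e6, "USD", "/bars/AAA"),
        AssetMeta("BBB", "equity", "BR", "energy", 8e9, 4e5, "BRL", "/bars/BBB"),
        AssetMeta("CCC", "bond", "BR", "sovereign", 0.0, 1e6, "BRL", "/bars/CCC"),
    ]
}


def select_universe(metadata, *, asset_class=None, region=None, min_volume=0.0):
    """Return sorted symbols matching the criteria, touching metadata only."""
    return sorted(
        m.symbol
        for m in metadata.values()
        if (asset_class is None or m.asset_class == asset_class)
        and (region is None or m.region == region)
        and m.avg_daily_volume >= min_volume
    )


liquid_equities = select_universe(METADATA, asset_class="equity", min_volume=1e6)

# The metadata record carries the mapping to where the time series is stored.
brazil_keys = [METADATA[s].storage_key for s in select_universe(METADATA, region="BR")]
```

Universe selection here never opens the time-series store; only the selected storage keys are read afterwards.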

Issue 3: Create Hierarchical Filtering Pipeline

  • Add Pipeline and Filter classes.
  • Implement a multi-step filtering process that runs before strategy execution and backtesting:
      • Remove illiquid/extreme assets.
      • Select the universe by strategy/asset class.
      • Compute risk metrics and factor exposures.
      • Feed the filtered dataset into the backtest or optimizer.
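
The steps above could be sketched as follows. Pipeline and Filter are the class names from this issue; the Enrich step, the dict-based asset records, and the thresholds are assumptions added to make the example runnable.

```python
from dataclasses import dataclass, field
from statistics import pstdev
from typing import Callable


@dataclass
class Filter:
    """Keeps only the assets for which `predicate` is true."""
    name: str
    predicate: Callable[[dict], bool]

    def apply(self, assets):
        return [a for a in assets if self.predicate(a)]


@dataclass
class Enrich:
    """Adds a computed field to each asset, returning new dicts (hypothetical step)."""
    name: str
    compute: Callable[[dict], float]

    def apply(self, assets):
        return [{**a, self.name: self.compute(a)} for a in assets]


@dataclass
class Pipeline:
    """Runs its steps in order, feeding each step's output to the next."""
    steps: list = field(default_factory=list)

    def run(self, assets):
        for step in self.steps:
            assets = step.apply(assets)
        return assets


assets = [
    {"symbol": "AAA", "asset_class": "equity", "volume": 2_000_000,
     "returns": [0.01, -0.02, 0.015]},
    {"symbol": "BBB", "asset_class": "equity", "volume": 50_000,
     "returns": [0.05, -0.06, 0.04]},
    {"symbol": "CCC", "asset_class": "bond", "volume": 900_000,
     "returns": [0.001, 0.002, 0.001]},
]

pipeline = Pipeline([
    Filter("liquid", lambda a: a["volume"] >= 500_000),          # drop illiquid assets
    Filter("equities", lambda a: a["asset_class"] == "equity"),  # select the universe
    Enrich("volatility", lambda a: pstdev(a["returns"])),        # simple risk metric
])

selected = pipeline.run(assets)  # this list would go to the backtest or optimizer
```

Ordering cheap filters before the metric computation means risk metrics are only computed for assets that survive the earlier steps.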

Issue 4: Define Universe Templates

  • Predefine pools of assets for repeated strategy testing (e.g., “Global Equities”, “Brazil Bonds”, “Multi-asset ETFs”).
  • Templates should include filtering criteria: liquidity, size, asset class, region.
  • Ensure templates are easily selectable and interchangeable in backtests.
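
A template can be plain data plus a match rule, as in this sketch. The template names come from the list above; UniverseTemplate, its fields, the thresholds, and the sample assets are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UniverseTemplate:
    """Filtering criteria stored as data; None means 'no constraint'."""
    name: str
    asset_class: Optional[str] = None
    region: Optional[str] = None
    min_market_cap: float = 0.0
    min_avg_volume: float = 0.0

    def matches(self, asset: dict) -> bool:
        return (
            (self.asset_class is None or asset["asset_class"] == self.asset_class)
            and (self.region is None or asset["region"] == self.region)
            and asset["market_cap"] >= self.min_market_cap
            and asset["avg_volume"] >= self.min_avg_volume
        )


# Registry keyed by name so a backtest can switch universes with one string.
TEMPLATES = {
    t.name: t
    for t in [
        UniverseTemplate("Global Equities", asset_class="equity", min_avg_volume=1e6),
        UniverseTemplate("Brazil Bonds", asset_class="bond", region="BR"),
        UniverseTemplate("Multi-asset ETFs", asset_class="etf"),
    ]
}


def build_universe(template_name, assets):
    """Return the symbols of all assets matching the named template."""
    template = TEMPLATES[template_name]
    return [a["symbol"] for a in assets if template.matches(a)]


ASSETS = [
    {"symbol": "AAA", "asset_class": "equity", "region": "US",
     "market_cap": 5e10, "avg_volume": 2e6},
    {"symbol": "BBB", "asset_class": "equity", "region": "BR",
     "market_cap": 8e9, "avg_volume": 4e5},
    {"symbol": "CCC", "asset_class": "bond", "region": "BR",
     "market_cap": 0.0, "avg_volume": 1e6},
    {"symbol": "DDD", "asset_class": "etf", "region": "US",
     "market_cap": 1e9, "avg_volume": 3e6},
]

global_equities = build_universe("Global Equities", ASSETS)
brazil_bonds = build_universe("Brazil Bonds", ASSETS)
```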

Issue 5: Strategy-Specific Views

  • Each strategy should work on its own filtered subset of the universe.

Examples: Momentum strategy → top 1000 liquid equities; Value strategy → equities by book-to-price ratio; Multi-asset → ETFs across classes/regions.

  • This supports modular strategy testing and avoids data contamination across strategies.
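
A rough sketch of per-strategy views. The view names mirror the examples above, with N shrunk from 1000 to 2 for the demo; ranking value candidates by highest book-to-price and isolating views with a deep copy are assumptions, not decisions recorded in this issue.

```python
import copy

UNIVERSE = [
    {"symbol": "AAA", "asset_class": "equity", "avg_volume": 3e6, "book_to_price": 0.4},
    {"symbol": "BBB", "asset_class": "equity", "avg_volume": 1e6, "book_to_price": 0.9},
    {"symbol": "CCC", "asset_class": "equity", "avg_volume": 2e6, "book_to_price": 0.6},
    {"symbol": "DDD", "asset_class": "etf", "avg_volume": 5e6, "book_to_price": None},
]


def equities(assets):
    return [a for a in assets if a["asset_class"] == "equity"]


def top_n(assets, key, n):
    """Return the n assets with the largest value of `key`."""
    return sorted(assets, key=lambda a: a[key], reverse=True)[:n]


# Each strategy names the subset of the universe it is allowed to see.
VIEWS = {
    "momentum": lambda u: top_n(equities(u), "avg_volume", 2),
    "value": lambda u: top_n(equities(u), "book_to_price", 2),
    "multi_asset": lambda u: [a for a in u if a["asset_class"] == "etf"],
}


def view_for(strategy, universe=UNIVERSE):
    """Return a deep copy so edits made by one strategy never reach another."""
    return copy.deepcopy(VIEWS[strategy](universe))


momentum_view = view_for("momentum")
value_view = view_for("value")

# A strategy annotating its own view leaves the shared universe untouched.
momentum_view[0]["signal"] = 1.0
```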

Issue 6: Incremental Updates & Factor Computation

  • Support appendable time-series updates.
  • Precompute and cache factor scores, correlations, and risk metrics monthly/quarterly.
  • Optionally maintain an index that maps symbols to file locations for fast access.
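
The caching and index ideas might look like this in outline. PeriodCache, SYMBOL_INDEX, the file names, and the placeholder factor function are all hypothetical; the sketch caches per calendar month, and a quarterly key would work the same way.

```python
from datetime import date


class PeriodCache:
    """Caches one value per (name, year, month) and computes it at most once."""

    def __init__(self, compute):
        self._compute = compute
        self._store = {}
        self.computations = 0  # counts cache misses, for inspection

    def get(self, name, as_of):
        key = (name, as_of.year, as_of.month)
        if key not in self._store:
            self._store[key] = self._compute(name, as_of)
            self.computations += 1
        return self._store[key]


# Hypothetical index: symbol -> (file path, dataset path inside the file).
SYMBOL_INDEX = {
    "AAA": ("equities.h5", "/bars/AAA"),
    "CCC": ("bonds.h5", "/bars/CCC"),
}


def locate(symbol):
    """Look up where a symbol's time series is stored without scanning files."""
    return SYMBOL_INDEX[symbol]


def placeholder_factor(name, as_of):
    # Stand-in for an expensive factor computation over the time series.
    return float(as_of.month)


cache = PeriodCache(placeholder_factor)
jan_a = cache.get("momentum", date(2024, 1, 5))
jan_b = cache.get("momentum", date(2024, 1, 20))  # same month: served from cache
feb = cache.get("momentum", date(2024, 2, 1))     # new month: recomputed
```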

Issue 7: Integrate Queryable & Searchable Storage

  • Support efficient filtering, sorting, and selection on structured datasets.
  • Candidate stacks: HDF5 + PyTables or Parquet + Polars.
  • Include examples of common queries (filter by symbol/date, sort by factor).

Issue 8: Testing & Migration Plan

  • Plan migration from current JSON-based storage to new system.
  • Implement tests to ensure append, query, and filter operations return correct results.
  • Benchmark read/write speeds, especially for thousands of assets and years of daily data.
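
A small sketch of a migration check. The issue does not describe the current JSON layout, so the legacy format below (one JSON blob per symbol, keyed by ISO date) is an assumption, as are the function names migrate and check_migration.

```python
import json

# Assumed legacy layout: one JSON string per symbol key, mapping date -> bar.
LEGACY_STORE = {
    "AAA": json.dumps({
        "2024-01-02": {"open": 10.0, "close": 10.5, "volume": 1000},
        "2024-01-03": {"open": 10.5, "close": 10.8, "volume": 1200},
    }),
    "BBB": json.dumps({
        "2024-01-02": {"open": 20.0, "close": 19.5, "volume": 500},
    }),
}


def migrate(legacy_store):
    """Flatten JSON-per-key blobs into (date, symbol, open, close, volume) rows."""
    rows = []
    for symbol, blob in legacy_store.items():
        for day, bar in json.loads(blob).items():
            rows.append((day, symbol, bar["open"], bar["close"], bar["volume"]))
    rows.sort()  # order by date, then symbol
    return rows


def check_migration(legacy_store, rows):
    """Raise AssertionError if any legacy bar is missing or altered."""
    expected = sum(len(json.loads(blob)) for blob in legacy_store.values())
    assert len(rows) == expected, "row count mismatch"
    for day, symbol, open_, close, volume in rows:
        bar = json.loads(legacy_store[symbol])[day]
        assert (open_, close, volume) == (bar["open"], bar["close"], bar["volume"])
    return True


rows = migrate(LEGACY_STORE)
migration_ok = check_migration(LEGACY_STORE, rows)
```

The same row tuples could then be written to whichever backend is chosen, with check_migration rerun against what is read back.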
