Add OPRA options support to DataBentoProvider (chain discovery + OHLCV + consolidated quotes)

## Summary

I'd like to gauge interest in adding **OPRA options** support to the existing `DataBentoProvider`. Today the Databento integration covers equities and CME **futures** (`GLBX.MDP3`) only. There is no path to options data, even though Databento exposes the full OPRA feed (`OPRA.PILLAR`) through the same `Historical` client the library already depends on.

Before writing any code or tests, I wanted to confirm this is in scope for the library (the `futures/` module looks deliberately CME-focused, so options may be an intentional boundary rather than a gap).

## Motivation

Options are the one major asset class in the README's Databento coverage line ("CME, CBOE, ICE futures/options") that isn't actually reachable. `DataBentoProvider` already normalizes Databento responses into the standard Polars schema and plugs into `DataManager`; options OHLCV could ride the same path with a dataset/symbology switch.

## What I'm proposing

Extend `DataBentoProvider` (not a new top-level module), since the work splits cleanly along the existing provider contract:

| Capability | Fits `BaseProvider`? | Proposed surface |
|---|---|---|
| Single-contract OHLCV (OSI symbol) | ✅ it *is* OHLCV | `fetch_ohlcv(...)` with `dataset="OPRA.PILLAR"`, `stype_in="raw_symbol"` |
| Chain discovery (definition + filter) | ❌ extra method | `fetch_option_chain(underlying, date, *, expiry=None, spot=None, moneyness=None, right="both")` |
| Consolidated bid/ask | ❌ non-OHLCV columns | `fetch_option_quotes(contract, schema="cbbo-1m", ...)` |

This mirrors how `fetch_continuous_futures()` already sits alongside `fetch_ohlcv()` for the futures case.
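
For the single-contract row, the symbol passed with `stype_in="raw_symbol"` would be an OSI string. Here is a minimal sketch of building one, assuming OPRA raw symbols follow the standard 21-character OSI layout (root left-justified and space-padded to 6 characters, `YYMMDD` expiry, `C`/`P`, strike in thousandths of a dollar zero-padded to 8 digits). `osi_symbol` is a hypothetical helper name, not an existing library function.

```python
from __future__ import annotations

from datetime import date
from decimal import Decimal


def osi_symbol(root: str, expiry: date, right: str, strike: Decimal | str) -> str:
    """Build a 21-character OSI option symbol (hypothetical helper)."""
    right = right.upper()
    if right not in ("C", "P"):
        raise ValueError("right must be 'C' or 'P'")
    if not 1 <= len(root) <= 6:
        raise ValueError("root must be 1-6 characters")

    # OSI stores the strike in thousandths of a dollar, zero-padded to 8 digits.
    thousandths = Decimal(str(strike)) * 1000
    if thousandths < 0 or thousandths != thousandths.to_integral_value():
        raise ValueError("strike must be a non-negative multiple of 0.001")
    if thousandths >= 10**8:
        raise ValueError("strike does not fit in 8 digits")

    # Root is left-justified in a 6-character field, then YYMMDD, right, strike.
    return f"{root.upper():<6}{expiry:%y%m%d}{right}{int(thousandths):08d}"


if __name__ == "__main__":
    print(repr(osi_symbol("SPY", date(2024, 1, 19), "C", "450")))
```

Taking the strike as a `Decimal` or string avoids float rounding on fractional strikes such as `5612.5`.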

## Findings from a working prototype

I built a standalone script against `databento.Historical` to validate behavior end-to-end (happy to share it or attach it as a gist). A few things any implementation needs to handle, and which I think are worth encoding in the API so users don't trip over them:

1. **Request size scales with schema, not asset class.** For a full chain (parent symbology), `get_billable_size` grows by **orders of magnitude** moving from `ohlcv-1d` (small) to `trades` (very large); single-contract pulls are negligible by comparison. *Actual* dollar cost depends on the user's plan and streaming-vs-batch mode, so the API should (a) make chain-wide *price/quote* pulls deliberate rather than accidental, and (b) expose `get_billable_size` / `get_cost` so users see size and their own quote before committing.

2. **`estimate_cost()` caveat (existing futures code).** The current `ContinuousDownloader.estimate_cost()` is a flat `years × constant` heuristic that **ignores schema** — it returns the same estimate for `ohlcv-1d` and `ohlcv-1m`, which can be very wrong for finer schemas. For options I'd lean on `metadata.get_billable_size()`/`get_cost()` directly. (Possibly worth a separate `fix:` for futures, but flagging here.)

3. **OPRA OHLCV is multi-publisher.** Each contract gets one bar **per reporting venue** (~17), so raw row counts are `dates × venues` and volume is split. Needs a documented consolidation step. The **consolidated quote** schemas (`cbbo`/`tcbbo`/`cmbp-1`) are single-publisher (id 30) and need no dedup.

4. **Quote availability is per-schema.** `get_dataset_range` shows `cbbo-1m` back to 2013, `cmbp-1`/`tcbbo` to 2023, but `cbbo-1s` only from 2025-02-20. A runtime check beats any hardcoded date.

5. **Index roots split.** SPX has `SPX` (AM-settled monthlies) and `SPXW` (PM-settled weeklies and end-of-month expiries) as separate parent chains, so full coverage means querying both.
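
Findings 1, 2, 4 and 5 point toward a preflight step that runs before any chain-wide pull. The sketch below is one way it could look, not a proposed final API. It assumes the `metadata.get_dataset_range`, `metadata.get_billable_size` and `metadata.get_cost` calls on the `Historical` client, assumes the range response nests per-schema ranges under a `"schema"` key, and assumes parent symbols take the `ROOT.OPT` form. `parent_symbols`, `preflight`, `Preflight` and `EXTRA_ROOTS` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any

DATASET = "OPRA.PILLAR"

# Assumption: underlyings whose chains are split across several OPRA roots.
EXTRA_ROOTS = {"SPX": ["SPX", "SPXW"]}


def parent_symbols(underlying: str) -> list[str]:
    """Return one parent symbol (ROOT.OPT) per root of the underlying."""
    key = underlying.upper()
    return [f"{root}.OPT" for root in EXTRA_ROOTS.get(key, [key])]


@dataclass
class Preflight:
    schema_start: str | None  # None if the schema is not listed for the dataset
    billable_bytes: int
    cost_usd: float


def preflight(client: Any, underlying: str, schema: str, start: str, end: str) -> Preflight:
    """Look up schema availability, billable size and cost without downloading data."""
    # Assumption: per-schema ranges live under the "schema" key of the response.
    ranges = client.metadata.get_dataset_range(dataset=DATASET)
    schema_range = ranges.get("schema", {}).get(schema)

    query = dict(
        dataset=DATASET,
        symbols=parent_symbols(underlying),
        schema=schema,
        stype_in="parent",
        start=start,
        end=end,
    )
    return Preflight(
        schema_start=schema_range["start"] if schema_range else None,
        billable_bytes=client.metadata.get_billable_size(**query),
        cost_usd=client.metadata.get_cost(**query),
    )
```

Because `client` is duck-typed, the same function works with a real `databento.Historical` instance or with a mock in unit tests. A caller could refuse to proceed when `schema_start` is `None` or later than the requested `start`, or when `billable_bytes` exceeds a user-set ceiling.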
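
The consolidation step from finding 3 could be sketched as follows. This is a dependency-free illustration over plain dicts; the library itself would more likely express it as a Polars group-by. It assumes each row carries `symbol`, `ts_event`, `open`, `high`, `low`, `close` and `volume`. High, low and volume combine exactly across venues, but the true consolidated open and close cannot be recovered from per-venue bars alone, so the sketch takes them from the highest-volume venue as a labeled heuristic. `consolidate_venue_bars` is a hypothetical name.

```python
from __future__ import annotations

from collections import defaultdict


def consolidate_venue_bars(rows: list[dict]) -> list[dict]:
    """Collapse per-venue OHLCV rows into one bar per (symbol, ts_event)."""
    groups: dict[tuple, list[dict]] = defaultdict(list)
    for row in rows:
        groups[(row["symbol"], row["ts_event"])].append(row)

    bars = []
    for (symbol, ts_event), venue_rows in sorted(groups.items()):
        # Heuristic (assumption): open/close come from the highest-volume venue,
        # since per-venue bars do not say which venue traded first or last.
        lead = max(venue_rows, key=lambda r: r["volume"])
        bars.append(
            {
                "symbol": symbol,
                "ts_event": ts_event,
                "open": lead["open"],
                "high": max(r["high"] for r in venue_rows),  # exact
                "low": min(r["low"] for r in venue_rows),  # exact
                "close": lead["close"],
                "volume": sum(r["volume"] for r in venue_rows),  # exact
                "venues": len(venue_rows),
            }
        )
    return bars
```

Keeping a `venues` count on the output makes it easy to spot bars built from a single venue, where the heuristic open and close are simply that venue's values.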

## Open questions

- **In scope?** Is OPRA options something the library wants, or is Databento intentionally futures-only here?
- **Provider vs. module:** extend `DataBentoProvider`, or a parallel `options/` module like `futures/`?
- **Quotes scope:** include bid/ask (`cbbo`/`tcbbo`) in a first cut, or OHLCV + chain only to start?
- **Greeks/IV:** out of scope (Databento doesn't provide them; they'd be a downstream computation)?

## Scope / non-goals (if welcomed)

- First PR would target **OHLCV + chain discovery** with mocked-client unit tests (core lane) and `@pytest.mark.integration`/`paid_tier` for live calls; quotes could be a follow-up.
- No greeks/IV, no live/streaming.

