Modern data lakehouse platform. Plugin-driven. Storage-agnostic.
- Template-first projects: `phlo init` creates focused starters for CSV, REST APIs, dbt medallion projects, Sling replication, and observability demos
- Decorator-driven ingestion: `phlo-dlt` and `phlo-sling` register assets without hand-written Dagster boilerplate
- Type-safe quality contracts: `phlo-pandera` schemas validate data before it lands in managed tables
- Capability plugins: packages contribute services, CLI commands, assets, resources, catalogs, hooks, and Observatory surfaces through Python entry points
- Storage-agnostic data plane: Iceberg, Delta, ClickHouse, Trino, MinIO, RustFS, Nessie, and PostgreSQL can be composed as needed
- Operator surfaces: `phlo-api`, Observatory, MCP, PostgREST, Hasura, Superset, pgweb, and observability packages expose runtime state and actions
- Local-first operations: `phlo services` generates and runs the project stack through Docker or Podman
```python
from pathlib import Path

import pandas as pd
import pandera.pandas as pa
from pandera.typing import Series

import phlo


class EventsSchema(pa.DataFrameModel):
    id: Series[int] = pa.Field(ge=1)
    name: Series[str]
    value: Series[int] = pa.Field(ge=0)

    class Config:
        strict = True
        coerce = True


@phlo.ingest.dlt(
    table_name="events",
    unique_key="id",
    validation_schema=EventsSchema,
    group="demo",
    freshness_hours=(1, 24),
)
def events(partition_date: str):
    return pd.read_csv(Path("data/events.csv"))
```

`phlo.ingest` is the provider-neutral ingestion namespace. Use `phlo.ingest.dlt(...)` for Python, REST, and DataFrame sources; `phlo.ingest.sling(...)` for replication; or `phlo.ingest.provider("name")` for third-party ingestion providers. Existing `@phlo.ingestion(...)` workflows remain supported as a DLT compatibility alias.
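
To illustrate the general idea behind decorator-driven registration, here is a small pure-Python sketch of a decorator that records a function in a registry under a provider name. It is not Phlo's implementation: `AssetSpec`, `ASSETS`, and this `ingest` function are hypothetical names, and the `<provider>_<table_name>` naming rule is an assumption inferred only from the `dlt_events` asset name used elsewhere in this README.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Rows = List[dict]


@dataclass(frozen=True)
class AssetSpec:
    """Metadata captured at decoration time (hypothetical, not a Phlo type)."""

    name: str
    provider: str
    table_name: str
    unique_key: str
    group: str
    fn: Callable[[str], Rows]


# Hypothetical in-process registry that an orchestrator could read later.
ASSETS: Dict[str, AssetSpec] = {}


def ingest(provider: str, *, table_name: str, unique_key: str, group: str = "default"):
    """Return a decorator that registers the wrapped function as an asset."""

    def decorator(fn: Callable[[str], Rows]) -> Callable[[str], Rows]:
        # Assumed naming rule: "<provider>_<table_name>", e.g. "dlt_events".
        name = f"{provider}_{table_name}"
        if name in ASSETS:
            raise ValueError(f"asset {name!r} is already registered")
        ASSETS[name] = AssetSpec(name, provider, table_name, unique_key, group, fn)
        return fn  # the function itself is returned unchanged

    return decorator


@ingest("dlt", table_name="events", unique_key="id", group="demo")
def events(partition_date: str) -> Rows:
    # Stand-in rows; a real source would read a file or call an API.
    return [{"id": 1, "name": "signup", "value": 3}]


spec = ASSETS["dlt_events"]
print(spec.name, spec.group, spec.fn("2026-05-04"))
```

The point of the pattern is that the author writes only the data-producing function and its metadata; anything that needs the full list of assets can read the registry instead of requiring per-asset wiring code.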

```bash
# Install with default plugins (quoted so shells such as zsh do not expand the brackets)
uv pip install "phlo[defaults]"

# Initialize a runnable local starter
phlo init my-project --template csv-batch
cd my-project

# Generate service configuration, start services, and materialize
phlo services init
phlo services start
phlo materialize dlt_events --partition 2026-05-04
```

The full documentation source starts at `docs/index.md`. The published site is generated with pymdx, matching the GitHub Pages workflow:

```bash
pymdx generate src/phlo --docs docs --output docs-site
pymdx build docs-site
```

Primary entry points:
- Installation Guide
- Quickstart Guide
- Core Concepts
- Developer Guide
- Plugin Development
- Workflow Development
- CLI Reference
- Configuration Reference
- Operations Guide

```bash
uv pip install -e .      # Install Phlo in dev mode
make check               # Lint, format, typecheck, and test (parallel)

# Services
phlo services start      # Start infrastructure
phlo services stop       # Stop services
phlo services logs -f    # View logs

# Individual gates
uv run ruff check .      # Lint
uv run ruff format .     # Format
uv run ty check          # Typecheck
uv run pytest            # Test
```

Phlo is a monorepo of composable packages; install only what you need:
| Layer | Packages |
|---|---|
| Orchestration | phlo-dagster |
| Ingestion | phlo-dlt, phlo-sling |
| Quality | phlo-pandera |
| Transforms | phlo-dbt |
| Table formats | phlo-iceberg, phlo-delta, phlo-clickhouse |
| Infrastructure | phlo-traefik, phlo-postgres, phlo-oauth2-proxy |
| Storage | phlo-minio, phlo-rustfs |
| Catalog | phlo-nessie, phlo-openmetadata |
| Query | phlo-trino |
| Observability | phlo-otel, phlo-clickstack, phlo-grafana, phlo-prometheus, phlo-loki, phlo-alloy, phlo-alerting |
| UI | phlo-observatory, phlo-pgweb, phlo-superset |
| API | phlo-api, phlo-mcp, phlo-hasura, phlo-postgrest |
| Dev/Test | phlo-testing |
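
Because packages are installed selectively, it can be useful to check which of them are present in the current environment. The sketch below uses only the standard library and assumes that the distribution names begin with `phlo-`, as the package names in the table suggest; `installed_phlo_packages` is a hypothetical helper, not part of Phlo.

```python
from importlib.metadata import distributions


def installed_phlo_packages(prefix: str = "phlo-") -> dict[str, str]:
    """Return {distribution name: version} for installed names matching `prefix`."""
    found: dict[str, str] = {}
    for dist in distributions():
        name = dist.metadata.get("Name") or ""
        # Some build backends record underscores, so compare on a
        # hyphen-normalized, lowercased form of the name.
        if name.lower().replace("_", "-").startswith(prefix):
            found[name] = dist.version
    return dict(sorted(found.items()))


if __name__ == "__main__":
    for name, version in installed_phlo_packages().items():
        print(f"{name}=={version}")
```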
