This repository contains the public, privacy-safe framework for the Personal Data Engine: schema-driven ingestion, normalization, canonical entity modeling, and export lanes across heterogeneous personal datasets.
It intentionally excludes runtime databases, exported archives, logs, and private account artifacts.
- Includes contracts, schemas, and ingestion/normalization scripts.
- Includes empty scaffold folders for local runtime.
- Excludes personal data and generated outputs via .gitignore.
- Data Workflow: what the platform builds, how data is collected/processed, and how personal data is kept out of this public template.
- Case Study: what was learned building the ingestion and processing system.
This repository uses meta/ as the current workspace root.
- meta/raw/: immutable source artifacts and indexes
- meta/working/: extracted archives and temporary staging
- meta/normalized/: source-shaped normalized records
- meta/canonical/: cross-source entities, events, relationships, artifacts
- meta/views/: app- and SDK-facing derived datasets
- meta/manifests/: build manifests, source freshness, source health
- meta/schemas/: canonical, view, and manifest contracts
- meta/scripts/: platform build and compatibility scripts
- meta/lanes/: lane-specific outputs for connector, local, and full-export
- runtime/: ephemeral indexes, SQLite databases, state, and logs
- personal-server/: adjacent personal server state and exported connector data
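As a rough illustration of the layout above, the following sketch creates the empty scaffold folders under a given repository root. The helper names (scaffold_paths, create_scaffold) are hypothetical and not part of the repository. It also assumes that runtime/ and personal-server/ sit next to meta/ at the repository root, which the listing suggests but does not state outright.

```python
from pathlib import Path

# Directory names come from the workspace layout; everything else is illustrative.
META_DIRS = [
    "raw", "working", "normalized", "canonical", "views",
    "manifests", "schemas", "scripts", "lanes",
]
# Assumption: these two live beside meta/ rather than inside it.
SIBLING_DIRS = ["runtime", "personal-server"]


def scaffold_paths(repo_root):
    """Return every scaffold directory path under repo_root, without touching disk."""
    root = Path(repo_root)
    paths = [root / "meta" / name for name in META_DIRS]
    paths += [root / name for name in SIBLING_DIRS]
    return paths


def create_scaffold(repo_root):
    """Create any missing scaffold directories and return the full list of paths."""
    created = []
    for path in scaffold_paths(repo_root):
        # exist_ok=True makes the call safe to repeat on an existing workspace.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling create_scaffold(".") from the repository root would leave existing folders untouched and only add the missing ones.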
Scripts accept CLI args and environment overrides. In docs, prefer repository-relative paths.
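One way a script might combine a CLI argument with an environment override is sketched below. The flag name --workspace-root, the variable PDE_WORKSPACE_ROOT, and the precedence order (flag, then environment, then the meta default) are all assumptions made for illustration; the actual scripts may use different names and rules.

```python
import argparse
import os
from pathlib import Path

# Hypothetical names: neither the flag nor the variable is taken from the repository.
ENV_VAR = "PDE_WORKSPACE_ROOT"
DEFAULT_ROOT = "meta"  # repository-relative, matching the documented workspace root


def resolve_workspace_root(argv=None, env=None):
    """Pick the workspace root: CLI flag first, then environment, then default."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Illustrative script entry point")
    parser.add_argument(
        "--workspace-root",
        default=None,
        help=f"workspace root (falls back to ${ENV_VAR}, then '{DEFAULT_ROOT}')",
    )
    args = parser.parse_args(argv)
    chosen = args.workspace_root or env.get(ENV_VAR) or DEFAULT_ROOT
    return Path(chosen)


if __name__ == "__main__":
    print(resolve_workspace_root())
```

Keeping the default repository-relative means the same command works from any checkout, which fits the preference for repository-relative paths in docs.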