- Project Name: Substrate
- Version: 0.1.0
- Author: Aham Labs
- Date: July 13, 2025
The purpose of this document is to define the functional, non-functional, and system requirements for Substrate — a manifest-driven, memory-aware Python environment manager. It aims to provide a reproducible, optimized, conflict-resolving environment experience for developers, researchers, and teams.
Substrate is a command-line tool written in Go that enables users to:
- Compose layered, declarative Python environments
- Analyze dependency usage and memory footprint
- Resolve version conflicts via microenv isolation
- Support preload/lazy-load strategies for performance tuning
The tool is designed to work with both new and existing Python projects.
| Term | Meaning |
|---|---|
| Manifest | A substrate.yaml file defining the structure of an environment |
| Microenv | An isolated virtual environment for conflicting packages |
| Access Graph | A dependency-use graph built from static AST analysis |
| Digest | The process of parsing an existing requirements.txt or environment into a structured manifest |
Substrate reimagines traditional Python env tools (like venv, pipenv, conda, poetry) by introducing structure, analysis, and system-awareness. It is standalone, written in Go, and offers cross-platform binaries.
- Initialize a manifest-based Python environment
- Convert a traditional `requirements.txt` into a manifest
- Analyze imports and memory use
- Suggest optimization strategies
- Compose layered environments with reusable base layers
- Run Python scripts within optimized environments
| Role | Description |
|---|---|
| Developer | Uses CLI to create, analyze, and run envs |
| Researcher | Optimizes memory-heavy ML code |
| DevOps | Builds reproducible pipelines |
| OSS Maintainer | Publishes manifests instead of requirements.txt |
- Parse `substrate.yaml` and any extended `base.yaml`
- Validate schema (env name, python version, memory budget, packages)
- Compose an internal environment graph from layers
- Resolve packages and versions
- Detect transitive dependency conflicts
- Use `isolate: true` to split microenvs
- Generate `substrate.lock` for reproducibility
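The layering and isolation steps above can be sketched as follows. This is a minimal illustration in Python, not Substrate's actual Go implementation: the merge rule (overlay values win, packages merged by name) and the `main` / `microenv-<name>` labels are assumptions made for the example.

```python
from __future__ import annotations

from copy import deepcopy


def compose_layers(base: dict, overlay: dict) -> dict:
    """Merge an overlay manifest onto its base; overlay values win (assumed rule)."""
    merged = deepcopy(base)
    for key, value in overlay.items():
        if key in ("packages", "extends"):
            continue  # packages are merged below; extends is consumed here
        merged[key] = value
    # Merge packages by name so an overlay can re-pin a package from the base.
    by_name = {p["name"]: dict(p) for p in base.get("packages", [])}
    for pkg in overlay.get("packages", []):
        by_name[pkg["name"]] = {**by_name.get(pkg["name"], {}), **pkg}
    merged["packages"] = list(by_name.values())
    return merged


def split_microenvs(packages: list[dict]) -> dict[str, list[str]]:
    """Give every package marked isolate: true its own microenv."""
    envs: dict[str, list[str]] = {"main": []}  # hypothetical env labels
    for pkg in packages:
        if pkg.get("isolate"):
            envs[f"microenv-{pkg['name']}"] = [pkg["name"]]
        else:
            envs["main"].append(pkg["name"])
    return envs


base = {"python_version": "3.11", "packages": [{"name": "numpy", "version": "1.26.*"}]}
overlay = {
    "env_name": "nlp-env",
    "extends": "base.yaml",
    "packages": [
        {"name": "torch", "version": "2.1.*", "isolate": True},
        {"name": "transformers", "version": "4.28"},
    ],
}
env = compose_layers(base, overlay)
print(split_microenvs(env["packages"]))
```

Merging packages by name keeps the composed graph free of duplicate entries, which is what makes a later conflict check per package straightforward.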
- Parse AST using `tree-sitter` or `go-python-ast`
- Extract import usage
- Build access graph (file → module → package)
- Identify unused dependencies
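As a rough illustration of the access graph, the sketch below uses Python's standard `ast` module in place of `tree-sitter` or `go-python-ast`. The `module_to_package` mapping is a hypothetical input: import names do not always match distribution names (for example `yaml` and `PyYAML`), so some such lookup is assumed to exist.

```python
from __future__ import annotations

import ast


def imported_modules(source: str) -> set[str]:
    """Return top-level module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level > 0 means a relative import, which never names a dependency
            modules.add(node.module.split(".")[0])
    return modules


def build_access_graph(files: dict[str, str], module_to_package: dict[str, str]) -> dict:
    """Map file -> module -> package; None marks stdlib or unknown modules."""
    return {
        path: {m: module_to_package.get(m) for m in sorted(imported_modules(src))}
        for path, src in files.items()
    }


def unused_packages(graph: dict, declared: list[str]) -> list[str]:
    """Declared packages that no analysed file imports."""
    used = {pkg for mods in graph.values() for pkg in mods.values() if pkg}
    return sorted(set(declared) - used)


files = {
    "train.py": "import torch\nfrom transformers import pipeline\nimport os\n",
    "utils.py": "from . import helpers\nimport yaml\n",
}
module_to_package = {"torch": "torch", "transformers": "transformers", "yaml": "PyYAML"}
graph = build_access_graph(files, module_to_package)
print(unused_packages(graph, ["torch", "transformers", "PyYAML", "requests"]))
```

Because the source is only parsed and never run, this kind of analysis does not execute user code. Dynamic imports (for example via `importlib.import_module`) are invisible to it.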
- Light mode: `gopsutil` for RSS, peaks
- Deep mode: call `memray`, `tracemalloc`, or custom profiler
- Map memory cost to packages/modules
- Suggest `lazy` loading for heavy but rarely used packages
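One way the deep-mode measurement and the lazy-loading suggestion could fit together is sketched below with `tracemalloc`. The thresholds (`heavy_bytes`, `rare_ratio`) and the "fraction of files that import the package" heuristic are assumptions for illustration; the document does not define what counts as heavy or rarely used.

```python
import importlib
import tracemalloc


def import_cost_bytes(module_name: str) -> int:
    """Peak Python-level allocation observed while importing a module.

    tracemalloc only tracks memory allocated through Python's allocators,
    so native buffers are missed, and a module that is already in
    sys.modules will report a cost close to zero.
    """
    tracemalloc.start()
    try:
        importlib.import_module(module_name)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def should_suggest_lazy(
    cost_bytes: int,
    importing_files: int,
    total_files: int,
    heavy_bytes: int = 50 * 1024 * 1024,  # assumed threshold: 50 MiB
    rare_ratio: float = 0.2,              # assumed threshold: used by <= 20% of files
) -> bool:
    """True when a package is both heavy and rarely imported."""
    heavy = cost_bytes >= heavy_bytes
    rare = total_files > 0 and importing_files / total_files <= rare_ratio
    return heavy and rare


print(import_cost_bytes("json"))
print(should_suggest_lazy(200 * 1024 * 1024, importing_files=1, total_files=20))
```

In practice the measurement would run in a fresh interpreter per package so that earlier imports do not hide the cost.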
- Create `.substrate/` cache with:
  - base layer
  - overlays
  - microenvs
- Inject runtime shims (`sitecustomize.py`)
- Handle `PYTHONPATH`, virtualenv, and symlinks
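A runtime shim for lazily loaded packages might look like the sketch below, which follows the `importlib.util.LazyLoader` recipe from the Python standard library documentation. Whether Substrate's `sitecustomize.py` works this way is not stated in this document; `LAZY_PACKAGES` is a hypothetical list, and `colorsys` stands in for a package marked with `load_strategy: lazy`.

```python
import importlib.util
import sys


def lazy_import(name: str):
    """Register a module in sys.modules but defer executing it until first use."""
    if name in sys.modules:
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ModuleNotFoundError(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # with LazyLoader this only arms the deferred load
    return module


# Hypothetical: names the composer would write into the generated shim.
LAZY_PACKAGES = ["colorsys"]
for _name in LAZY_PACKAGES:
    lazy_import(_name)

# A later `import colorsys` returns the placeholder; the module body runs
# on the first attribute access, such as the call below.
import colorsys
print(colorsys.rgb_to_hsv(1, 0, 0))
```

The trade-off of this approach is that import errors surface at first use rather than at startup.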
| Command | Purpose |
|---|---|
| `substrate init` | Create a starter manifest |
| `substrate digest` | Convert requirements.txt |
| `substrate analyze` | Scan imports + memory use |
| `substrate compose` | Build environment |
| `substrate run script.py` | Run with composed env |
| `substrate freeze` | Lock current config |
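To make the `substrate digest` step concrete, here is a minimal sketch of turning requirements.txt content into a manifest-shaped dictionary. It is deliberately narrow and rests on assumptions: only exact `==` pins are carried into `version`, other specifiers, extras, and environment markers are dropped, and the default `env_name` is hypothetical.

```python
import re

# Package name followed by an optional exact pin, e.g. "torch==2.1.0".
_REQ = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:==\s*([^\s;#]+))?")


def digest_requirements(text: str, env_name: str = "digested-env") -> dict:
    """Convert requirements.txt content into a manifest-like dict."""
    packages = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options such as "-r other.txt"
        match = _REQ.match(line)
        if not match:
            continue
        name, version = match.groups()
        entry = {"name": name}
        if version:
            entry["version"] = version  # only exact pins survive in this sketch
        packages.append(entry)
    return {"env_name": env_name, "packages": packages}


sample = "# deps\ntorch==2.1.0\nrequests>=2.0  # http\n-r other.txt\n\nnumpy\n"
print(digest_requirements(sample))
```

A production digest would more likely rely on an established requirements parser than on a regular expression, since the full requirement grammar includes URLs, extras, and markers.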
- CLI startup time: < 50ms
- Compose time: < 2s for small envs, < 10s for large ones
- OS Support: macOS, Linux, Windows (via Go binary)
- Python versions: 3.7 – 3.12
- One-line install via `curl` or Go
- Minimal config required to get started
- Compatible with existing Python tooling
- Plugin-ready architecture for:
- Import checkers
- Memory profilers
- Lockfile generators
- Does not execute user code during static analysis
- Uses isolated subprocess for profiling
- Option to sandbox envs using Docker (future)
CLI --> Manifest --> Resolver --> Analyzer --> Profiler --> Composer --> Runner
- Written in Go, modular packages under `internal/`
- Uses Cobra for CLI
- Python AST parsing from Go (`tree-sitter` or `go-python-ast`) + gopsutil for analysis
- Python envs handled via subprocesses + virtualenv + venv
- Reads from: `substrate.yaml`, `requirements.txt`
- Writes to: `.substrate/`, `substrate.lock`, `access_graph.json`
- Calls `python` for version check, module import, memray
- Supports injection via `PYTHONPATH`, `sitecustomize.py`
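The injection mechanism can be illustrated with a small sketch: a directory containing `sitecustomize.py` is prepended to `PYTHONPATH`, and the child interpreter imports the shim automatically at startup. The `SUBSTRATE_SHIM` variable and the temporary directory are hypothetical stand-ins. The real tool is described as doing this from Go, so this Python version only mirrors the idea.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_shim(code: str, shim_source: str) -> str:
    """Run Python code in a subprocess with a sitecustomize.py shim injected."""
    with tempfile.TemporaryDirectory() as shim_dir:
        Path(shim_dir, "sitecustomize.py").write_text(shim_source)
        env = dict(os.environ)
        existing = env.get("PYTHONPATH")
        # Prepend so this shim is found before any other sitecustomize module.
        env["PYTHONPATH"] = shim_dir + (os.pathsep + existing if existing else "")
        result = subprocess.run(
            [sys.executable, "-c", code],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout


# Hypothetical shim body: it only sets a marker so the effect is observable.
shim = "import os\nos.environ['SUBSTRATE_SHIM'] = 'active'\n"
print(run_with_shim("import os; print(os.environ.get('SUBSTRATE_SHIM'))", shim))
```

Note that interpreters started with `-S` or `-I` skip this mechanism, so a runner relying on it has to control the flags it passes.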
- Remote manifest resolver (cloud registry)
- Docker + remote runner
- GUI for dependency & memory graph
- VS Code extension
- Rust plugin mode for performance-critical analysis
env_name: nlp-env
extends: base.yaml
python_version: "3.11"
memory_budget: 512MB
packages:
  - name: torch
    version: "2.1.*"
    load_strategy: preload
    isolate: true
  - name: transformers
    version: "4.28"
    load_strategy: lazy
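Schema validation for a manifest like the one above could be sketched as follows. The manifest is written as a Python dict to keep the example free of third-party YAML libraries. The choice of required fields, the binary interpretation of `MB` (1 MB = 1024 × 1024 bytes), and the set of allowed `load_strategy` values are assumptions inferred from the sample, not a published schema.

```python
import re

UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3}  # assumed binary units


def parse_memory_budget(value: str) -> int:
    """Convert a budget such as '512MB' into bytes."""
    match = re.fullmatch(r"(\d+)\s*(KB|MB|GB)", value.strip(), re.IGNORECASE)
    if not match:
        raise ValueError(f"invalid memory_budget: {value!r}")
    return int(match.group(1)) * UNITS[match.group(2).upper()]


def validate_manifest(manifest: dict) -> list:
    """Return a list of human-readable schema errors (empty when valid)."""
    errors = []
    for field in ("env_name", "python_version", "packages"):  # assumed required
        if field not in manifest:
            errors.append(f"missing field: {field}")
    if "memory_budget" in manifest:
        try:
            parse_memory_budget(manifest["memory_budget"])
        except ValueError as exc:
            errors.append(str(exc))
    for index, pkg in enumerate(manifest.get("packages", [])):
        if "name" not in pkg:
            errors.append(f"packages[{index}]: missing name")
        strategy = pkg.get("load_strategy")
        if strategy is not None and strategy not in ("preload", "lazy"):
            errors.append(f"packages[{index}]: unknown load_strategy {strategy!r}")
    return errors


manifest = {
    "env_name": "nlp-env",
    "extends": "base.yaml",
    "python_version": "3.11",
    "memory_budget": "512MB",
    "packages": [
        {"name": "torch", "version": "2.1.*", "load_strategy": "preload", "isolate": True},
        {"name": "transformers", "version": "4.28", "load_strategy": "lazy"},
    ],
}
print(validate_manifest(manifest), parse_memory_budget(manifest["memory_budget"]))
```

Collecting all errors in a list, rather than raising on the first one, lets the CLI report every problem in a manifest in a single pass.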