Define the search space. The agent finds the path.
Most agent frameworks manage loops. portlang manages environments. You declare what the agent can access, what counts as success, and hard limits on cost and scope. The runtime enforces all of it inside an isolated container and records every step.
What enters the context window determines how reliably your agent works. portlang makes the key levers explicit: re_observation pushes current state before each step rather than letting context rot build up, [[verifier]] gives the agent concrete pass/fail results rather than raw output to parse, and [boundary] caps tokens and steps before the model's instruction budget runs out.
macOS (Homebrew):

```sh
brew install portofcontext/tap/portlang
```

Linux:

```sh
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/portofcontext/portlang/releases/latest/download/portlang-installer.sh | sh
```

```sh
export OPENROUTER_API_KEY=...
# or
export ANTHROPIC_API_KEY=...

portlang init --install --start

# Install the agent skill to get guided help while building fields
npx skills add https://github.com/portofcontext/skills --skill portlang
```

```toml
# fix-bug.field
name = "fix-bug"

[model]
name = "anthropic/claude-sonnet-4.6"
temperature = 0.2

[prompt]
goal = """
There is a bug in src/ causing the test suite to fail. Find it and fix it.
"""
# Fresh test results before every step, not accumulated
re_observation = ["cd /workspace && python -m pytest tests/ -q 2>&1 | tail -10"]

[environment]
root = "./workspace"

[boundary]
allow_write = ["src/*.py"]  # sandbox-enforced: agent cannot touch tests/ to cheat
max_steps = 20
max_cost = "$0.50"

# Success criterion: the test suite must pass
[[verifier]]
name = "tests-pass"
command = "cd /workspace && python -m pytest tests/ -q"
```

```sh
portlang run fix-bug.field
portlang converge fix-bug.field -n 20   # run 20 times, measure convergence before shipping
portlang view trajectory <id>           # replay any run step-by-step
```

Place a parent.field at the root of an eval directory to share model, tools, and boundary config across all child fields:
```
stripe-benchmark/
  parent.field              ← shared model + tools
  01-get-balance/
    get-balance.field       ← model = "inherit", tools = "inherit"
  02-list-customers/
    list-customers.field
```
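The shared file might look like this. This is a sketch, not verbatim syntax: the section names follow the fix-bug example above, and the exact inheritance keys are documented in field.structure:

```toml
# stripe-benchmark/parent.field (hypothetical contents)
[model]
name = "anthropic/claude-sonnet-4.6"
temperature = 0.2

[boundary]
max_steps = 20
max_cost = "$0.50"
```

A child field then opts in per section, as in the tree above:

```toml
# stripe-benchmark/01-get-balance/get-balance.field (hypothetical contents)
name = "get-balance"
model = "inherit"
tools = "inherit"

[prompt]
goal = "Fetch the current account balance and write it to balance.json."
```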
```sh
portlang eval stripe-benchmark/
portlang view eval <eval-id>
```

| Primitive | Purpose |
|---|---|
| Field | Self-contained unit of work: model, tools, goal, constraints, and verifiers in one file. Named after the physics concept, a region of space with properties defined at every point. The agent moves through it and the field determines what's possible. |
| Vars | Template variables declared in [vars], interpolated via {{ name }}, supplied at runtime with --var |
| Boundary | Hard limits enforced by sandbox: write paths, network policy, step/cost/token caps. The token ceiling keeps the instruction budget under control. |
| Verifier | Deterministic pass/fail signals that run on stop or on each tool call. Gives the agent concrete results rather than raw output to parse. |
| Trajectory | Complete event log of every step, tool call, cost, and outcome. Replayable and diffable. |
| Eval | Batch run of multiple fields with a persistent ID, resumable on failure |
| Skills | Prompt packs loaded into the agent's context from local files, GitHub repos, or ClawHub |
| Output | File artifacts declared via collect in [boundary], delivered via --output-dir or embedded in --json stdout |
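As a sketch of the Vars row above: a field declares variables in [vars] and interpolates them with {{ name }}. The field itself is hypothetical; only the [vars] table, the {{ name }} syntax, and the --var flag come from the table, and whether declared values act as overridable defaults is an assumption:

```toml
# triage.field (hypothetical)
[vars]
ticket = "PROJ-123"   # assumed to act as an overridable default

[prompt]
goal = """
Investigate ticket {{ ticket }} and write findings to report.md.
"""
```

```sh
portlang run triage.field --var ticket=PROJ-456
```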
Declare skills in any field. The agent receives a brief metadata summary in its system prompt and reads the full guide on-demand from the workspace filesystem.
```toml
[[skill]]
source = "./skills/my-guide.md"   # local .md file

[[skill]]
source = "./skills/my-skill"      # local directory: SKILL.md + scripts/, references/, assets/ copied to workspace

[[skill]]
source = "owner/repo"             # GitHub (skills.sh-compatible shorthand)

[[skill]]
source = "clawhub:name"          # ClawHub registry
```

You can reference a skill in the goal with $slug; portlang detects invocations and records them in the trajectory. See examples/07-skills/.
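An invocation in the goal might then look like this (hypothetical slug; only the $slug syntax itself is documented above):

```toml
[prompt]
goal = """
Follow $my-guide when structuring the final report.
"""
```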
Declare which files the agent produces with collect in [boundary]. These are delivered to the caller after the run — separate from write permission (allow_write):
```toml
[boundary]
allow_write = ["report.md", "results/*.json", "scratch/*.tmp"]
collect = ["report.md", "results/*.json"]   # omit scratch files
```

If `collect` is not set, all `allow_write` files are collected. Set `collect = []` to collect nothing.
Retrieve outputs after a run:
```sh
portlang run field.field --output-dir ./results/   # copy artifacts to ./results/
portlang run field.field --json                    # machine-readable JSON on stdout
portlang run field.field --json | jq .artifacts    # pipe to jq
```

`--json` embeds both artifacts (file contents, up to 512 KB/file) and structured_output (from output_schema) in one JSON object, the building block for a future HTTP API.
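A minimal sketch of consuming that stream downstream of `--json`. Only the artifacts and structured_output keys are documented above; the payload values here are invented stand-ins:

```python
import json

# Hypothetical payload standing in for `portlang run field.field --json` output;
# only the artifacts/structured_output keys are documented, the values are invented.
raw = '{"artifacts": {"report.md": "# Findings"}, "structured_output": {"ok": true}}'
payload = json.loads(raw)

# Iterate collected artifacts by path
for path, content in payload["artifacts"].items():
    print(f"{path}: {len(content)} chars")
```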
Agent code runs in isolated containers via Apple Container. Network is denied by default. Write access is explicitly granted via glob patterns. Hard ceilings on steps, cost, and context size.
Treat .field files as code. Review tool definitions and boundary policies before running untrusted fields.
portlang can use Claude Code as its agent loop instead of the native loop. This gives the agent the full Claude Code toolset inside portlang.
```sh
portlang run --runner claude-code field.field
```

Auth: run `claude setup-token` to generate a long-lived OAuth token, or set ANTHROPIC_API_KEY to use the API directly.
Limitations vs the native runner: ToolCall verifiers and boundary context tracing are not supported.
portlang reflect analyzes a trajectory and identifies concrete ways to improve your field — fewer steps, lower cost, better reliability. The analysis is grounded in the specific steps your agent took and the outcomes it achieved.
```sh
portlang run field.field --auto-reflect   # reflect automatically after each run
portlang reflect <trajectory-id>          # reflect on a past run
```
`reflect` is itself a portlang field; see reflect.field.
Example: tool naming is part of the environment
The examples/03-custom-python-tool field uses a Python calculator tool with a function named execute. When run with --runner claude-code, Claude Code uses ToolSearch to discover tools lazily — so the agent searches for "calculator":
```
→ ToolSearch {"query": "calculator"}
← ToolSearch No matching deferred tools found
→ ToolSearch {"query": "select:mcp__execute__execute"}
← ToolSearch (empty — tool is loaded, not deferred)
→ mcp__execute__execute {"expression": "144 * 259"}   ← finally works, by guessing
```
Two wasted round-trips because the prompt says "calculator" and the tool is named execute. Reflect surfaces this immediately:
HIGH Add 'calculator', 'math', 'arithmetic' as keywords to the tool description so ToolSearch resolves it on the first try. Steps 1–2 are pure tool-hunting waste.
The fix is one word in the Python file:
```python
# before
def execute(expression: str) -> CalculatorResult:

# after
def calculate(expression: str) -> CalculatorResult:
```

portlang auto-extracts the function name as the tool name, so the rename propagates automatically. The agent now has a tool called calculate, semantically close enough to "calculator" that it can orient immediately without guessing.
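Fleshed out, the renamed tool might look like the sketch below. This is not the actual examples/03-custom-python-tool source: the shape of CalculatorResult and the eval-based body are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class CalculatorResult:
    # Assumed shape; the real type lives in examples/03-custom-python-tool.
    expression: str
    value: float

def calculate(expression: str) -> CalculatorResult:
    """Evaluate an arithmetic expression.

    Mentions calculator/math/arithmetic so ToolSearch can resolve
    the tool on the first query instead of hunting for it.
    """
    # Builtins stripped: good enough for a demo, not a security boundary.
    value = eval(expression, {"__builtins__": {}}, {})
    return CalculatorResult(expression=expression, value=float(value))
```

With this, calculate("144 * 259") returns CalculatorResult(expression="144 * 259", value=37296.0).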
This is what "environment-first" means in practice: the agent's behavior is a function of the environment you define. Reflect shows you which knobs to turn.
| Resource | Description |
|---|---|
| examples/ | Annotated examples covering all features |
| field.structure | Full reference for every .field option |
| CLI.md | All commands and flags |
