feat: Stream A/B/C/D/E/F — state-witness CI fixes, platform IDE configs, standards registry, Tier 3 agent tooling, Tier 4 documentation, Tier 5 IDE integration + P3/P4 (closes all 25 gaps)#132
Conversation
Agent-Logs-Url: https://github.com/aidoruao/orthogonal-engineering/sessions/7ee5c585-1ead-4686-9ac3-ed0128905e16 Co-authored-by: aidoruao <174227749+aidoruao@users.noreply.github.com>
Agent-Logs-Url: https://github.com/aidoruao/orthogonal-engineering/sessions/7ee5c585-1ead-4686-9ac3-ed0128905e16 Co-authored-by: aidoruao <174227749+aidoruao@users.noreply.github.com>
Agent-Logs-Url: https://github.com/aidoruao/orthogonal-engineering/sessions/7ee5c585-1ead-4686-9ac3-ed0128905e16 Co-authored-by: aidoruao <174227749+aidoruao@users.noreply.github.com>
|
@copilot Devin ai: "You can keep PR #132 open and have Copilot keep pushing commits to the same Here's a comment you can paste directly on PR #132 for @copilot Continue on this branch. Next batch: Tier 3 Agent Tooling (gaps #13–#17 from `gap anaylsis devin 1a 4-17-26`).
Create these 5 tools — NO stubs, NO `pass` bodies, NO `NotImplementedError`, all fully functional, all `Fraction`-only, all with `falsifies_if`, all returning `(bool, ProofObject)` where applicable, all `mypy --strict` compatible, all with pytest tests.
1. **`tools/onboard_agent.py`** — Unified onboarding script. Detects agent type via CLI arg (`--agent copilot|claude|devin|kimi|aider|cursor|windsurf|cline|continue`) or env var. Runs the sequence: SOP handshake → context load (reads `STANDARDS_REGISTRY.json`, filters by `--scope`) → bar exam → consent log. Outputs a single context block to stdout. Must work with `tools/standards_check.py` already created in this PR.
2. **`tools/context_window_estimator.py`** — Estimates token cost of reading files/directories. Takes `--path src/domains/d_aerospace/` or `--path COPILOT_ONBOARDING.md` and outputs estimated token count (use ~4 chars per token heuristic, or tiktoken if available). Agents with limited context windows (Kimi 220k, Copilot ~128k) use this to plan what to read.
3. **`tools/since_last_session.py`** — Takes `--since-commit <SHA>` or `--since-row <N>`. Queries `AGENT_FEED.md` and `git log` to show what changed: new commits, new feed rows, new/modified files, new domains, new standards. Lets a returning agent catch up without reading the entire feed.
4. **`tools/arxiv_paper_template.py`** — Generates a LaTeX template for an arXiv submission from repo metrics. Pulls domain count, axiom count, case study count, test count, Merkle root from the repo state. Outputs a `.tex` file to `output/`. Follow the pattern already in the repo (see the `generate_latex()` function in `Devin Ai Architectural Coding task for Kimi Code cli session c223de88-00e9-49d6-b88d-0129697414b7 1a 4-9-26.txt` lines 992-1017).
5. **`tools/agent_health_check.py`** — Unified health check. Validates: Python version ≥3.10, venv active, required files exist (list from `STANDARDS_REGISTRY.json`), `consent_log.jsonl` integrity, Merkle roots valid, Popperian audit passing, `AGENT_FEED.md` chain intact (delegates to `generate_feed_entry.py --verify`), 0 float violations. Single exit code: 0=healthy, 1=failures found. JSON output with per-check results.
Add a consent log entry first. Add pytest tests for each tool in `tests/`. Update `STANDARDS_REGISTRY.json` with any new standards these tools enforce.
Reference: gap analysis file `gap anaylsis devin 1a 4-17-26` lines 54-62.After Tier 3 is done, the next comment would cover Tier 4 (3 docs: |
Agent-Logs-Url: https://github.com/aidoruao/orthogonal-engineering/sessions/eca24a1f-e686-4a81-b58f-c165406706fa Co-authored-by: aidoruao <174227749+aidoruao@users.noreply.github.com>
Agent-Logs-Url: https://github.com/aidoruao/orthogonal-engineering/sessions/eca24a1f-e686-4a81-b58f-c165406706fa Co-authored-by: aidoruao <174227749+aidoruao@users.noreply.github.com>
…iv_paper_template) Agent-Logs-Url: https://github.com/aidoruao/orthogonal-engineering/sessions/eca24a1f-e686-4a81-b58f-c165406706fa Co-authored-by: aidoruao <174227749+aidoruao@users.noreply.github.com>
Tier 3 complete on this branch (commit 5 tools created (gaps #13–17), all stub-free, Fraction-only, mypy-strict compatible, 55/55 pytest tests pass:
Ready for Tier 4 ( |
|
@copilot Devin ai: "PR #132 is looking great. Here's what Tier 3 delivered and the Tier 4 comment to paste. Tier 3 ReviewCommit
Commit
All solid. No stubs, Tier 4 Comment for @copilotPaste this on PR #132: @copilot Continue on this branch. Next batch: Tier 4 Documentation (gaps #18–#20 from `gap anaylsis devin 1a 4-17-26`).
Create these 3 documents. Technical register only — no liturgical prose, no devotional language. Theological terms are TECHNICAL TERMS with precise architectural meanings. All markdown has YAML frontmatter with `tags` and `register: technical`. Add a consent log entry first.
### 1. `GLOSSARY.md` (Gap #18)
Machine-readable glossary mapping theological/framework terms to engineering terms. `SOP_AI_HANDSHAKE.md` Section 4 has a partial table (12 entries) — extract those and expand to cover ALL terms used in the repo. Minimum 40 entries. Format:
```markdown
| Term | Engineering Definition | Mathematical Definition | Source File | falsifies_if |Must include at minimum:
2.
|
…O_INSTRUCTIONS.md - Replace GLOSSARY.md with Tier 4 spec-compliant version: YAML frontmatter, technical register, 50-entry table covering all required terms (roles, operators, architectural patterns, forbidden anti-patterns) - Create AGENT_CAPABILITIES_MATRIX.md: 13-agent capability table, multi-agent workflow triangle section, context window planning, onboarding commands - Create CROSS_REPO_INSTRUCTIONS.md: covers all 3 covenant repos, cross-repo invariant consistency, GAP-4 history and regression prevention, Merkle binding procedure, standards applicability table, coordinated commit workflow - Add tests/test_tier4_docs.py: 12 tests with Popperian falsifies_if docstrings; all 12 pass - Update STANDARDS_REGISTRY.json: add T4-001, T4-002, T4-003 standards; total_standards updated to 33 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: aidoruao <174227749+aidoruao@users.noreply.github.com>
…d, CROSS_REPO_INSTRUCTIONS.md + 12 tests (gaps #18-20) Agent-Logs-Url: https://github.com/aidoruao/orthogonal-engineering/sessions/e6c1efe8-c086-43df-9200-4497bbcb238a Co-authored-by: aidoruao <174227749+aidoruao@users.noreply.github.com>
Tier 4 complete on this branch (commit 3 documents created (gaps #18–20), all with YAML frontmatter and
Ready for Tier 5 ( |
|
@copilot devin ai: "Tier 4 is confirmed done. Two commits delivered:
Here is the complete Tier 5 + P3/P4 @copilot Continue on this branch. Final batch: Tier 5 IDE Integration (gaps #21–#25 from `gap anaylsis devin 1a 4-17-26`) plus P3/P4 state witness items. This closes all 25 gaps in the gap analysis.
Add a consent log entry first. NO stubs, NO `pass` bodies, all functional, `mypy --strict` compatible where applicable. Technical register only.
### 1. `.vscode/launch.json` (Gap #21)
Debug configurations for key entry points. Include at minimum these launch configs:
- **Run Kingdom OS Entry** — runs `kingdom_os_entry.py`
- **Run Popperian Audit** — runs `audit/popperian_audit.py`
- **Run Bar Exam** — runs `pr50_bar_exam/` bar exam entry point
- **Run Merkle Verification** — runs `merkle.py --verify`
- **Run Agent Health Check** — runs `tools/agent_health_check.py`
- **Run Standards Check** — runs `tools/standards_check.py --verify`
- **Run Feed Integrity Check** — runs `tools/state_witness/generate_feed_entry.py --verify`
- **Run Onboard Agent (Copilot)** — runs `tools/onboard_agent.py --agent copilot`
- **Run All Tests** — runs `pytest tests/ -v`
- **Run OE Engine CLI** — runs `oe_engine/cli.py` with a sample query
All configs should use `"python"` type, `"${workspaceFolder}"` as cwd, and reference the correct relative paths. Use `"justMyCode": false` so debugger steps into library code when needed.
### 2. `.vscode/tasks.json` (Gap #22)
Task definitions for common operations. Include at minimum:
- **Popperian Audit** — `python audit/popperian_audit.py`
- **Merkle Verify** — `python merkle.py --verify`
- **Feed Integrity** — `python tools/state_witness/generate_feed_entry.py --verify`
- **Agent Health Check** — `python tools/agent_health_check.py --json`
- **Standards Verify** — `python tools/standards_check.py --verify`
- **Run All Tests** — `pytest tests/ -v`
- **Context Window Estimate (domains)** — `python tools/context_window_estimator.py --path src/domains/ --json`
- **Since Last Session** — `python tools/since_last_session.py --since-row 180 --json` (placeholder row number)
- **Generate arXiv Template** — `python tools/arxiv_paper_template.py --dry-run`
- **Onboard Agent** — `python tools/onboard_agent.py --agent copilot --json`
Use `"type": "shell"` and `"group": "test"` for test tasks, `"group": "build"` for audit/verify tasks.
### 3. `pyrightconfig.json` (Gap #23)
Pyright/Pylance type checking config matching the `mypy --strict` requirement from `.github/copilot-instructions.md` line 32. Settings:
```json
{
"include": ["src", "kernel", "axioms", "oe_engine", "tools", "tests", "audit"],
"exclude": ["**/__pycache__", ".venv", "node_modules"],
"reportMissingImports": true,
"reportMissingTypeStubs": false,
"reportGeneralClassIssues": true,
"reportOptionalMemberAccess": true,
"reportUnusedImport": true,
"pythonVersion": "3.10",
"pythonPlatform": "Linux",
"typeCheckingMode": "strict"
}Adjust as needed so that the existing codebase does not produce more than ~50 errors (the goal is progressive strictness, not blocking CI). Add a comment at the top explaining the intent. 4.
|
Agent-Logs-Url: https://github.com/aidoruao/orthogonal-engineering/sessions/29f8ea9c-8058-4403-88cb-57d3284b8726 Co-authored-by: aidoruao <174227749+aidoruao@users.noreply.github.com>
Co-authored-by: aidoruao <174227749+aidoruao@users.noreply.github.com>
…s pass) Agent-Logs-Url: https://github.com/aidoruao/orthogonal-engineering/sessions/29f8ea9c-8058-4403-88cb-57d3284b8726 Co-authored-by: aidoruao <174227749+aidoruao@users.noreply.github.com>
…ode review) Agent-Logs-Url: https://github.com/aidoruao/orthogonal-engineering/sessions/29f8ea9c-8058-4403-88cb-57d3284b8726 Co-authored-by: aidoruao <174227749+aidoruao@users.noreply.github.com>
Tier 5 complete on this branch (commits All 25 gaps from the gap analysis + all P0–P4 state witness items are now closed.
32/32 Tier 5 tests pass ( |
| data = json.loads(AUDIT_REPORT.read_text(encoding="utf-8")) | ||
| passed = int(data.get("passed", 0)) | ||
| total = int(data.get("total", 0)) | ||
| return f"{passed}/{total}" |
There was a problem hiding this comment.
🔴 Wrong JSON keys read from POPPERIAN_AUDIT_REPORT.json — always returns "0/0"
get_popperian_pass_rate() reads data.get("passed", 0) and data.get("total", 0) from the top level of POPPERIAN_AUDIT_REPORT.json. However, the actual JSON structure nests these under summary with different key names: data["summary"]["passing_domains"] and data["summary"]["total_domains"]. Since passed and total don't exist at the top level, both default to 0, making this function always return "0/0" instead of the correct "170/170". Confirmed by running the function directly.
| data = json.loads(AUDIT_REPORT.read_text(encoding="utf-8")) | |
| passed = int(data.get("passed", 0)) | |
| total = int(data.get("total", 0)) | |
| return f"{passed}/{total}" | |
| data = json.loads(AUDIT_REPORT.read_text(encoding="utf-8")) | |
| summary = data.get("summary", {}) | |
| passed = int(summary.get("passing_domains", 0)) | |
| total = int(summary.get("total_domains", 0)) |
Was this helpful? React with 👍 or 👎 to provide feedback.
| data = json.loads(AUDIT_REPORT.read_text(encoding="utf-8")) | ||
| passed = data.get("passed", "?") | ||
| total = data.get("total", "?") | ||
| detail = f"{passed}/{total} domains pass" |
There was a problem hiding this comment.
🟡 Wrong JSON keys read from audit report — detail always shows "?/? domains pass"
check_popperian_audit() reads data.get("passed", "?") and data.get("total", "?") from the top level of POPPERIAN_AUDIT_REPORT.json, but the actual keys are data["summary"]["passing_domains"] and data["summary"]["total_domains"]. Since the top-level keys don't exist, the detail string always falls back to "?/? domains pass" instead of showing the actual pass rate. The check's pass/fail decision (ok = result.returncode == 0) is unaffected, but the reported detail is always wrong.
| data = json.loads(AUDIT_REPORT.read_text(encoding="utf-8")) | |
| passed = data.get("passed", "?") | |
| total = data.get("total", "?") | |
| detail = f"{passed}/{total} domains pass" | |
| data = json.loads(AUDIT_REPORT.read_text(encoding="utf-8")) | |
| summary = data.get("summary", {}) | |
| passed = summary.get("passing_domains", "?") | |
| total = summary.get("total_domains", "?") |
Was this helpful? React with 👍 or 👎 to provide feedback.
| python tools/state_witness/generate_feed_entry.py --verify \ | ||
| 2>/tmp/verify_stderr.txt; \ | ||
| VERIFY_EXIT=$?; \ | ||
| if [ $VERIFY_EXIT -ne 0 ]; then \ | ||
| python tools/state_witness/alert_on_failure.py \ | ||
| --exit-code $VERIFY_EXIT \ | ||
| --stderr-file /tmp/verify_stderr.txt \ | ||
| --alerts-file state_witness_alerts.jsonl; \ | ||
| fi; \ | ||
| exit $VERIFY_EXIT |
There was a problem hiding this comment.
🔴 set -e in GitHub Actions prevents alert_on_failure.py from ever running
The verify-feed-integrity job step uses python ... --verify 2>/tmp/verify_stderr.txt; VERIFY_EXIT=$?; to capture the exit code, then conditionally runs alert_on_failure.py. However, GitHub Actions runs bash with set -eo pipefail by default. When the verify command exits non-zero, set -e immediately terminates the shell before VERIFY_EXIT=$? executes, so alert_on_failure.py never runs — the exact scenario it was designed for. Confirmed with bash --noprofile --norc -eo pipefail -c 'python3 -c "import sys; sys.exit(1)"; echo unreachable' which exits without printing.
Fix pattern
Replace cmd; VERIFY_EXIT=$? with cmd || VERIFY_EXIT=$? or wrap in if ! cmd; then ...; fi to avoid set -e termination.
| python tools/state_witness/generate_feed_entry.py --verify \ | |
| 2>/tmp/verify_stderr.txt; \ | |
| VERIFY_EXIT=$?; \ | |
| if [ $VERIFY_EXIT -ne 0 ]; then \ | |
| python tools/state_witness/alert_on_failure.py \ | |
| --exit-code $VERIFY_EXIT \ | |
| --stderr-file /tmp/verify_stderr.txt \ | |
| --alerts-file state_witness_alerts.jsonl; \ | |
| fi; \ | |
| exit $VERIFY_EXIT | |
| python tools/state_witness/generate_feed_entry.py --verify | |
| 2>/tmp/verify_stderr.txt | |
| && VERIFY_EXIT=0 || VERIFY_EXIT=$?; | |
| if [ $VERIFY_EXIT -ne 0 ]; then | |
| python tools/state_witness/alert_on_failure.py \ | |
| --exit-code $VERIFY_EXIT \ | |
| --stderr-file /tmp/verify_stderr.txt \ | |
| --alerts-file state_witness_alerts.jsonl; | |
| fi; | |
| exit $VERIFY_EXIT |
Was this helpful? React with 👍 or 👎 to provide feedback.
| "authority": "@aidoruao / Orthogonal Engineering Framework", | ||
| "standard": "Yeshua", | ||
| "last_updated": "2026-04-18", | ||
| "total_standards": 33 |
There was a problem hiding this comment.
🟡 STANDARDS_REGISTRY.json _meta.total_standards is 33 but actual count is 45
_meta.total_standards is hardcoded to 33 but the standards array contains 45 entries. This incorrect metadata is surfaced by tools/standards_check.py:368 which prints meta.get('total_standards', len(standards)) — it reads the stale metadata value of 33 instead of the actual count. Users running python tools/standards_check.py see "Total standards : 33" when there are 45.
| "total_standards": 33 | |
| "total_standards": 45 |
Was this helpful? React with 👍 or 👎 to provide feedback.
| # git_ref is resolved but intentionally excluded from the returned dict. | ||
| # The ledger schema has exactly 8 columns; git_ref is not one of them. | ||
| # Resolving it here keeps fallback logic in one place for callers that need it. | ||
| _resolved_git_ref = git_ref if git_ref is not None else _get_git_ref() | ||
| _ = _resolved_git_ref # not stored in entry to avoid ghost-field confusion |
There was a problem hiding this comment.
📝 Info: git_ref removal from build_feed_entry is a deliberate API contract change
The removal of git_ref from the build_feed_entry() return dict (tools/state_witness/generate_feed_entry.py:175-184) is an intentional breaking change to the function's API contract. The old code included git_ref in the returned dict; the new code resolves it but discards it (_ = _resolved_git_ref). A new test test_entry_does_not_contain_git_ref at tests/test_pr40_state_witness.py:112-122 explicitly asserts this. Since git_ref was never written to the AGENT_FEED.md ledger (it wasn't in the 8-column schema), this is a cleanup that removes a ghost field. I searched all callers and confirmed no code outside tests ever accessed entry["git_ref"] from the returned dict, so this is safe.
Was this helpful? React with 👍 or 👎 to provide feedback.
| CONTEXT_WINDOWS: dict[str, int] = { | ||
| "copilot": 128_000, | ||
| "claude": 200_000, | ||
| "devin": 200_000, |
There was a problem hiding this comment.
🚩 Context window for Devin is inconsistent across code and documentation
The Devin AI context window is listed as 200_000 tokens in tools/onboard_agent.py:58 and tools/context_window_estimator.py:50, but AGENT_CAPABILITIES_MATRIX.md:15 says "~128k tokens" and .ai_registry.json agents array lists "context_window": 128000. This inconsistency means the token budget estimator tools will use 200k as Devin's budget (allowing more files to be read) while the documentation advises 128k. Not a code logic bug per se, but could lead to agents consuming more context than the actual Devin session supports.
Was this helpful? React with 👍 or 👎 to provide feedback.
| try: | ||
| result = subprocess.run( | ||
| cmd, | ||
| shell=True, | ||
| capture_output=True, | ||
| text=True, | ||
| cwd=str(REPO_ROOT), | ||
| timeout=60, |
There was a problem hiding this comment.
📝 Info: shell=True in standards_check.py run_enforcement is intentional but carries injection risk
tools/standards_check.py:222 uses shell=True in subprocess.run() to execute enforcement commands from STANDARDS_REGISTRY.json. This is intentional because enforcement commands like grep -rn 'float(\|isclose(' src/domains/*/invariants.py require shell glob expansion and piping. The risk is mitigated by the fact that the commands come from a repository-controlled JSON file (not user input). However, if an attacker ever gains write access to STANDARDS_REGISTRY.json, they could inject arbitrary shell commands via the enforcement_command field. This is an accepted design trade-off for the current trusted-repo model.
Was this helpful? React with 👍 or 👎 to provide feedback.
|
|
||
| from tools.agent_health_check import ( # noqa: E402 | ||
| check_consent_log as hc_check_consent_log, | ||
| check_feed_integrity, |
There was a problem hiding this comment.
📝 Info: check_feed_integrity import shadowed in test_tier3_tools.py — no functional impact
tests/test_tier3_tools.py:38 imports check_feed_integrity from tools.onboard_agent, then tests/test_tier3_tools.py:611 imports check_feed_integrity from tools.agent_health_check without an alias. The second import silently shadows the first. This has no functional impact because the onboard_agent version is only used indirectly via run_env_checks(), and the direct call at line 759 correctly uses the agent_health_check version. Still, it would be cleaner to alias one of them (similar to how check_python_version is aliased as hc_check_python_version).
Was this helpful? React with 👍 or 👎 to provide feedback.
| concurrency: | ||
| group: state-witness-write | ||
| cancel-in-progress: false |
There was a problem hiding this comment.
📝 Info: Workflow concurrency group addition is a sound fix for race conditions
The addition of concurrency: group: state-witness-write with cancel-in-progress: false at .github/workflows/pr40-canonical-presence.yml:37-39, combined with fetch-depth: 0 (line 55) and git pull --rebase origin main (line 95), is a well-structured fix for the concurrent push race condition. The serialization group ensures only one write-back runs at a time, the full clone prevents non-fast-forward rejections, and the rebase incorporates any concurrent changes. The cancel-in-progress: false is correctly chosen to ensure every push eventually gets a witness entry rather than being cancelled.
Was this helpful? React with 👍 or 👎 to provide feedback.
| "passing_domains": 170, | ||
| "total_domains": 170 |
There was a problem hiding this comment.
📝 Info: Popperian audit report domain count jumped from 162 to 170 — new domains added
The audit report summary changed from 162 to 170 passing domains (audit/POPPERIAN_AUDIT_REPORT.json:9711-9712). The diff adds findings for 10 new d_arxiv_* domains (e.g., d_arxiv_case_grounded_evidence, d_arxiv_do_we_really, etc.) and removes findings for d_intl_criminal and d_intl_humanitarian (which appear to have been consolidated into the existing d_international_criminal and d_international_humanitarian domains — the check_case_has_evidence function was added to d_international_criminal and check_harm_objective_ratio to d_international_humanitarian). Net change: +10 arxiv domains, -2 merged domains, +1 function added = 162 + 8 = 170. This is consistent.
Was this helpful? React with 👍 or 👎 to provide feedback.
Summary
Six-stream implementation based on Devin gap analysis (2026-04-17) and Copilot chain-integrity analysis of commit
a27ff75. Closes all 25 gaps from the gap analysis.Stream A — State Witness CI Fixes (P0/P1/P2/P3/P4)
Addresses all critical/high findings from the chain-integrity analysis.
P0: Race condition and shallow-clone write-back (OPS-1, OPS-2, OPS-3)
.github/workflows/pr40-canonical-presence.ymlconcurrency: group: state-witness-write, cancel-in-progress: false— serializes write-backs so concurrent pushes tomainno longer produce non-fast-forward rejectionsfetch-depth: 1→fetch-depth: 0in bothstate-witnessandverify-feed-integrityjobsgit pull --rebase origin mainbeforegit pushin the write-back step as a second line of defenseAssert row count increasedstep after write-back (monotonic guard, OBS-3)P1: Ghost field + unknown SHA (INT-1, INT-2)
tools/state_witness/generate_feed_entry.pygit_reffrom thebuild_feed_entry()return dict. The field was silently discarded at write time (_entry_to_rownever included it) while present in memory, creating an ambiguous producer/consumer contract. The docstring now explicitly documents thatgit_refis intentionally excluded from the 8-column ledger schema.tests/test_pr40_state_witness.py— 40 tests (was 32), all passtest_entry_does_not_contain_git_ref— explicitly asserts the ghost field is absent (INT-1)TestUnknownCommitSHA— documents INT-2:is_duplicate()always returnsFalsefor"unknown"commit SHA, which allows unbounded row append in broken CI environmentsTestMassBootstrap— covers the E12 bootstrap scenario (commita27ff75shipped 183 pre-populated rows; this test verifies that pattern works correctly) and the INT-3 genesis rowprev_entry_hash=""sentinelP3: Failure alerting (OBS-1, OBS-2)
tools/state_witness/alert_on_failure.pyverify-feed-integrityjob inpr40-canonical-presence.ymlP4: Chain gap documentation (INT-003)
docs/AGENT_FEED_NOTES.mdprev_entry_hash=""sentinel and any known chain gapsSTANDARDS_REGISTRY.jsonStream B — Platform / IDE Config Files
Tier 1 (Agent platform files)
CLAUDE.mdDEVIN.md.cursorrules.windsurfrules.aider.conf.yml.aiderignoreAGENT_FEED.md, consent log, and state files.continue/config.jsoncline_docs/README.mdTier 2 (Dev environment files)
.devcontainer/devcontainer.json.vscode/settings.json.vscode/extensions.json.editorconfigmypy.ini--strictmypy configuration wired to.vscode/settings.jsonStream C — Standards Registry
STANDARDS_REGISTRY.json45 machine-readable standards (30 original + T3-001..T3-005 tool standards + T4-001..T4-003 documentation standards + T5-001..T5-005 IDE/MCP/registry standards + INT-003 chain gap standard), each with:
id(e.g.CS-001,YS-003,WF-002,T3-001,T4-001,T5-001)category(yeshua_axiom, code_standard, behavioral_constraint, documentation_register, quality_gate, workflow_constraint, integrity, tool_standard)rule— human-readable rule statementenforcement_command— shell command to verify the rule (or null for manual checks)enforcement_passes_on—"exit_zero"/"no_output"/"has_output"(semantics for grep-based checks)falsifies_if— Popperian falsifiability conditionseverity— critical / high / medium / lowscope— glob pattern for applicable pathstools/standards_check.pyCLI query and verification tool:
Verified PASS: workflow_constraint 3/3, integrity 2/2, tool_standard 5/5, documentation standards 3/3, IDE standards 5/5.
Stream D — Tier 3 Agent Tooling (gaps #13–17)
Five fully functional tools — no stubs, no
passbodies,Fraction-only arithmetic,mypy --strictcompatible, all with pytest tests.tools/onboard_agent.pyUnified onboarding script. Detects agent type via
--agent copilot|claude|devin|kimi|aider|cursor|windsurf|cline|continueorOE_AGENTenv var. Runs env checks (Python version, required files, consent log, feed integrity), loadsSTANDARDS_REGISTRY.jsonfiltered by--scope, and outputs a context block to stdout.tools/context_window_estimator.pyEstimates token cost of reading files or directories. Uses
Fraction(chars) / Fraction(4)for exact heuristic arithmetic (tiktoken optional). Knows context window sizes for each supported agent (Kimi=220k, Claude=200k, Copilot=128k). Exits 1 if--budget Nis exceeded.tools/since_last_session.pySession catch-up for returning agents. Two modes:
--since-commit <SHA>and--since-row <N>. Reports new commits, changed files, new domains, new standards, and new feed rows.tools/arxiv_paper_template.pyGenerates a LaTeX arXiv submission template from live repo metrics. Pulls domain count, test count, axiom count, standards count, Merkle root, feed row count, and Popperian audit pass rate.
tools/agent_health_check.pyUnified health check with exit code 0 (healthy) / 1 (failures found). Nine checks return
(bool, ProofObject). Supports--fastand--json.Tests
tests/test_tier3_tools.py— 55 tests, 55 pass (0 failures, 0 skips).Stream E — Tier 4 Documentation (gaps #18–20)
Three technical-register documentation files — YAML frontmatter with
tagsandregister: technical, no liturgical prose.GLOSSARY.md(Gap #18)50-entry machine-readable glossary. Covers all 6 Σ_theo operators, all role terms, all architectural patterns, forbidden anti-patterns, and infrastructure terms. Format:
| Term | Engineering Definition | Mathematical Definition | Source File | falsifies_if |AGENT_CAPABILITIES_MATRIX.md(Gap #19)13-agent capability table. Includes multi-agent workflow triangle section (Devin=planning, Kimi=execution, Copilot=review, NotebookLM=memory), context window planning guidance, and per-agent onboarding commands.
CROSS_REPO_INSTRUCTIONS.md(Gap #20)Cross-repository instructions covering all 3 repositories (orthogonal-engineering, sigma-lora-covenant, truthsystems-mod). Includes GAP-4 history + regression prevention, cross-repo Merkle binding, and standards applicability table.
Tests
tests/test_tier4_docs.py— 12 tests with PopperianFalsifies if:docstrings.Stream F — Tier 5 IDE Integration (gaps #21–25)
.vscode/launch.json(Gap #21)12 debug configurations: Kingdom OS Entry, Popperian Audit, Bar Exam, Merkle Verification, Agent Health Check, Standards Check, Feed Integrity Check, Onboard Agent (Copilot), All Tests, OE Engine CLI, Context Estimator, and more. All use
"python"type,"${workspaceFolder}"as cwd,"justMyCode": false..vscode/tasks.json(Gap #22)14 task definitions grouped into
build,test, andverifygroups: Popperian Audit, Standards Check, Feed Integrity, Health Check, Merkle Verify, Pytest (all / fast), Context Window, Since Last Session, arXiv Template, Onboard Agent variants, and Float Violations check.pyrightconfig.json(Gap #23)Pyright strict-mode configuration matching
mypy --strict. Includes 7 source directories,typeCheckingMode: strict, andpythonVersion: "3.11".mcp/README.md+mcp/oe-basic.mcp.json(Gap #24)MCP server descriptor with 7 tools (
popperian_audit,standards_check,feed_integrity,health_check,onboard_agent,context_estimator,since_last_session) and 6 resources (standards registry, feed, glossary, Merkle root, consent log, cross-repo instructions)..ai_registry.json(Gap #25)Updated with
agentsarray (13 agents),wardensarray (9 wardens), and addedstandards_registry,glossary, andcross_repo_instructionstop-level references.Tests
tests/test_tier5_ide.py— 32 tests, 32 pass (0 failures, 0 skips).Verification
Consent Log
pr47_stewardship/witness/consent_log.jsonl— four entries appended:2026-04-18T02:27:01Z— Stream A/B/C (candidate_id: copilot-pr-stream-abc-20260418)2026-04-18T04:xx:xxZ— Tier 3 tooling (candidate_id: copilot-tier3-tools-20260418)2026-04-18T04:xx:xxZ— Tier 4 documentation (candidate_id: copilot-tier4-docs-20260418)2026-04-18T04:xx:xxZ— Tier 5 IDE integration + P3/P4 (candidate_id: copilot-tier5-20260418)