
audit(stage-a): hashed investigative taxonomy + gap-analysis JSONL #141

Merged
aidoruao merged 4 commits into main from devin/1776650369-stage-a-hashed-taxonomy
Apr 20, 2026

Conversation


devin-ai-integration Bot commented Apr 20, 2026

Summary

Stage A of the "finish everything" multi-PR rollout: produce a deterministic, hash-anchored investigative taxonomy of the repository's outstanding engineering debt, mapped into six namespaces.

New: tools/generate_hashed_taxonomy.py — walks the repo and emits a JSONL audit keyed by six namespaces (aerospace, floor, yeshua, math_popperian, secular, projection) and nine issue types (TODO, FIXME, HACK, STUB_PASS, STUB_NOTIMPL, FLOAT_CALL, FLOAT_ANNOT, CHECK_MISSING_PROOFOBJECT, CHECK_MISSING_FALSIFIES_IF_PAIR). Every entry carries a sha256_evidence + entry_sha256, and the top-level summary carries audit_sha256 committed over (summary, ordered_entry_hashes) — content-only, so the commitment is stable across output paths.
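The hash chain described above can be sketched in a few lines (a minimal sketch: the helper names and the abbreviated entry fields here are illustrative, not the tool's exact code):

```python
import hashlib
import json

def canonical_json(obj):
    # Sorted keys + fixed separators + ASCII: byte-identical for equal content.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True)

def sha256_text(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# One abbreviated entry (real entries carry more fields; path is hypothetical).
entry = {
    "path": "pkg/example.py",
    "line": 12,
    "issue_type": "TODO",
    "evidence": {"snippet": "# TODO: fix"},
}
entry["sha256_evidence"] = sha256_text(entry["evidence"]["snippet"])
# Self-referential hash: covers every field except itself.
entry["entry_sha256"] = sha256_text(canonical_json(entry))

# Top-level commitment over (summary, ordered entry hashes) only, so it is
# stable across output paths and timestamps.
summary = {"issue_count_total": 1}
audit_sha256 = sha256_text(canonical_json(
    {"summary": summary, "ordered_entry_hashes": [entry["entry_sha256"]]}
))
```

Because nothing outside `summary` and the ordered entry hashes enters the payload, re-running against the same tree reproduces the same commitment.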

Artifacts committed under audits/:

  • audits/hashed_taxonomy_20260420.jsonl — 3230 entries, one per finding
  • audits/gap_analysis_20260420.json — summary, audit_sha256 = 7c32fdcad0f1fc02019eb8a1034f7207b97bbb44ffe3b14f2452513b432ebae9

Run against current tree:

files_scanned        5027
issue_count_total    3230
classified_fraction  941/1615
by_namespace         aerospace=188  floor=127  math_popperian=1483  projection=121  secular=89  yeshua=908
by_type              TODO=275 FIXME=3 HACK=1 STUB_PASS=335 STUB_NOTIMPL=75 FLOAT_CALL=333 FLOAT_ANNOT=1032
                     CHECK_MISSING_PROOFOBJECT=413 CHECK_MISSING_FALSIFIES_IF_PAIR=763

Also fixed (incidental): STANDARDS_REGISTRY.json had a duplicated "total_standards" line with no comma separator, which made python tools/standards_check.py --list crash with JSONDecodeError. One-line fix; registry now parses cleanly and --verify runs (and shows separately tracked real failures that Stage B will address).

Consent log entry: appended per SOP-AI-HANDSHAKE-1.0 before any repo-scope change.

Review & Testing Checklist for Human

  • Skim audits/gap_analysis_20260420.json — confirm the namespace/type counts look like a reasonable picture of current debt, not an over- or under-count you'd dispute
  • Spot-check 3–5 entries in audits/hashed_taxonomy_20260420.jsonl by running grep '<path>' audits/hashed_taxonomy_20260420.jsonl | head — verify the line and snippet match what's actually in the file
  • Re-run python tools/generate_hashed_taxonomy.py --out /tmp/verify.jsonl --summary /tmp/verify.json and confirm audit_sha256 matches 7c32fdcad0f1fc02019eb8a1034f7207b97bbb44ffe3b14f2452513b432ebae9
  • Confirm python tools/standards_check.py --list no longer crashes with JSONDecodeError on the STANDARDS_REGISTRY.json total_standards bug
  • Test plan: python -m pytest tests/test_hashed_taxonomy.py -v (8 tests, all pass locally) — covers canonical JSON, sha256 format, line/function scanners, determinism

Notes

cc @aidoruao

Link to Devin session: https://app.devin.ai/sessions/36c540710d5c487ab6c5f61be5879aa3



@devin-ai-integration
Contributor Author

Original prompt from Tony

finish everything "Skip to content
aidoruao
orthogonal-engineering
Repository navigation
Code
Issues
13
(13)
Pull requests
17
(17)
Agents
Discussions
Actions
Projects
Wiki
Security and quality
Insights
Settings
Commits
Branch selector
User selector
Datepicker
Commit History
Commits on Apr 19, 2026
chore(pr40): append state witness entry [skip ci]
github-actions[bot]
github-actions[bot]
committed
2 minutes ago
Add files via upload
aidoruao
aidoruao
authored
2 minutes ago
·
Verified
chore(pr40): append state witness entry [skip ci]
github-actions[bot]
github-actions[bot]
committed
43 minutes ago
Kimi Code 9184eeb4-ffb0-472d-9a81-c9babff132e5 1a 4-19-26.txt
aidoruao
aidoruao
authored
44 minutes ago
·
Verified
Commits on Apr 18, 2026
chore(pr40): append state witness entry [skip ci]
github-actions[bot]
github-actions[bot]
committed
yesterday
gpt 5 mini copilot coding tasks for kimi cli queue 4-19-26 1a.txt
aidoruao
aidoruao
authored
yesterday
·
Verified
chore(pr40): append state witness entry [skip ci]
github-actions[bot]
github-actions[bot]
committed
yesterday
gpt 5 mini copilot gap anayslsis audit 1a 4-18-26.txt
aidoruao
aidoruao
authored
yesterday
·
Verified
chore(pr40): append state witness entry [skip ci]
github-actions[bot]
github-actions[bot]
committed
yesterday
Kimi Code 597e0d23-f404-4bdf-801f-64962ce0e722 2a-4a 4-18-26.txt
aidoruao
aidoruao
authored
yesterday
·
Verified
chore(pr40): append state witness entry [skip ci]
github-actions[bot]
github-actions[bot]
committed
yesterday
Merge pull request #140 from aidoruao/kimi/pr139-aerospace-floor-polymath-domains
aidoruao
aidoruao
authored
yesterday
·
Verified
Merge branch 'main' into kimi/pr139-aerospace-floor-polymath-domains
aidoruao
aidoruao
authored
yesterday
·
Verified
feat(ai-credit): Task 5-8 — AI co-author, CI, standards, onboarding
aidoruao
aidoruao
committed
yesterday
·
feat(forensic): Task 3 + Task 4 — forensic commit generator + verification testimony
aidoruao
aidoruao
committed
yesterday
·
feat(stat... (2455 chars truncated...)

@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

Contributor Author

devin-ai-integration Bot left a comment


Devin Review found 7 potential issues.

Open in Devin Review

Comment thread tools/generate_hashed_taxonomy.py Outdated
Comment thread tools/generate_hashed_taxonomy.py
Comment thread STANDARDS_REGISTRY.json
"standard": "Yeshua",
"last_updated": "2026-04-18",
"last_updated": "2026-04-20",
"total_standards": 59
Contributor Author

devin-ai-integration Bot Apr 20, 2026


📝 Info: STANDARDS_REGISTRY.json had invalid duplicate key — correctly fixed

The old file had both "total_standards": 59 (line 8) and "total_standards": 58 (line 9): a duplicate JSON key whose handling RFC 8259 leaves unspecified (most parsers keep the last value, so once parseable it would have read as 58). The missing comma between those two lines is what made the file outright malformed and crashed --list. The PR removes the duplicate and restores the comma, leaving only "total_standards": 59, which matches the actual count of 59 "id": entries in the standards array.
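The last-value-wins behavior described above is easy to reproduce with Python's json module (a minimal repro; the actual registry file is much larger):

```python
import json

# Duplicate object keys: RFC 8259 leaves the handling unspecified, and
# CPython's json keeps the LAST value, so the old registry read as 58.
doc = json.loads('{"total_standards": 59, "total_standards": 58}')
print(doc["total_standards"])  # 58
```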



Contributor Author


Good catch — the semantic change was intentional and matches reality:

$ python -c "import json; d=json.load(open('STANDARDS_REGISTRY.json')); print(len(d['standards']))"
59

The actual standards array contains 59 entries. The pre-existing dict had both "total_standards": 59 (line 8) and "total_standards": 58 (line 9), with the stray 58 being a leftover from an earlier edit. Python's json module silently keeps the last value, so the file previously reported 58 standards while actually containing 59 — a drift bug. Removing the duplicate key reconciles the metadata with ground truth.

Comment thread tools/generate_hashed_taxonomy.py Outdated
Comment thread tools/generate_hashed_taxonomy.py Outdated
Comment thread tools/generate_hashed_taxonomy.py Outdated
Comment on lines +437 to +449
doc = {
    "schema": "OE-GAP-ANALYSIS-1.0",
    "generated_at_utc": datetime.now(tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    "repo_root": str(REPO_ROOT),
    "summary": summary,
    "classified_fraction": f"{frac_classified.numerator}/{frac_classified.denominator}",
    "audit_sha256": audit_sha256,
    "jsonl_path": jsonl_label,
    "entry_count": len(entries),
}
with out_path.open("w", encoding="utf-8") as fh:
    fh.write(json.dumps(doc, sort_keys=True, indent=2, ensure_ascii=True) + "\n")
return audit_sha256
Contributor Author

devin-ai-integration Bot Apr 20, 2026


📝 Info: audit_sha256 commitment boundary is well-designed but relies on summary dict not being mutated

The _write_summary function computes audit_sha256 from {"summary": summary, "ordered_entry_hashes": [...]} at line 447, then embeds that sha into the output doc. The summary dict is passed by reference from build_entries. If anything modified the summary dict between build_entries returning and _write_summary computing the hash, the commitment would be broken. Currently main() doesn't mutate it, so this is safe — but it's a fragile invariant that could break if future code modifies the summary before writing.
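One way to harden that invariant (a hedged sketch under assumed names, not the PR's actual code) is to snapshot the summary at the commitment boundary, so a later mutation by the caller cannot desynchronize the hash from the written payload:

```python
import copy
import hashlib
import json

def canonical_json(obj):
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True)

def commit_summary(summary, ordered_entry_hashes):
    # Deep-copy at the boundary: the commitment is taken over the snapshot,
    # and only the snapshot should be written to disk afterwards.
    frozen = copy.deepcopy(summary)
    payload = {"summary": frozen, "ordered_entry_hashes": list(ordered_entry_hashes)}
    digest = hashlib.sha256(canonical_json(payload).encode("utf-8")).hexdigest()
    return digest, frozen

summary = {"issue_count_total": 2}
digest, frozen = commit_summary(summary, ["aa", "bb"])
summary["issue_count_total"] = 999      # caller mutates after the fact
print(frozen["issue_count_total"])      # 2 -- the committed snapshot is unaffected
```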



Contributor Author


Fixed. The top-level generated_at_utc, repo_root, and jsonl_path fields made it ambiguous which parts of the document were covered by audit_sha256. New structure makes the commitment boundary explicit:

{
  "schema": "OE-GAP-ANALYSIS-1.0",
  "summary": {...},
  "classified_fraction": "941/1615",
  "audit_sha256": "34ed7b25...",
  "entry_count": 11027,
  "metadata": {
    "generated_at_utc": "2026-04-20T02:50:12Z",
    "jsonl_path": "audits/hashed_taxonomy_20260420.jsonl",
    "not_covered_by_audit_sha256": ["generated_at_utc", "jsonl_path"]
  }
}
  • repo_root dropped entirely — it was leaking absolute CI paths, not useful for consumers.
  • generated_at_utc + jsonl_path moved under metadata with an explicit not_covered_by_audit_sha256 list that names them.
  • Added test_gap_analysis_metadata_is_outside_commitment as a falsifier that asserts those fields are not at the top level and not_covered_by_audit_sha256 is present.

Two fresh runs now both produce audit_sha256 = 34ed7b25249c045c8274fe2b969986bc5a62839791581a59b86fb74bd5e5e3dd, unaffected by timestamp or output path.

devin-ai-integration Bot added a commit that referenced this pull request Apr 20, 2026
…, drop dead IssueHit, fix namespace sum, segregate non-deterministic metadata

Review feedback on PR #141:

1. 🔴 assert statement in tools/generate_hashed_taxonomy.py:282 violated
   .cursorrules / CLAUDE.md 'No assert' rule. Replaced with a single
   walrus-style assignment that also removes the redundant double regex
   search the other reviewer flagged.
2. 🔴 _RE_ASSERT regex was defined but no scanner emitted the issue
   type. Renamed to ASSERT_USE (the rule applies to all modules, not
   just check_* surface), gated on is_python to avoid false positives
   in prose, and wired into _scan_line_level. 8012 real assert hits
   surfaced across Python sources.
3. 📝 Removed the dead IssueHit dataclass + its dataclass import.
4. 📝 Added 'unclassified' to counts_by_namespace so per-namespace
   counts can sum to at least issue_count_total (they previously silently
   dropped unclassified hits).
5. 📝 Moved generated_at_utc / jsonl_path into a dedicated 'metadata'
   subkey with an explicit 'not_covered_by_audit_sha256' list, and
   dropped the non-portable repo_root field entirely.

Tests: 11 pass (added test_line_level_scanner_skips_python_only_patterns_for_non_python,
test_namespace_counts_account_for_unclassified,
test_gap_analysis_metadata_is_outside_commitment).

Regenerated artifacts: audit_sha256 = 34ed7b25249c045c8274fe2b969986bc5a62839791581a59b86fb74bd5e5e3dd
(deterministic across two runs; files_scanned=5028, issue_count_total=11027).

Co-Authored-By: Tony Ha <aidoruao@gmail.com>
Contributor Author

devin-ai-integration Bot left a comment


Devin Review found 5 new potential issues.


Comment thread tools/generate_hashed_taxonomy.py Outdated
Comment thread tools/generate_hashed_taxonomy.py Outdated
Comment thread tools/generate_hashed_taxonomy.py Outdated
Comment thread tools/generate_hashed_taxonomy.py
Comment on lines +278 to +288
if is_python:
    if _RE_STUB_PASS.match(line):
        out.append((idx, "STUB_PASS", snippet))
    if _RE_STUB_NOTIMPL.search(line):
        out.append((idx, "STUB_NOTIMPL", snippet))
    if _RE_FLOAT_CALL.search(line):
        out.append((idx, "FLOAT_CALL", snippet))
    if _RE_FLOAT_ANNOT.search(line):
        out.append((idx, "FLOAT_ANNOT", snippet))
    if _RE_ASSERT.match(line):
        out.append((idx, "ASSERT_USE", snippet))
Contributor Author

devin-ai-integration Bot Apr 20, 2026


📝 Info: The is_python gate correctly prevents false positives on non-Python files — a meaningful improvement

The prior version of _scan_line_level (visible in the git diff) applied all pattern detectors to all file types, which would flag pass, assert, float(, and NotImplementedError in Markdown prose, YAML configs, and JSON data files. The new is_python flag at tools/generate_hashed_taxonomy.py:278 correctly gates Python-only patterns, and the test test_line_level_scanner_skips_python_only_patterns_for_non_python at tests/test_hashed_taxonomy.py:82 validates this. The TODO/FIXME/HACK patterns remain ungated since they are meaningful across all text file types.
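The gating pattern reviewed above can be sketched as follows (simplified regexes and a reduced issue set for illustration; the real scanner covers more patterns):

```python
import re

_RE_MARKER = re.compile(r"\b(TODO|FIXME|HACK)\b")   # meaningful in any text file
_RE_STUB_PASS = re.compile(r"^\s*pass\s*$")          # Python-only
_RE_FLOAT_CALL = re.compile(r"\bfloat\(")            # Python-only

def scan_line(line, is_python):
    hits = []
    m = _RE_MARKER.search(line)
    if m:
        hits.append(m.group(1))
    if is_python:  # the gate: Python-only patterns never fire on prose or config
        if _RE_STUB_PASS.match(line):
            hits.append("STUB_PASS")
        if _RE_FLOAT_CALL.search(line):
            hits.append("FLOAT_CALL")
    return hits

print(scan_line("    pass", True))      # ['STUB_PASS']
print(scan_line("    pass", False))     # []
print(scan_line("# TODO: doc", False))  # ['TODO']
```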



Contributor Author


Acknowledged — thanks for calling out the is_python gate explicitly. The test at tests/test_hashed_taxonomy.py:82 (test_line_level_scanner_skips_python_only_patterns_for_non_python) locks the invariant that STUB_PASS / STUB_NOTIMPL / FLOAT_CALL / FLOAT_ANNOT / ASSERT_USE are suppressed on non-Python files while TODO / FIXME / HACK remain active across all text types. No action needed here.

Contributor Author

devin-ai-integration Bot left a comment


Devin Review found 10 new potential issues.


Comment thread tools/generate_hashed_taxonomy.py Outdated
Comment thread tools/generate_hashed_taxonomy.py Outdated
Comment thread tools/generate_hashed_taxonomy.py
Comment thread tools/generate_hashed_taxonomy.py
Comment thread tools/generate_hashed_taxonomy.py
Comment on lines +34 to +35
assert a == b
assert a == '{"a":1,"b":2}'
Contributor Author


📝 Info: Assert usage in test file is consistent with existing repo patterns

The rule files say "No assert" but also say "Tests runnable with pytest." Pytest's fundamental mechanism IS assert statements. I verified that existing tests like tests/test_epistemic_advanced.py (14 assert statements) and tests/test_falsification.py use assert freely. This is an established pattern in the repo that overrides the literal rule text for test files specifically. Not flagging as a bug.



Contributor Author


Acknowledged — tests intentionally use assert (standard pytest idiom and consistent with every other test file under tests/). The repo-wide "no assert" rule applies to production code, which is enforced by tools/standards_check.py (CS-004) against non-test paths. No change needed.

Comment on lines +388 to +400
entry: Dict[str, Any] = {
    "id": f"OE-TAX-{ev_sha[:12]}",
    "path": rel,
    "line": line_no,
    "issue_type": issue_type,
    "severity": ISSUE_SEVERITY.get(issue_type, "low"),
    "namespaces": list(ns_record),
    "evidence": {"snippet": snippet},
    "sha256_evidence": ev_sha,
    "file_sha256": file_sha,
}
entry["entry_sha256"] = _sha256_text(_canonical_json(entry))
entries.append(entry)
Contributor Author

devin-ai-integration Bot Apr 20, 2026


📝 Info: entry_sha256 self-referential hash pattern is correctly implemented

At tools/generate_hashed_taxonomy.py:399, the entry_sha256 is computed by hashing the canonical JSON of the entry dict BEFORE entry_sha256 is added to it. This is the standard self-referential commitment pattern: the hash covers all fields except itself, and verification works by removing entry_sha256, re-canonicalizing, and re-hashing. The implementation is correct and deterministic.
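The verification side of that pattern can be sketched as (assuming the same canonicalization as the generator; `verify_entry` is an illustrative name, not a function in the PR):

```python
import hashlib
import json

def canonical_json(obj):
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True)

def verify_entry(entry):
    # Remove the self-hash, re-canonicalize, re-hash, compare.
    claimed = entry.get("entry_sha256")
    body = {k: v for k, v in entry.items() if k != "entry_sha256"}
    return hashlib.sha256(canonical_json(body).encode("utf-8")).hexdigest() == claimed

entry = {"path": "x.py", "line": 1, "issue_type": "TODO"}
entry["entry_sha256"] = hashlib.sha256(canonical_json(entry).encode("utf-8")).hexdigest()
print(verify_entry(entry))   # True
entry["line"] = 2            # any tamper breaks the commitment
print(verify_entry(entry))   # False
```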



Contributor Author


Thanks — acknowledged. The self-referential commit does rely on _canonical_json (sorted keys, fixed separators, ensure_ascii=True) producing byte-identical output for the same logical content, and test_build_entries_is_deterministic pins that invariant by comparing sorted entry_sha256 sets across two runs.

Comment thread tools/generate_hashed_taxonomy.py Outdated
Comment thread tools/generate_hashed_taxonomy.py Outdated
Comment on lines +383 to +386
for line_no, issue_type, snippet in hits:
    counts[issue_type] = counts.get(issue_type, 0) + 1
    for ns in ns_record:
        counts_by_namespace[ns] = counts_by_namespace.get(ns, 0) + 1
Contributor Author

devin-ai-integration Bot Apr 20, 2026


📝 Info: Namespace counts are intentionally non-partitioning (multi-labeling)

In build_entries at lines 385-386, each issue increments ALL namespace counters for the file it's in. A file classified under both "yeshua" and "math_popperian" will count each issue in BOTH namespaces. This means sum(issue_count_by_namespace.values()) >= issue_count_total. The test test_namespace_counts_account_for_unclassified explicitly verifies this with assert ns_total >= total, confirming this is intentional multi-labeling behavior, not a counting bug.
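A toy illustration of the non-partitioning property (hypothetical file names and namespace labels):

```python
from collections import Counter

# Two findings; the first is multi-labeled into two namespaces.
findings = [
    ("a.py", ["yeshua", "math_popperian"]),
    ("b.py", ["aerospace"]),
]

by_namespace = Counter()
for _path, namespaces in findings:
    for ns in namespaces:
        by_namespace[ns] += 1

issue_count_total = len(findings)
# Non-partitioning: 3 namespace increments for 2 findings, so sum >= total.
print(sum(by_namespace.values()), issue_count_total)  # 3 2
```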



Contributor Author


Thanks — acknowledged. Multi-labeling is intentional: a single finding can belong to multiple namespaces (e.g., an aerospace floor invariant that also mentions Yeshua axioms). test_namespace_counts_account_for_unclassified asserts sum(namespace_counts) >= issue_count_total (note >=, not ==) to make the non-partitioning property a locked invariant rather than an accidental side effect. Happy to add a summary-document comment spelling this out in a follow-up Stage C pass if you'd like.

devin-ai-integration Bot added a commit that referenced this pull request Apr 20, 2026
…ANNOT return-type coverage

Devin Review round 3 findings on PR #141 (10 total comments):

BUG fixes (5):
- Add 'Falsifies if:' / 'falsifies_if:' docstring pair to _scan_line_level
- Add pair to _scan_check_function (with window-bleed invariant note)
- Add pair to build_entries (determinism + summary/entries agreement)
- Add pair to _write_summary (audit_sha256 determinism + metadata boundary)
- Add pair to main (clean-walk exit code + cross-run determinism)

FLAG fix (1):
- Broaden _RE_FLOAT_ANNOT from r':\s*float\b' to r'(?::|->)\s*float\b' so
  return-type annotations 'def f() -> float' are flagged alongside
  parameter/variable annotations. Add
  test_float_annot_regex_catches_return_type_annotations to lock the
  coverage invariant.

Cleanup (from ANALYSIS comment):
- Remove unused 'path' parameter from _scan_check_function; update the
  single internal caller and three call sites in tests. Pure dead-code
  removal; no behavior change.

Audit artifact regeneration:
- audit_sha256 deterministic across two consecutive runs:
  f73f70dc8ae70d990a47aba215ab6ed49d165ac0d8dc4cfe734048a1c5a45eb2
- issue_count_total: 11,082 -> 11,275 (+193 newly-detected '-> float'
  return-type annotations now covered by FLOAT_ANNOT).

Verification:
- pytest tests/test_hashed_taxonomy.py: 15/15 passing (was 14, +1 new).
- CS-003 (Falsifies-if title-case pair) passes on standards_check.
- _RE_FALS_TITLE remains case-sensitive (no IGNORECASE) per .cursorrules.

Co-Authored-By: Tony Ha <aidoruao@gmail.com>
devin-ai-integration Bot added a commit that referenced this pull request Apr 20, 2026
…n fix

CHECKPOINT_STAGES_A_THROUGH_G.md documents the full state of the 'finish
everything' campaign for cross-session continuity:

- Stage A (#141), B (#142), C (#143), F (#148), G (#149) — complete
- Stage D (housekeeping, 14 stale PRs + 13 bot issues) — pending
- Stage E (non-draft PR review for #91, #85, #26) — pending

The checkpoint lists exact resume commands, open threads, and the
five-command verification quartet that every resumed session should run
before taking new action.

STANDARDS_REGISTRY.json: drop a pre-existing duplicate 'total_standards'
key at lines 8-9 (59 vs 58) — broken JSON blocked standards_check --verify.
Kept the value (59), which matches the actual count of entries in the standards array.

Appended consent-log entry for this change.

Not enacting stages D/E in this session; resume from the checkpoint.

Co-Authored-By: Tony Ha <aidoruao@gmail.com>
devin-ai-integration Bot and others added 4 commits April 20, 2026 17:10
Adds tools/generate_hashed_taxonomy.py which walks the repo and emits
a deterministic JSONL audit of occurrences mapped to 6 namespaces
(aerospace, floor, yeshua, math_popperian, secular, projection) plus
issue markers (TODO/FIXME/HACK, pass/NotImplementedError stubs,
float() usages, check_* without Tuple[bool, ProofObject] return,
check_* missing both 'Falsifies if:' and 'falsifies_if:').

- Every entry carries a sha256 over canonical JSON of the entry
- Top-level audit_sha256 commits over (summary, ordered entry hashes);
  content-only, so two runs over same tree produce same commitment
- Run against current tree: audit_sha256=7c32fdcad0f1fc02019eb8a1034f7207b97bbb44ffe3b14f2452513b432ebae9
  files_scanned=5027 issue_count_total=3230

Also:
- Fixes STANDARDS_REGISTRY.json duplicate '"total_standards"' line
  that was making tools/standards_check.py --list/--verify crash
- Appends consent log entry per SOP-AI-HANDSHAKE-1.0

Per .cursorrules / CLAUDE.md: no float, Fraction classified ratio,
every check function carries Falsifies if: + falsifies_if: doc pair,
Tuple[bool, ProofObject]-compatible types, no stubs.

Co-Authored-By: Tony Ha <aidoruao@gmail.com>
…, drop dead IssueHit, fix namespace sum, segregate non-deterministic metadata

Review feedback on PR #141:

1. 🔴 assert statement in tools/generate_hashed_taxonomy.py:282 violated
   .cursorrules / CLAUDE.md 'No assert' rule. Replaced with a single
   walrus-style assignment that also removes the redundant double regex
   search the other reviewer flagged.
2. 🔴 _RE_ASSERT regex was defined but no scanner emitted the issue
   type. Renamed to ASSERT_USE (the rule applies to all modules, not
   just check_* surface), gated on is_python to avoid false positives
   in prose, and wired into _scan_line_level. 8012 real assert hits
   surfaced across Python sources.
3. 📝 Removed the dead IssueHit dataclass + its dataclass import.
4. 📝 Added 'unclassified' to counts_by_namespace so per-namespace
   counts can sum to at least issue_count_total (they previously silently
   dropped unclassified hits).
5. 📝 Moved generated_at_utc / jsonl_path into a dedicated 'metadata'
   subkey with an explicit 'not_covered_by_audit_sha256' list, and
   dropped the non-portable repo_root field entirely.

Tests: 11 pass (added test_line_level_scanner_skips_python_only_patterns_for_non_python,
test_namespace_counts_account_for_unclassified,
test_gap_analysis_metadata_is_outside_commitment).

Regenerated artifacts: audit_sha256 = 34ed7b25249c045c8274fe2b969986bc5a62839791581a59b86fb74bd5e5e3dd
(deterministic across two runs; files_scanned=5028, issue_count_total=11027).

Co-Authored-By: Tony Ha <aidoruao@gmail.com>
…docstring, narrow projection keywords

- _scan_check_function: truncate next-40-line window at the next def/class
  so an adjacent check_* cannot satisfy the current one's contract
- _RE_CHECK_DEF: anchor at line start with [ \t]* so match does not consume
  a preceding newline; line_no now points at the actual def line
- _RE_FALS_TITLE: drop re.IGNORECASE — title-case 'Falsifies if:' is
  mandatory per .cursorrules / CLAUDE.md / .windsurfrules
- _write_jsonl: update docstring to describe the true 4-tuple sort key
  (path, line, issue_type, entry_sha256)
- NAMESPACE_KEYWORDS.projection: drop bare 'projection' / 'mirror' in favor
  of compound keys (projected_namespace, projected_view, projected_domain,
  namespace_projection, mirror_namespace, derivative_witness) to stop
  over-classifying common English uses of the word

Tests added:
- test_check_function_window_does_not_bleed_into_adjacent_def
- test_falsifies_if_title_case_is_strictly_enforced
- test_projection_namespace_keywords_are_narrow

Audit artifacts regenerated. New audit_sha256 (deterministic across two runs):
  be134e9c1867d804eb5708ddb0058281f8203818549730b46634901e7ae0754c
issue_count_total 11027 -> 11082 (bleed + title-case + line_no fixes surface
55 additional real findings).

Co-Authored-By: Tony Ha <aidoruao@gmail.com>
…ANNOT return-type coverage

Devin Review round 3 findings on PR #141 (10 total comments):

BUG fixes (5):
- Add 'Falsifies if:' / 'falsifies_if:' docstring pair to _scan_line_level
- Add pair to _scan_check_function (with window-bleed invariant note)
- Add pair to build_entries (determinism + summary/entries agreement)
- Add pair to _write_summary (audit_sha256 determinism + metadata boundary)
- Add pair to main (clean-walk exit code + cross-run determinism)

FLAG fix (1):
- Broaden _RE_FLOAT_ANNOT from r':\s*float\b' to r'(?::|->)\s*float\b' so
  return-type annotations 'def f() -> float' are flagged alongside
  parameter/variable annotations. Add
  test_float_annot_regex_catches_return_type_annotations to lock the
  coverage invariant.

Cleanup (from ANALYSIS comment):
- Remove unused 'path' parameter from _scan_check_function; update the
  single internal caller and three call sites in tests. Pure dead-code
  removal; no behavior change.

Audit artifact regeneration:
- audit_sha256 deterministic across two consecutive runs:
  f73f70dc8ae70d990a47aba215ab6ed49d165ac0d8dc4cfe734048a1c5a45eb2
- issue_count_total: 11,082 -> 11,275 (+193 newly-detected '-> float'
  return-type annotations now covered by FLOAT_ANNOT).

Verification:
- pytest tests/test_hashed_taxonomy.py: 15/15 passing (was 14, +1 new).
- CS-003 (Falsifies-if title-case pair) passes on standards_check.
- _RE_FALS_TITLE remains case-sensitive (no IGNORECASE) per .cursorrules.

Co-Authored-By: Tony Ha <aidoruao@gmail.com>
devin-ai-integration Bot force-pushed the devin/1776650369-stage-a-hashed-taxonomy branch from 3aab8b1 to b0e2a8a on April 20, 2026 17:10
Contributor Author

devin-ai-integration Bot left a comment


Devin Review found 8 new potential issues.


Contributor Author


📝 Info: Existing test files don't follow Falsifies if: / falsifies_if: convention either

While the rule files mandate both Falsifies if: (title-case) and falsifies_if: (lowercase) in ALL docstrings, the established test convention in this repo does not follow this — tests/test_axioms.py has zero instances of either form across 30+ test functions, and tests/test_timeline_html.py similarly omits them. The new test file tests/test_hashed_taxonomy.py actually does better than existing tests, including the pair in 8 of its 14 test docstrings. The missing pairs in the remaining 6 test functions are consistent with (and better than) existing repo convention, which is why they were not flagged as bugs.



Comment on lines +483 to +491
frac_classified = Fraction(classified, total)
# ``metadata`` collects fields that are *intentionally* not covered by
# ``audit_sha256`` (timestamps, output paths). Separating them makes the
# commitment boundary visible to reviewers: only ``summary`` +
# ``ordered_entry_hashes`` participate in the commit payload.
doc = {
    "schema": "OE-GAP-ANALYSIS-1.0",
    "summary": summary,
    "classified_fraction": f"{frac_classified.numerator}/{frac_classified.denominator}",
Contributor Author


🟡 Fraction auto-reduction makes classified_fraction misrepresent actual counts

Fraction(classified, total) auto-reduces to lowest terms, so the string f"{frac_classified.numerator}/{frac_classified.denominator}" can produce misleading output. For example, if 10 of 20 entries are classified, the output is "1/2" instead of "10/20", while entry_count is 20. A reader would expect the denominator to match entry_count. The current run's output ("7158/11275") is coincidentally correct because GCD(7158, 11275)=1, but future runs with reducible counts will produce a fraction whose denominator doesn't match the adjacent entry_count field.

Suggested change
     frac_classified = Fraction(classified, total)
     # ``metadata`` collects fields that are *intentionally* not covered by
     # ``audit_sha256`` (timestamps, output paths). Separating them makes the
     # commitment boundary visible to reviewers: only ``summary`` +
     # ``ordered_entry_hashes`` participate in the commit payload.
     doc = {
         "schema": "OE-GAP-ANALYSIS-1.0",
         "summary": summary,
-        "classified_fraction": f"{frac_classified.numerator}/{frac_classified.denominator}",
+        "classified_fraction": f"{classified}/{total}",



def _canonical_json(obj: Any) -> str:
"""Return a deterministic canonical JSON serialisation (sorted keys, no BOM)."""

🔴 Docstring for _canonical_json missing mandatory Falsifies if: / falsifies_if: pair

Per .cursorrules, .windsurfrules, and CLAUDE.md, all docstrings must include both Falsifies if: (title-case) and falsifies_if: (lowercase). The _canonical_json function at tools/generate_hashed_taxonomy.py:218 has a docstring but is missing both required forms. Other functions in the same file (_sha256_text, _sha256_bytes, _scan_line_level, build_entries, etc.) correctly include the pair, making this an inconsistency within the file itself.

Suggested change
"""Return a deterministic canonical JSON serialisation (sorted keys, no BOM)."""
"""Return a deterministic canonical JSON serialisation (sorted keys, no BOM).
Falsifies if: the output is not identical for two dicts with the same key-value pairs.
falsifies_if: the output is not identical for two dicts with the same key-value pairs.
"""

Comment on lines +439 to +444
"""Write entries as JSONL, deterministically ordered by the 4-tuple
``(path, line, issue_type, entry_sha256)``.

The ``entry_sha256`` tail of the sort key ensures a stable total order
when two findings share the same (path, line, issue_type) coordinates.
"""

🔴 Docstring for _write_jsonl missing mandatory Falsifies if: / falsifies_if: pair

Per .cursorrules, .windsurfrules, and CLAUDE.md, all docstrings must include both Falsifies if: (title-case) and falsifies_if: (lowercase). The _write_jsonl function at tools/generate_hashed_taxonomy.py:439-444 has a multi-line docstring but is missing both required forms. Sibling functions like _write_summary and build_entries in the same file correctly include the pair.

Suggested change
"""Write entries as JSONL, deterministically ordered by the 4-tuple
``(path, line, issue_type, entry_sha256)``.
The ``entry_sha256`` tail of the sort key ensures a stable total order
when two findings share the same (path, line, issue_type) coordinates.
"""
"""Write entries as JSONL, deterministically ordered by the 4-tuple
``(path, line, issue_type, entry_sha256)``.
The ``entry_sha256`` tail of the sort key ensures a stable total order
when two findings share the same (path, line, issue_type) coordinates.
Falsifies if: the output file order differs across two calls with the same entries.
falsifies_if: the output file order differs across two calls with the same entries.
"""



def _namespaces_for_text(text_lower: str, path_lower: str) -> Tuple[str, ...]:
"""Return the sorted tuple of namespace labels matching ``text_lower``/``path_lower``."""

🔴 Docstring for _namespaces_for_text missing mandatory Falsifies if: / falsifies_if: pair

Per .cursorrules, .windsurfrules, and CLAUDE.md, all docstrings must include both Falsifies if: (title-case) and falsifies_if: (lowercase). The _namespaces_for_text function at tools/generate_hashed_taxonomy.py:260 has a docstring but is missing both required forms. This function is part of the core classification logic and is exercised directly in tests, making the missing falsification contract more impactful than for pure utility functions.

Suggested change
"""Return the sorted tuple of namespace labels matching ``text_lower``/``path_lower``."""
"""Return the sorted tuple of namespace labels matching ``text_lower``/``path_lower``.
Falsifies if: a keyword present in the text or path does not yield the expected namespace.
falsifies_if: a keyword present in the text or path does not yield the expected namespace.
"""

"reverse_math",
"surreal",
"transfinite",
"fraction(",

🚩 Namespace over-classification via fraction( keyword

The math_popperian namespace includes "fraction(" as a keyword (tools/generate_hashed_taxonomy.py:95). Since .cursorrules and .windsurfrules mandate Fraction everywhere instead of float, virtually every Python file in the repo will contain Fraction( which, after .lower(), matches "fraction(". This explains the very high math_popperian count of 4825 in the committed audits/gap_analysis_20260420.json:19. The namespace classification is technically correct but extremely broad — nearly every Python file gets tagged as math_popperian, reducing the signal value of that namespace label.
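The over-match is mechanical: any `Fraction(...)` call survives the `.lower()` normalisation as `fraction(`, so a repo that mandates `Fraction` matches everywhere. A one-line demonstration:

```python
# Every policy-compliant Fraction(...) call matches the keyword
# after lowercasing, which is why the math_popperian bucket is so broad.
line = "value = Fraction(1, 3)  # exact arithmetic per repo policy"
assert "fraction(" in line.lower()
```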


# Matches both variable/parameter annotations (``x: float``) and return-type
# annotations (``def f() -> float``) so the rule "no float anywhere" has full
# coverage rather than only catching the argument side.
_RE_FLOAT_ANNOT = re.compile(r"(?::|->)\s*float\b")

📝 Info: _RE_FLOAT_ANNOT regex doesn't catch generic float annotations like List[float]

The regex (?::|->)\s*float\b at tools/generate_hashed_taxonomy.py:166 only catches annotations where float immediately follows : or -> (modulo whitespace). Annotations like x: List[float], x: Optional[float], or x: dict[str, float] would not be flagged. The comment on lines 163-165 documents the intent as covering "variable/parameter annotations and return-type annotations", so this gap appears to be a known scope limitation rather than an oversight. If the goal is truly "no float anywhere" enforcement, this would need a broader pattern like \bfloat\b with context-aware filtering.


Comment on lines +468 to +502
out_path.parent.mkdir(parents=True, exist_ok=True)
ordered_hashes = sorted(e["entry_sha256"] for e in entries)
commit_payload = {
"summary": summary,
"ordered_entry_hashes": ordered_hashes,
}
audit_sha256 = _sha256_text(_canonical_json(commit_payload))
try:
jsonl_label = jsonl_path.resolve().relative_to(REPO_ROOT).as_posix()
except ValueError:
jsonl_label = jsonl_path.name

# Percent of entries classified — as Fraction, rendered as "num/den".
total = max(len(entries), 1)
classified = sum(1 for e in entries if e["namespaces"] != ["unclassified"])
frac_classified = Fraction(classified, total)
# ``metadata`` collects fields that are *intentionally* not covered by
# ``audit_sha256`` (timestamps, output paths). Separating them makes the
# commitment boundary visible to reviewers: only ``summary`` +
# ``ordered_entry_hashes`` participate in the commit payload.
doc = {
"schema": "OE-GAP-ANALYSIS-1.0",
"summary": summary,
"classified_fraction": f"{frac_classified.numerator}/{frac_classified.denominator}",
"audit_sha256": audit_sha256,
"entry_count": len(entries),
"metadata": {
"generated_at_utc": datetime.now(tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
"jsonl_path": jsonl_label,
"not_covered_by_audit_sha256": ["generated_at_utc", "jsonl_path"],
},
}
with out_path.open("w", encoding="utf-8") as fh:
fh.write(json.dumps(doc, sort_keys=True, indent=2, ensure_ascii=True) + "\n")
return audit_sha256

📝 Info: Determinism invariant is well-designed with clear commitment boundary

The audit_sha256 commitment at tools/generate_hashed_taxonomy.py:470-474 correctly separates deterministic content (summary + ordered entry hashes) from non-deterministic metadata (timestamps, output paths). The metadata is explicitly documented via not_covered_by_audit_sha256 in the output JSON. The _iter_candidate_files uses sorted(root.rglob("*")) for deterministic ordering, and _write_jsonl sorts by a 4-tuple including entry_sha256 as a tiebreaker. This design means two runs over an identical working tree will produce identical audit_sha256 values, satisfying the tool's core invariant.


@devin-ai-integration
Devin is currently unreachable - the session may have died.

3 similar comments

@devin-ai-integration

SUPERSEDED by PR #151 — This PR's changes are included in the consolidated merge PR #151 (CI 31/31 green, ready to merge). Close this PR after #151 is merged.

@aidoruao aidoruao merged commit 1ee654a into main Apr 20, 2026
32 checks passed