Problem 🎯
Data flows through the system as bare primitives (`dict`, `list[dict]`, `str`, `int`) long after crossing I/O boundaries where it could be parsed into meaningful domain types. Examples:
- GitHub issue data: `open_autonomous_issues()`, `open_issues()`, `post_issues()`, `issue_context()`, `_apply_verdict()` all pass `dict` / `list[dict]`. The actual shape (which keys exist, their types) is implicit, determined by the `--json` field lists passed to `gh` and by conventions in the scan/groom loops (see the sketch after this list).
- Agent step output: `step()` returns `dict`. The JSON schema is defined in prompts but invisible to the type system. Callers access `raw["findings"]`, `clustered["clusters"]`, `reviewed["ready"]`, etc. with no static checking.
- Project/scan config: `load_project()` returns `dict`, scan blocks are `dict`, and fields like `project["repo"]` and `scan["type"]` are untyped strings.
- Identifiers and paths: project IDs, scan types, issue numbers, branch names, and labels are all bare `str` / `int` with no domain wrapper to distinguish them or constrain their values.
- Run metadata: the `metadata` dict built in the scan/fix/groom finales is assembled ad hoc with string keys.
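
For concreteness, here is a minimal sketch of parsing at the first of these boundaries, using a `TypedDict` that mirrors a `--json` field list. The field names (`number`, `title`, `labels`) and the exact `gh` invocation are illustrative assumptions, and `open_issues()` stands in for whichever wrapper actually runs the query:

```python
# A sketch, not the implementation: parse gh output into a declared shape
# right at the I/O boundary. The field list below is assumed for illustration.
import json
import subprocess
from typing import TypedDict


class IssueLabel(TypedDict):
    name: str


class Issue(TypedDict):
    number: int
    title: str
    labels: list[IssueLabel]


def open_issues(repo: str) -> list[Issue]:
    proc = subprocess.run(
        ["gh", "issue", "list", "--repo", repo, "--state", "open",
         "--json", "number,title,labels"],
        capture_output=True, text=True, check=True,
    )
    # json.loads returns Any; the annotation is the shape claim. A TypedDict
    # adds no runtime validation, a tradeoff noted under "Definition of done".
    return json.loads(proc.stdout)
```

With that in place, `issue["titel"]` or treating `issue["number"]` as a `str` becomes a static error at every downstream call site instead of a latent `KeyError`.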
In practice this mostly works: the codebase is small and conventions are consistent. But it means:
- Key typos and shape mismatches are silent at type-check time
- You have to trace through I/O call sites to know what a value actually contains
- Functions that need different subsets of fields share the same `dict` type
- Domain rules (e.g. "a scan type is one of these known strings") live in runtime checks or conventions, not in the type system; one way to lift such a rule is sketched below
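
As one illustration of moving that last rule into the type system: a `Literal` alias checked once at the boundary. The scan type values below are placeholders, not the project's actual ones:

```python
# A minimal sketch assuming hypothetical scan types; the real values would
# come from whatever the scan config currently accepts by convention.
from typing import Literal, cast, get_args

ScanType = Literal["security", "lint", "deps"]  # placeholder values


def parse_scan_type(raw: str) -> ScanType:
    # One runtime check where the config enters the program...
    if raw not in get_args(ScanType):
        raise ValueError(f"unknown scan type: {raw!r}")
    # ...and a static guarantee everywhere after it: passing an unchecked
    # str where a ScanType is expected fails type checking.
    return cast(ScanType, raw)
```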
Definition of done ✅
- I/O boundaries (GitHub API responses, agent step JSON output, project config loading, run metadata) parse into explicit domain types on entry. You know this is fixed when `ty check` catches a misspelled key, missing field, or wrong type at any call site
- Internal code works with validated, meaningfully labeled data, not raw dicts and strings
- The approach is chosen deliberately between `TypedDict`s, dataclasses, Pydantic, `NewType`, enums, etc., based on the tradeoffs at each boundary: JSON passthrough vs validated parsing vs runtime construction vs lightweight semantic tagging (see the sketch below)
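
To make that tradeoff space concrete, a hedged sketch of how the candidates divide up; all names here are illustrative, and choosing the right mechanism per boundary is exactly the work of this task:

```python
from dataclasses import dataclass
from typing import NewType, TypedDict


# JSON passthrough (e.g. agent step output): a TypedDict claims a shape at
# zero runtime cost, but trusts the producer.
class StepOutput(TypedDict):
    findings: list[dict]


# Validated parsing / runtime construction (e.g. project config): a frozen
# dataclass (or a Pydantic model, if declarative validation is wanted) makes
# construction the point where invariants are enforced.
@dataclass(frozen=True)
class Project:
    repo: str
    scans: list[dict]  # could itself become a parsed type


# Lightweight semantic tagging (e.g. identifiers): NewType distinguishes the
# many bare str/int values with no runtime wrapper at all.
IssueNumber = NewType("IssueNumber", int)
BranchName = NewType("BranchName", str)
```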
Out of scope ❌
- Typing the `gh()` wrapper itself (it returns `CompletedProcess`, which is fine)
- Typing third-party library internals
- Changing the JSON schemas in agent prompts: this is about parsing what already exists, not redesigning the agent protocol