Merged
Binary file modified .coverage
Binary file not shown.
5 changes: 3 additions & 2 deletions .github/workflows/lint.yml
@@ -37,10 +37,11 @@ jobs:
         uses: actions/setup-python@v6
         with:
           python-version: '3.13'
-      - name: Install uv
-        uses: astral-sh/setup-uv@v6
+      - name: Install the latest version of uv
+        uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8.0.0
         with:
           enable-cache: true
+          version: "latest"
       - name: Install dependencies
         run: |
           uv venv
2 changes: 1 addition & 1 deletion .github/workflows/pythonpackage.yml
@@ -28,7 +28,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: [pypy-3.10, pypy-3.11, '3.10', '3.11', '3.12', '3.13', '3.14', '3.14t', '3.15.0-alpha.3']
+        python-version: [pypy-3.10, pypy-3.11, '3.10', '3.11', '3.12', '3.13', '3.14', '3.14t', '3.15.0-alpha.7']
         os: [
           ubuntu-latest,
           windows-latest,
3 changes: 3 additions & 0 deletions .gitignore
@@ -135,3 +135,6 @@ rust/target/
 Cargo.lock
 *.rlib
 *.rmeta
+.codex
+.pi
+.cursor
105 changes: 105 additions & 0 deletions AGENTS.md
@@ -0,0 +1,105 @@
# Before starting work

- Run `lat search` to find sections relevant to your task. Read them to understand the design intent before writing code.
- Run `lat expand` on user prompts to expand any `[[refs]]` — this resolves section names to file locations and provides context.

# Post-task checklist (REQUIRED — do not skip)

After EVERY task, before responding to the user:

- [ ] Update `lat.md/` if you added or changed any functionality, architecture, tests, or behavior
- [ ] Run `lat check` — all wiki links and code refs must pass
- [ ] Do not skip these steps. Do not consider your task done until both are complete.

---

# What is lat.md?

This project uses [lat.md](https://www.npmjs.com/package/lat.md) to maintain a structured knowledge graph of its architecture, design decisions, and test specs in the `lat.md/` directory. It is a set of cross-linked markdown files that describe **what** this project does and **why** — the domain concepts, key design decisions, business logic, and test specifications. Use it to ground your work in the actual architecture rather than guessing.

# Commands

```bash
lat locate "Section Name" # find a section by name (exact, fuzzy)
lat refs "file#Section" # find what references a section
lat search "natural language" # semantic search across all sections
lat expand "user prompt text" # expand [[refs]] to resolved locations
lat check # validate all links and code refs
```

Run `lat --help` when in doubt about available commands or options.

If `lat search` fails because no API key is configured, explain to the user that semantic search requires a key provided via `LAT_LLM_KEY` (direct value), `LAT_LLM_KEY_FILE` (path to key file), or `LAT_LLM_KEY_HELPER` (command that prints the key). Supported key prefixes: `sk-...` (OpenAI) or `vck_...` (Vercel). If the user doesn't want to set it up, use `lat locate` for direct lookups instead.

# Syntax primer

- **Section ids**: `lat.md/path/to/file#Heading#SubHeading` — full form uses project-root-relative path (e.g. `lat.md/tests/search#RAG Replay Tests`). Short form uses bare file name when unique (e.g. `search#RAG Replay Tests`, `cli#search#Indexing`).
- **Wiki links**: `[[target]]` or `[[target|alias]]` — cross-references between sections. Can also reference source code: `[[src/foo.ts#myFunction]]`.
- **Source code links**: Wiki links in `lat.md/` files can reference functions, classes, constants, and methods in TypeScript/JavaScript/Python/Rust/Go/C files. Use the full path: `[[src/config.ts#getConfigDir]]`, `[[src/server.ts#App#listen]]` (class method), `[[lib/utils.py#parse_args]]`, `[[src/lib.rs#Greeter#greet]]` (Rust impl method), `[[src/app.go#Greeter#Greet]]` (Go method), `[[src/app.h#Greeter]]` (C struct). `lat check` validates these exist.
- **Code refs**: `// @lat: [[section-id]]` (JS/TS/Rust/Go/C) or `# @lat: [[section-id]]` (Python) — ties source code to concepts

# Test specs

Key tests can be described as sections in `lat.md/` files (e.g. `tests.md`). Add frontmatter to require that every leaf section is referenced by a `// @lat:` or `# @lat:` comment in test code:

```markdown
---
lat:
require-code-mention: true
---
# Tests

Authentication and authorization test specifications.

## User login

Verify credential validation and error handling for the login endpoint.

### Rejects expired tokens
Tokens past their expiry timestamp are rejected with 401, even if otherwise valid.

### Handles missing password
Login request without a password field returns 400 with a descriptive error.
```

Every section MUST have a description — at least one sentence explaining what the test verifies and why. Empty sections with just a heading are not acceptable. (This is a specific case of the general leading paragraph rule below.)

Each test in code should reference its spec with exactly one comment placed next to the relevant test — not at the top of the file:

```python
# @lat: [[tests#User login#Rejects expired tokens]]
def test_rejects_expired_tokens():
    ...

# @lat: [[tests#User login#Handles missing password]]
def test_handles_missing_password():
    ...
```

Do not duplicate refs. One `@lat:` comment per spec section, placed at the test that covers it. `lat check` will flag any spec section not covered by a code reference, and any code reference pointing to a nonexistent section.

# Section structure

Every section in `lat.md/` **must** have a leading paragraph — at least one sentence immediately after the heading, before any child headings or other block content. The first paragraph must be ≤250 characters (excluding `[[wiki link]]` content). This paragraph serves as the section's overview and is used in search results, command output, and RAG context — keeping it concise guarantees the section's essence is always captured.

```markdown
# Good Section

Brief overview of what this section documents and why it matters.

More detail can go in subsequent paragraphs, code blocks, or lists.

## Child heading

Details about this child topic.
```

```markdown
# Bad Section

## Child heading

Details about this child topic.
```

The second example is invalid because `Bad Section` has no leading paragraph. `lat check` validates this rule and reports errors for missing or overly long leading paragraphs.
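
The length rule above can be approximated locally before running `lat check`. The following is a minimal sketch, not how `lat check` is implemented: the helper names are hypothetical, and it assumes that "excluding `[[wiki link]]` content" means dropping each whole `[[...]]` span, brackets included, before counting characters.

```python
import re

# Assumption: the entire [[...]] span is excluded from the count.
WIKI_LINK = re.compile(r"\[\[.*?\]\]")
MAX_LEAD_CHARS = 250


def lead_paragraph_length(paragraph: str) -> int:
    """Return the paragraph length after removing every [[...]] span."""
    return len(WIKI_LINK.sub("", paragraph))


def lead_paragraph_ok(paragraph: str) -> bool:
    """True when the paragraph is non-empty and within the 250-character limit."""
    stripped = paragraph.strip()
    return bool(stripped) and lead_paragraph_length(stripped) <= MAX_LEAD_CHARS
```

`lat check` remains the authority on this rule; a helper like this is only useful as a quick pre-check while drafting sections.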
2 changes: 2 additions & 0 deletions json2xml/cli.py
@@ -60,6 +60,7 @@
 EMAIL = "mail@vinitkumar.me"


+# @lat: [[architecture#CLI entrypoint]]
 def create_parser() -> argparse.ArgumentParser:
     """Create and configure the argument parser."""
     parser = argparse.ArgumentParser(
@@ -228,6 +229,7 @@ def create_parser() -> argparse.ArgumentParser:
     return parser


+# @lat: [[behavior#Input readers]]
 def read_input(args: argparse.Namespace) -> dict[str, Any] | list[Any]:
     """
     Read JSON input from the specified source.
6 changes: 4 additions & 2 deletions json2xml/dicttoxml.py
@@ -72,7 +72,7 @@ def get_unique_id(element: str) -> str:
 ]


-def get_xml_type(val: ELEMENT) -> str:
+def get_xml_type(val: Any) -> str:
     """
     Get the XML type of a given value.

@@ -227,6 +227,7 @@ def get_xpath31_tag_name(val: Any) -> str:
     return "string"


+# @lat: [[behavior#XPath 3.1 format]]
 def convert_to_xpath31(obj: Any, parent_key: str | None = None) -> str:
     """
     Convert a Python object to XPath 3.1 json-to-xml format.
@@ -267,7 +268,7 @@ def convert_to_xpath31(obj: Any, parent_key: str | None = None) -> str:


 def convert(
-    obj: ELEMENT,
+    obj: Any,
     ids: Any,
     attr_type: bool,
     item_func: Callable[[str], str],
@@ -631,6 +632,7 @@ def convert_none(
     return f"<{key}{attr_string}></{key}>"


+# @lat: [[architecture#Conversion engine]]
 def dicttoxml(
     obj: ELEMENT,
     root: bool = True,
15 changes: 9 additions & 6 deletions json2xml/dicttoxml_fast.py
@@ -17,22 +17,24 @@
 from collections.abc import Callable
 from typing import Any

+RustStringTransform = Callable[[str], str]
+
 LOG = logging.getLogger("dicttoxml_fast")

 # Try to import the Rust implementation
 _USE_RUST = False
-_rust_dicttoxml = None
+_rust_dicttoxml: Callable[..., bytes] | None = None
+rust_escape_xml: RustStringTransform | None = None
+rust_wrap_cdata: RustStringTransform | None = None

 try:
-    from json2xml_rs import dicttoxml as _rust_dicttoxml # type: ignore[import-not-found] # pragma: no cover
-    from json2xml_rs import escape_xml_py as rust_escape_xml # type: ignore[import-not-found] # pragma: no cover
-    from json2xml_rs import wrap_cdata_py as rust_wrap_cdata # type: ignore[import-not-found] # pragma: no cover
+    from json2xml_rs import dicttoxml as _rust_dicttoxml # pragma: no cover
+    from json2xml_rs import escape_xml_py as rust_escape_xml # pragma: no cover
+    from json2xml_rs import wrap_cdata_py as rust_wrap_cdata # pragma: no cover
     _USE_RUST = True # pragma: no cover
     LOG.debug("Using Rust backend for dicttoxml") # pragma: no cover
 except ImportError: # pragma: no cover
     LOG.debug("Rust backend not available, using pure Python")
-    rust_escape_xml = None
-    rust_wrap_cdata = None

 # Import the pure Python implementation as fallback
 from json2xml import dicttoxml as _py_dicttoxml # noqa: E402
@@ -48,6 +50,7 @@ def get_backend() -> str:
     return "rust" if _USE_RUST else "python"


+# @lat: [[architecture#Backend selection]]
 def dicttoxml(
     obj: Any,
     root: bool = True,
3 changes: 3 additions & 0 deletions json2xml/json2xml.py
@@ -8,6 +8,7 @@
 from .utils import InvalidDataError


+# @lat: [[architecture#Core pipeline]]
 class Json2xml:
     """
     Wrapper class to convert the data to xml
@@ -34,6 +35,8 @@ def __init__(
         self.cdata = cdata
         self.list_headers = list_headers

+    # @lat: [[behavior#Conversion output]]
+    # @lat: [[behavior#Invalid XML payloads]]
     def to_xml(self) -> Any | None:
         """
         Convert to xml using dicttoxml.dicttoxml and then pretty print it.
3 changes: 2 additions & 1 deletion json2xml/utils.py
@@ -24,6 +24,7 @@ class StringReadError(Exception):
     pass


+# @lat: [[behavior#Input readers]]
 def readfromjson(filename: str) -> dict[str, str]:
     """Reads a JSON file and returns a dictionary."""
     try:
@@ -44,7 +45,7 @@ def readfromurl(url: str, params: dict[str, str] | None = None) -> dict[str, str
         raise URLReadError("URL is not returning correct response")


-def readfromstring(jsondata: str) -> dict[str, str]:
+def readfromstring(jsondata: object) -> dict[str, str]:
     """Loads JSON data from a string and returns a dictionary."""
     if not isinstance(jsondata, str):
         raise StringReadError("Input is not a proper JSON string")
18 changes: 18 additions & 0 deletions json2xml_rs.pyi
@@ -0,0 +1,18 @@
from typing import Any


def dicttoxml(
    obj: Any,
    root: bool = True,
    custom_root: str = "root",
    attr_type: bool = True,
    item_wrap: bool = True,
    cdata: bool = False,
    list_headers: bool = False,
) -> bytes: ...


def escape_xml_py(s: str) -> str: ...


def wrap_cdata_py(s: str) -> str: ...
27 changes: 27 additions & 0 deletions lat.md/architecture.md
@@ -0,0 +1,27 @@
# Architecture

This file documents the main execution paths that turn JSON input into XML output across the library, CLI, and optional Rust accelerator.

## Core pipeline

The standard pipeline reads JSON into Python objects, passes that data through [[json2xml/json2xml.py#Json2xml]], and delegates serialization to [[json2xml/dicttoxml.py#dicttoxml]].

Library callers usually construct [[json2xml/json2xml.py#Json2xml]] with a decoded `dict` or `list`. CLI callers reach the same conversion path through [[json2xml/cli.py#read_input]], which resolves the input source before creating the converter. Pretty output is produced by reparsing the generated XML so callers get indented text when requested.

## Conversion engine

The pure Python serializer recursively maps Python values to XML elements, attributes, and text while preserving the project-specific options around wrappers, list handling, and type metadata.

[[json2xml/dicttoxml.py#dicttoxml]] is the public serializer. It handles the XML declaration, root wrapper, namespace emission, XPath mode, and then routes nested values through helper functions such as [[json2xml/dicttoxml.py#convert]], [[json2xml/dicttoxml.py#convert_dict]], and [[json2xml/dicttoxml.py#convert_list]]. [[json2xml/dicttoxml.py#get_xml_type]] and [[json2xml/dicttoxml.py#convert]] accept broad caller input and classify unsupported values at runtime, so tests can probe failure paths without lying to the type checker. Invalid XML names are normalized by [[json2xml/dicttoxml.py#make_valid_xml_name]] instead of crashing immediately on user keys.

## Backend selection

The fast-path module prefers the Rust extension when it can preserve Python semantics, and falls back to the Python serializer for unsupported features.

[[json2xml/dicttoxml_fast.py#dicttoxml]] uses the Rust backend only when optional features such as `ids`, custom `item_func`, XML namespaces, XPath mode, or special `@` keys are not involved. A local stub for the optional `json2xml_rs` module keeps static analysis aligned with that fallback design, so type checking still passes when the extension is not installed. This keeps fast installs fast without letting the optimized path silently change behavior.
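
The fallback rule described above can be illustrated with a small, self-contained sketch. It is not the code in `json2xml/dicttoxml_fast.py`: `load_optional` and `pick_backend` are hypothetical names, and the check for special `@` keys is left out to keep the example short.

```python
from __future__ import annotations

import importlib
from collections.abc import Callable
from typing import Any


def load_optional(module_name: str, attr: str) -> Callable[..., Any] | None:
    """Return module.attr if the optional module imports, otherwise None."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None)


def pick_backend(
    fast_impl: Callable[..., Any] | None,
    *,
    ids: Any = None,
    item_func: Callable[[str], str] | None = None,
    xml_namespaces: dict[str, Any] | None = None,
    xpath_format: bool = False,
) -> str:
    """Choose "rust" only when the extension exists and no Python-only feature is requested."""
    needs_python = (
        bool(ids)
        or item_func is not None
        or bool(xml_namespaces)
        or xpath_format
    )
    return "rust" if fast_impl is not None and not needs_python else "python"
```

Keeping the decision in one pure function makes the fallback easy to test without the extension installed, which is the same property the typed stub provides for static analysis.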

## CLI entrypoint

The CLI is a thin adapter that parses options, resolves one input source, and forwards those options into the same converter used by the library API.

[[json2xml/cli.py#create_parser]] defines the user-facing flags. [[json2xml/cli.py#read_input]] enforces the source priority rules, and [[json2xml/cli.py#main]] constructs [[json2xml/json2xml.py#Json2xml]] so command-line use and library use stay aligned.
27 changes: 27 additions & 0 deletions lat.md/behavior.md
@@ -0,0 +1,27 @@
# Behavior

This file captures the observable conversion and input rules that matter more than the implementation details hiding underneath.

## Input readers

The input helpers convert files, strings, URLs, and stdin into Python data structures while surfacing source-specific errors to callers.

[[json2xml/utils.py#readfromjson]] wraps file and JSON decoding failures in `JSONReadError`. [[json2xml/utils.py#readfromstring]] accepts unknown caller input so invalid-type tests can call it honestly, then rejects non-string inputs and malformed JSON with `StringReadError`. [[json2xml/utils.py#readfromurl]] performs a GET request and raises `URLReadError` when the HTTP status is not `200`.
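
As an illustration of the string reader contract, here is a standalone sketch. The exception class is redefined locally so the example runs on its own, the function is named `read_from_string_sketch` to make clear it is not the library function, and the message used for malformed JSON is an assumption rather than the project's wording.

```python
import json
from typing import Any


class StringReadError(Exception):
    """Local stand-in for the project's exception of the same name."""


def read_from_string_sketch(jsondata: object) -> Any:
    """Decode a JSON string, rejecting non-string input and malformed JSON."""
    # Accepting `object` lets callers (and tests) pass wrong types without
    # fighting the type checker; the runtime check does the real rejection.
    if not isinstance(jsondata, str):
        raise StringReadError("Input is not a proper JSON string")
    try:
        return json.loads(jsondata)
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
        raise StringReadError("Input is not a proper JSON string") from exc
```

Both failure modes surface as the same exception type, so callers only need one `except` clause for string input.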

## Conversion output

Default output includes an XML declaration, wraps content in `all`, pretty prints the document, and annotates elements with their source type unless callers disable those features.

[[json2xml/json2xml.py#Json2xml#to_xml]] calls [[json2xml/dicttoxml.py#dicttoxml]] with the configured wrapper, root, `attr_type`, `item_wrap`, `cdata`, and `list_headers` options. When `item_wrap=False`, list values repeat the parent tag instead of creating `<item>` children. When `pretty=False`, the library returns the serializer bytes directly.
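
The effect of `item_wrap` on primitive list values can be sketched as follows. This is a simplified illustration with a hypothetical `render_list` helper, not the project's serializer: it ignores type attributes, nested values, and the other options listed above.

```python
from xml.sax.saxutils import escape


def render_list(tag: str, items: list, item_wrap: bool = True) -> str:
    """Render primitive list items either as <item> children or as repeated parent tags."""
    if item_wrap:
        children = "".join(f"<item>{escape(str(i))}</item>" for i in items)
        return f"<{tag}>{children}</{tag}>"
    # item_wrap=False: repeat the parent tag once per item.
    return "".join(f"<{tag}>{escape(str(i))}</{tag}>" for i in items)
```

With `item_wrap=True` the list stays under a single parent element; with `item_wrap=False` a consumer sees several sibling elements that share the parent's name.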

## XPath 3.1 format

XPath mode swaps the project-specific XML shape for the W3C `json-to-xml` mapping with typed element names and the XPath functions namespace.

When `xpath_format=True`, [[json2xml/dicttoxml.py#dicttoxml]] delegates payload conversion to [[json2xml/dicttoxml.py#convert_to_xpath31]] and emits the `http://www.w3.org/2005/xpath-functions` namespace on the root `map` or `array` element. Scalars become `string`, `number`, `boolean`, or `null` elements, and object keys move into `key` attributes.

## Invalid XML payloads

Pretty printing acts as a validation step, because the formatter reparses the generated XML before returning it.

[[json2xml/json2xml.py#Json2xml#to_xml]] uses `defusedxml.minidom.parseString` before `toprettyxml`. If the generated bytes are not well-formed XML, the converter raises `InvalidDataError` instead of returning broken pretty output.
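
The validation-by-reparse idea can be sketched with the standard library. Note the substitution: this example uses `xml.dom.minidom` instead of `defusedxml.minidom` so it runs without third-party packages, it defines a local `InvalidDataError` stand-in, and the assumption that parse failures surface as `ExpatError` applies to this sketch rather than to the project's code.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError


class InvalidDataError(Exception):
    """Local stand-in for the project's exception of the same name."""


def pretty_or_raise(xml_bytes: bytes) -> str:
    """Reparse serialized XML and pretty print it, rejecting malformed input."""
    try:
        return parseString(xml_bytes).toprettyxml()
    except ExpatError as exc:
        raise InvalidDataError("Generated XML is not well-formed") from exc
```

For untrusted input the hardened `defusedxml` parser named in the text is the safer choice; the stdlib parser is used here only to keep the example dependency-free.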
23 changes: 23 additions & 0 deletions lat.md/lat.md
@@ -0,0 +1,23 @@
# json2xml knowledge graph

This directory records the project's architecture, behavior rules, and anchored tests so code and intent stop drifting apart in the dark.

## Documentation map

Start here to find the major concepts and the code they describe.

- [[architecture]] describes the library, CLI, and backend-selection paths.
- [[behavior]] captures conversion rules, input handling, and XPath mode.
- [[tests]] anchors a first set of regression-critical tests to documented behavior.

## Semantic search setup

Semantic search depends on an LLM key, so local `lat search` is unavailable until the environment is configured.

Set one of `LAT_LLM_KEY`, `LAT_LLM_KEY_FILE`, or `LAT_LLM_KEY_HELPER` before relying on semantic search. Without that key, direct lookups such as `lat locate` and structural validation through `lat check` still work.

## Repository hygiene

Local agent and editor directories are treated as machine-specific workspace state, not project knowledge or source.

The repository ignores `.codex`, `.pi`, and `.cursor` so local agent tooling does not pollute diffs or become part of the documented code surface. Keep durable design notes in `lat.md/` instead of those scratch directories.
31 changes: 31 additions & 0 deletions lat.md/tests.md
@@ -0,0 +1,31 @@
---
lat:
require-code-mention: true
---
# Tests

This file defines a small, high-signal test slice that anchors the initial lat.md setup to behavior the project should keep stable.

## CLI input resolution

These tests lock down how the CLI chooses among competing input sources so callers get deterministic behavior instead of surprise precedence games.

### URL input takes priority

When both URL and string inputs are present, the CLI should read from the URL first so the documented source precedence remains stable.

### Dash argument reads stdin

When the positional input is `-`, the CLI should read stdin instead of trying to open a file literally named `-`.

## Conversion behavior

These tests pin the XML shapes that matter most for interoperability, especially the modes that intentionally diverge from the default serializer.

### XPath format adds functions namespace

XPath mode should emit the W3C XPath functions namespace and typed child elements so downstream consumers receive standards-shaped XML.

### Item-wrap false repeats parent tag

Disabling item wrapping should repeat the parent element name for primitive list items instead of producing nested `<item>` tags.