Progressively start acting on structured output from an LLM as it streams.
jsontap lets you await fields and iterate array items as soon as they appear – without waiting for the full JSON to complete. Overlap model generation with execution: dispatch tool calls earlier, update interfaces sooner, and cut end-to-end latency.
Built on ijson, it gives you awaitable, path-oriented access to streaming JSON so you write sequential-looking code.
## Installation

```shell
pip install jsontap
```

Or with uv:

```shell
uv add jsontap
```

## Why jsontap?

Most agent frameworks handle structured JSON output in one of a few ways:

- **Buffer and parse** – wait for the full streamed response, then call `json.loads`. Simple, but tool execution sits idle for the entire generation.
- **One tool call at a time** – prompt the model to return a single tool call, execute it, then go back for the next. Clean, but fully sequential: you lose all parallelism.
- **Newline-delimited hacks** – instruct the model to output one JSON object per line and split on `\n`. Brittle, prompt-dependent, and broken by any formatting variation.

jsontap eliminates these workarounds with a clean async API built around ijson, a streaming JSON parser.

## A complete example

With jsontap, you write code that reads top to bottom, as if you were accessing a fully parsed dict – except each line resolves as the data arrives:
```python
# chat_completion() must be an async generator that yields chat completion chunks
response = jsontap(chat_completion())

reasoning = await response["reasoning"]
print(reasoning)

async for call in response["calls"]:
    tool = await call["tool"]
    args = await call["args"]
    asyncio.create_task(dispatch(tool, args))

summary = await response["summary"]
print(summary)
```

This looks like code you'd write against a fully parsed dict – but it isn't. Each `await` and `async for` resolves as the corresponding part of the JSON arrives in the stream: `reasoning` unblocks the moment that field is parsed, the `calls` loop starts iterating before the array is complete, and `summary` waits only as long as it needs to.
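The "each line resolves as data arrives" behavior can be sketched with stdlib asyncio alone: back each field with a future, and awaiters unblock the moment the field lands. `StubNode` and `fake_parser` below are hypothetical illustrations of the mechanic, not jsontap's implementation (jsontap's real nodes are richer than a bare future):

```python
import asyncio

class StubNode:
    """Hypothetical sketch: one asyncio.Future per field of a JSON object."""

    def __init__(self):
        self._fields = {}

    def __getitem__(self, key):
        # Lazily create an awaitable slot; awaiting it blocks until resolve().
        if key not in self._fields:
            self._fields[key] = asyncio.get_running_loop().create_future()
        return self._fields[key]

    def resolve(self, key, value):
        # Called by the "parser" side once the field's value has arrived.
        self[key].set_result(value)

async def fake_parser(node):
    # Simulate fields arriving over the stream, one at a time.
    await asyncio.sleep(0.01)
    node.resolve("reasoning", "compare the two flights")
    await asyncio.sleep(0.01)
    node.resolve("summary", "booked the cheaper one")

async def main():
    node = StubNode()
    parser_task = asyncio.create_task(fake_parser(node))
    first = await node["reasoning"]   # unblocks before "summary" even exists
    second = await node["summary"]
    await parser_task
    return [first, second]

print(asyncio.run(main()))  # → ['compare the two flights', 'booked the cheaper one']
```

The consumer code reads strictly top to bottom, yet the first `await` returns while the second field is still "in flight" – the same overlap jsontap exploits for tool dispatch.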
In practice, this means you can add streaming responsiveness to an agent without restructuring your code. If you already have logic that reads from a parsed JSON dict, the jsontap version looks almost identical.
jsontap supports several ways to consume nodes depending on what you need:
```python
# Await a scalar field
intent = await response["intent"]

# Await a nested value
city = await response["user"]["address"]["city"]

# Async-iterate array items as handles (access sub-fields per item)
async for item in response["results"]:
    score = await item["score"]
    print(score)

# Async-iterate the array as fully materialized values
async for result in response["results"].values():
    process(result)

# Await the full array at once
results = await response["results"]
```

Nodes are created lazily and can be subscribed to before their part of the JSON has been parsed. Multiple consumers can await or iterate the same node – each gets the full result.
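The array-iteration mode – items yielded as each slot is parsed, before the array closes – can be sketched with an `asyncio.Queue` and a sentinel. `stream_items` and `iter_array` are hypothetical stand-ins for what the parser side and `async for item in response["results"]` do conceptually, not jsontap internals:

```python
import asyncio

async def stream_items(queue):
    # Simulate array items arriving one at a time from a streaming parser.
    for item in ({"score": 1}, {"score": 2}, {"score": 3}):
        await asyncio.sleep(0.01)
        await queue.put(item)
    await queue.put(None)  # sentinel: the array has closed

async def iter_array(queue):
    # Yield each item as soon as its slot is parsed, stopping at the sentinel.
    while (item := await queue.get()) is not None:
        yield item

async def main():
    queue = asyncio.Queue()
    producer = asyncio.create_task(stream_items(queue))
    scores = []
    async for item in iter_array(queue):
        scores.append(item["score"])  # runs before later items even exist
    await producer
    return scores

print(asyncio.run(main()))  # → [1, 2, 3]
```

The loop body executes while later array items are still being produced – which is what lets an agent dispatch the first tool call before the model has finished emitting the rest.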
| Situation | Behavior |
|---|---|
| Malformed JSON or source exception | All pending awaits and `async for` loops receive the exception immediately |
| Key not present in parsed JSON | `KeyError` raised once parsing is complete |
| `.value` accessed before node resolves | `LookupError` |
| `for ... in node` before stream finishes | `RuntimeError` |
| `feed()` called after `finish()` | `RuntimeError` |
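The first row – every pending awaiter sees the failure at once – maps naturally onto future semantics. A minimal stdlib sketch of that fan-out (not jsontap code):

```python
import asyncio

async def main():
    fut = asyncio.get_running_loop().create_future()

    async def consumer():
        try:
            await fut
        except ValueError as exc:
            return str(exc)

    # Two consumers awaiting the same pending node.
    tasks = [asyncio.create_task(consumer()) for _ in range(2)]
    await asyncio.sleep(0.01)
    # Malformed JSON mid-stream: fail the node; every awaiter is woken with it.
    fut.set_exception(ValueError("malformed JSON"))
    return await asyncio.gather(*tasks)

print(asyncio.run(main()))  # → ['malformed JSON', 'malformed JSON']
```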
| Source type | Behavior | Returns |
|---|---|---|
| Async iterable | Background parsing task starts immediately | `AsyncJsonNode` |
| Sync iterable | Consumed eagerly, values resolved before access | `AsyncJsonNode` |
| `None` | Manual chunk feeding mode | `(root, feed, finish)` |
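The `(root, feed, finish)` calling shape of manual mode can be illustrated with a buffering stub. To be clear, `manual_tap` below is a hypothetical stand-in: real jsontap parses incrementally via ijson as chunks are fed, whereas this stub only parses at `finish()` – it shows the call pattern, not the behavior:

```python
import asyncio
import json

def manual_tap():
    """Hypothetical stand-in for manual feeding mode: (root, feed, finish)."""
    loop = asyncio.get_running_loop()
    root = loop.create_future()
    chunks = []

    def feed(chunk):
        if root.done():
            raise RuntimeError("feed() called after finish()")
        chunks.append(chunk)

    def finish():
        # The stub parses everything here; real jsontap resolves nodes as it goes.
        root.set_result(json.loads("".join(chunks)))

    return root, feed, finish

async def main():
    root, feed, finish = manual_tap()
    feed('{"intent": "boo')   # chunk boundaries can split tokens anywhere
    feed('k_flight"}')
    finish()
    return await root

print(asyncio.run(main()))  # → {'intent': 'book_flight'}
```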
| Pattern | Description |
|---|---|
| `node["key"]` | Get or create a child node |
| `await node` | Block until the value is resolved |
| `async for item in node` | Stream array item handles as each slot is parsed |
| `async for value in node.values()` | Stream fully materialized array values |
| `node.value` | Synchronous access to a resolved value |
| `for item in node` | Synchronous iteration over a completed array |
| `node.resolved` | `True` if the node's value has been parsed |
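The synchronous accessors behave like non-blocking views of a future. As a rough analogy only – `node.resolved` ≈ `fut.done()`, `node.value` ≈ `fut.result()` – with the caveat that jsontap raises `LookupError` for early access where a bare asyncio future raises `InvalidStateError`:

```python
import asyncio

async def main():
    fut = asyncio.get_running_loop().create_future()

    before = fut.done()                # False – nothing parsed yet
    try:
        fut.result()                   # like .value before resolution
        early_access_failed = False
    except asyncio.InvalidStateError:
        early_access_failed = True     # jsontap raises LookupError here instead

    fut.set_result(42)
    return before, early_access_failed, fut.done(), fut.result()

print(asyncio.run(main()))  # → (False, True, True, 42)
```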
