A tiny Go library for building research agents that stay fast and cheap on small-context models.
Most "ReAct" agents keep appending raw traces until the prompt overflows. Laconic instead compresses state at every step — either via a rolling scratchpad or a notebook of atomic facts — making it practical on 4k/8k models and low-cost backends.
Here's a research question answered by qwen3:4b (a 4-billion-parameter model) using the graph-reader strategy. Without research, the model cannot answer this — it responds "the 2024 Nobel Prize in Chemistry has not been announced." With the agent, it autonomously searches, extracts atomic facts, and synthesizes this:
Prompt: Who won the 2024 Nobel Prize in Chemistry, what specific contribution were they recognized for, and what institution or company are they affiliated with?
Answer: The 2024 Nobel Prize in Chemistry was awarded to David Baker (University of Washington, Howard Hughes Medical Institute), Demis Hassabis, and John M. Jumper (Google DeepMind). David Baker was recognized for computational protein design. Demis Hassabis and John Jumper were awarded for protein structure prediction using AlphaFold2.
The agent found all three laureates, their exact affiliations, and their distinct contributions — information entirely outside the model's training data — by exploring multiple search queries and accumulating verified facts in a structured notebook.
- Two built-in research strategies: Scratchpad (iterative search loop) and Graph Reader (graph-based web exploration).
- Model-agnostic: bring your own `LLMProvider` adapter (OpenAI, Ollama, Anthropic, etc.). Suggestion: use llmhub to integrate easily with any model.
- Swappable search providers (DuckDuckGo, Brave, Tavily) plus a custom `SearchProvider` interface.
- Optional `FetchProvider` for reading full web pages (used by Graph Reader).
- Dual-model support: use a stronger planner and a cheaper synthesizer/finalizer to save cost.
- Cost tracking: LLM and search costs accumulate automatically; `Result.Cost` reports total spend.
- Knowledge carry-over: `Result.Knowledge` captures the collected knowledge; pass it back via `WithKnowledge` to answer follow-up questions without re-searching.
- Minimal dependencies (stdlib only, no vendor SDKs).
- Pluggable strategy system — register your own with `WithStrategyFactory`.
```bash
go get github.com/smhanov/laconic
```

Implement an `LLMProvider` adapter around your favorite client.
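The adapter's body depends entirely on your backend. As a sketch, here is what one might look like for a local Ollama instance, using only the stdlib. Only the `Generate(ctx, systemPrompt, userPrompt) (LLMResponse, error)` signature and the `Text`/`Cost` fields come from laconic's documented interface; the endpoint and request shape are Ollama's `/api/generate` API, and the zero cost is an assumption for local models:

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"net/http"

	"github.com/smhanov/laconic"
)

// ollamaLLM is an illustrative LLMProvider adapter that calls a local
// Ollama instance over HTTP using only the standard library.
type ollamaLLM struct {
	model string
}

func (o *ollamaLLM) Generate(ctx context.Context, systemPrompt, userPrompt string) (laconic.LLMResponse, error) {
	// Build a non-streaming request for Ollama's /api/generate endpoint.
	payload, err := json.Marshal(map[string]any{
		"model":  o.model,
		"system": systemPrompt,
		"prompt": userPrompt,
		"stream": false,
	})
	if err != nil {
		return laconic.LLMResponse{}, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		"http://localhost:11434/api/generate", bytes.NewReader(payload))
	if err != nil {
		return laconic.LLMResponse{}, err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return laconic.LLMResponse{}, err
	}
	defer resp.Body.Close()
	var out struct {
		Response string `json:"response"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return laconic.LLMResponse{}, err
	}
	// A local model costs nothing; a hosted provider would compute a real Cost here.
	return laconic.LLMResponse{Text: out.Response, Cost: 0}, nil
}
```

With an adapter in hand, wire it up (`myLLM` below stands for any such adapter, e.g. `&ollamaLLM{model: "mistral"}`):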
```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/smhanov/laconic"
	"github.com/smhanov/laconic/search"
)

func main() {
	agent := laconic.New(
		laconic.WithPlannerModel(myLLM),
		laconic.WithSynthesizerModel(myLLM),
		laconic.WithSearchProvider(search.NewDuckDuckGo()),
		laconic.WithSearchCost(0.005), // optional: cost per search call
		laconic.WithMaxIterations(5),
	)

	result, err := agent.Answer(context.Background(), "Why is the sky blue?")
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(result.Answer)
	fmt.Printf("Total cost: $%.4f\n", result.Cost)
}
```

A minimal hardcoded example lives in `examples/basic/`. Run it with:
```bash
cd examples/basic
go run .
```

A fully functional CLI tool is available in `examples/research/`. It uses llmhub for provider-agnostic LLM access (OpenAI/Ollama/Anthropic/Gemini), supports multiple search providers, and works with both strategies.
```bash
# Create a prompt file
echo "Why is the sky blue?" > question.txt

# Run with your local Ollama instance (default)
go run ./examples/research/ -model mistral -prompt question.txt

# Point to a remote Ollama endpoint
go run ./examples/research/ -provider ollama -model llama3 -endpoint https://ollama.example.com -prompt question.txt

# Use an OpenAI-compatible endpoint (vLLM/Ollama/LocalAI/etc.)
go run ./examples/research/ \
  -provider openai \
  -api-key none \
  -endpoint https://vllm.example.com/v1 \
  -model default \
  -prompt question.txt

# Use the real OpenAI API
go run ./examples/research/ \
  -provider openai \
  -api-key $OPENAI_API_KEY \
  -model gpt-4o \
  -prompt question.txt

# Use the graph-reader strategy with Brave search
go run ./examples/research/ \
  -model llama3 \
  -prompt question.txt \
  -strategy graph-reader \
  -graph-max-steps 6 \
  -search brave \
  -brave-key YOUR_API_KEY

# Enable debug logging to see all LLM prompts and responses
go run ./examples/research/ -model mistral -prompt question.txt -debug

# Use template variables in prompt files
# If question.txt contains "Tell me about {{TOPIC}} in {{YEAR}}", you can fill
# the placeholders from the command line:
go run ./examples/research/ \
  -model mistral \
  -prompt question.txt \
  -var TOPIC=quantum_computing \
  -var YEAR=2025
```

Prompt files can contain `{{KEY}}` placeholders that are replaced at runtime using the `-var` flag. The flag is repeatable — pass one `-var KEY=VALUE` for each placeholder.
For example, given a prompt file `ticker.txt`:

```
Research the stock ticker {{TICKER}} and summarize recent news.
```

Run it with:

```bash
go run ./examples/research/ -model mistral -prompt ticker.txt -var TICKER=AAPL
```

The agent will receive the fully expanded prompt:

```
Research the stock ticker AAPL and summarize recent news.
```
This makes it easy to reuse the same prompt template for different inputs without editing the file each time.
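Under the hood this is plain string substitution. A minimal sketch of the mechanism (the CLI's actual implementation may differ in details such as validation of unused variables):

```go
package main

import (
	"fmt"
	"strings"
)

// expand replaces each {{KEY}} placeholder in the prompt with its value.
func expand(prompt string, vars map[string]string) string {
	for k, v := range vars {
		prompt = strings.ReplaceAll(prompt, "{{"+k+"}}", v)
	}
	return prompt
}

func main() {
	tmpl := "Research the stock ticker {{TICKER}} and summarize recent news."
	fmt.Println(expand(tmpl, map[string]string{"TICKER": "AAPL"}))
}
```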
CLI flags:
| Flag | Default | Description |
|---|---|---|
| `-provider` | `ollama` | LLM provider for llmhub (ollama, openai, anthropic, gemini) |
| `-backend` | | Alias for `-provider` (deprecated) |
| `-model` | (required) | Model name |
| `-endpoint` | | Optional provider endpoint/base URL override |
| `-api-key` | | API key for authenticated endpoints (e.g. OpenAI) |
| `-prompt` | (required) | Path to a text file containing the question |
| `-strategy` | `scratchpad` | Strategy: `scratchpad` or `graph-reader` |
| `-max-iterations` | `5` | Maximum search iterations (scratchpad) |
| `-graph-max-steps` | `8` | Maximum exploration steps (graph-reader) |
| `-search` | `duckduckgo` | Search provider: `duckduckgo` or `brave` |
| `-brave-key` | | Brave Search API key (required with `-search brave`) |
| `-debug` | `false` | Print all LLM prompts and responses |
| `-knowledge` | | Path to a file for reading/writing collected knowledge (enables follow-up questions) |
| `-var` | | Set a template variable: `-var KEY=VALUE` (repeatable). Replaces `{{KEY}}` in the prompt file |
Laconic ships with two built-in strategies. Both compress state to stay within small context windows, but they differ fundamentally in how they plan, search, and accumulate knowledge.
The scratchpad strategy runs a tight Planner → Search → Synthesizer → Finalizer loop.
How it works:
- A `Scratchpad` is initialized with the user's question. It holds four fields: `OriginalQuestion`, `Knowledge` (a free-text summary), `History` (a log of past searches), and `IterationCount`.
- Each iteration, the Planner LLM examines the scratchpad snapshot and emits one of two actions: `Action: Answer` when enough information has been gathered, or `Action: Search` + `Query: <query>` when more is needed.
- The search provider executes the query and returns a list of results (title, URL, snippet).
- The Synthesizer LLM receives the existing knowledge plus the new results and rewrites the `Knowledge` field as a concise, deduplicated summary. Raw search results are discarded — only the compressed summary survives.
- This repeats until the Planner chooses `Answer` or `maxIterations` is reached.
- The Finalizer LLM (which can be the same model or a separate, stronger one) turns the final knowledge state into a user-facing answer.
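The loop compresses to a few lines. The sketch below is illustrative, not laconic's actual implementation; the `plan`, `runSearch`, `synthesize`, and `finalize` stubs stand in for the LLM and search calls:

```go
package main

import (
	"context"
	"fmt"
)

// Scratchpad state; the field names follow the description above.
type scratchpad struct {
	OriginalQuestion string
	Knowledge        string   // rolling summary, rewritten each step rather than appended to
	History          []string // log of past queries
	IterationCount   int
}

// Stubs standing in for the Planner/Synthesizer/Finalizer LLMs and the
// search provider, so the sketch runs on its own.
func plan(ctx context.Context, pad scratchpad) (action, query string) {
	if pad.IterationCount > 0 {
		return "Answer", ""
	}
	return "Search", pad.OriginalQuestion
}
func runSearch(ctx context.Context, query string) []string { return []string{"snippet about " + query} }
func synthesize(ctx context.Context, old string, results []string) string {
	return fmt.Sprintf("%s[+%d results compressed]", old, len(results))
}
func finalize(ctx context.Context, pad scratchpad) string { return "answer based on: " + pad.Knowledge }

func main() {
	ctx, maxIterations := context.Background(), 5
	pad := scratchpad{OriginalQuestion: "Why is the sky blue?"}
	for pad.IterationCount < maxIterations {
		action, query := plan(ctx, pad)
		if action == "Answer" {
			if pad.Knowledge != "" {
				break // planner is satisfied and the answer is grounded
			}
			query = pad.OriginalQuestion // grounding: force a search instead
		}
		results := runSearch(ctx, query)
		pad.Knowledge = synthesize(ctx, pad.Knowledge, results) // overwrite; context stays flat
		pad.History = append(pad.History, query)
		pad.IterationCount++
	}
	fmt.Println(finalize(ctx, pad))
}
```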
Key properties:
- Flat context size. Because the Synthesizer overwrites the knowledge field each iteration, prompt size stays bounded regardless of how many searches are performed. This makes it ideal for 4k/8k context models.
- Grounding enforcement. The planner is instructed to never answer from internal knowledge alone — at least one search must succeed before an answer is produced. If the planner tries to answer with an empty knowledge section, the agent forces a search automatically.
- Simple mental model. The loop is linear: plan, search, compress, repeat. There is no branching or backtracking.
- Configurable iteration cap. Set via `WithMaxIterations(n)`. Default is 5. If the cap is hit without a planner "Answer" decision, a best-effort finalization is returned alongside an error.
When to choose scratchpad:
- Simple, single-focus questions ("What is the population of Tokyo?").
- Environments with very small context windows (4k–8k tokens).
- When you want minimal LLM calls and fast answers (typically 2–4 calls total).
- When you don't have a `FetchProvider` and only need search snippets.
- As a lightweight, cheap default for most use cases.
```go
agent := laconic.New(
	laconic.WithPlannerModel(myLLM),
	laconic.WithSynthesizerModel(myLLM),
	laconic.WithSearchProvider(search.NewDuckDuckGo()),
	laconic.WithMaxIterations(5),
	laconic.WithStrategyName("scratchpad"), // this is the default
)
```

The graph-reader strategy implements a graph-based exploration loop inspired by the GraphReader paper. Instead of a single rolling summary, it builds a notebook of atomic facts by traversing a dynamically constructed graph of search queries.
How it works:
- The Planner LLM creates a Rational Plan — a structured breakdown of the question into a multi-step strategy and a list of key elements (entities, concepts, names) that need to be resolved.
- From the plan, initial search queries ("nodes") are generated — typically 3–5 targeted queries covering the key elements.
- The agent processes nodes from a queue in breadth-first order. For each node:
  - The search provider executes the query.
  - The Extractor LLM reads the search snippets and pulls out atomic facts — single, self-contained truths that directly help answer the question. Each fact is tagged with its source URL.
  - If snippets are promising but incomplete, the Extractor can flag URLs for deep reading. If a `FetchProvider` is configured, the agent fetches full page content and extracts additional facts from it.
  - Facts are deduplicated before being added to the notebook (exact matches and substring containment are both caught).
- After processing each node, an Answer Check LLM evaluates whether the notebook contains enough facts to answer the question. If yes, exploration stops early.
- A Neighbor Selection LLM examines the current notebook and suggests new search queries based on what has been learned and what gaps remain. These are added to the queue (skipping already-visited queries).
- When exploration ends (either the answer check passes or `MaxSteps` is exhausted), the Finalizer LLM synthesizes a grounded answer from the notebook facts.
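In code form, the exploration loop looks roughly like this. It is an illustrative sketch, not the library's implementation; `planInitialQueries`, `runSearch`, `extractFacts`, `hasEnough`, and `suggestNeighbors` are stubs standing in for the LLM roles and the search provider:

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// An atomic fact with its provenance, mirroring the notebook described above.
type fact struct{ Text, SourceURL string }

// Stubs for the LLM roles and the search provider, so the sketch runs.
func planInitialQueries(ctx context.Context, q string) []string { return []string{q + " overview", q + " details"} }
func runSearch(ctx context.Context, query string) []string      { return []string{"snippet: " + query} }
func extractFacts(ctx context.Context, snippets []string) []fact {
	facts := make([]fact, 0, len(snippets))
	for _, s := range snippets {
		facts = append(facts, fact{Text: s, SourceURL: "https://example.com"})
	}
	return facts
}
func hasEnough(ctx context.Context, notebook []fact) bool            { return len(notebook) >= 2 }
func suggestNeighbors(ctx context.Context, notebook []fact) []string { return nil }

// appendDeduped drops exact matches and substring duplicates, as described above.
func appendDeduped(notebook, incoming []fact) []fact {
next:
	for _, f := range incoming {
		for _, e := range notebook {
			if e.Text == f.Text || strings.Contains(e.Text, f.Text) || strings.Contains(f.Text, e.Text) {
				continue next
			}
		}
		notebook = append(notebook, f)
	}
	return notebook
}

func main() {
	ctx, maxSteps := context.Background(), 8
	queue := planInitialQueries(ctx, "the question")
	visited := map[string]bool{}
	var notebook []fact

	for step := 0; step < maxSteps && len(queue) > 0; step++ {
		query := queue[0] // breadth-first: take from the front of the queue
		queue = queue[1:]
		if visited[query] {
			continue
		}
		visited[query] = true

		notebook = appendDeduped(notebook, extractFacts(ctx, runSearch(ctx, query)))
		if hasEnough(ctx, notebook) {
			break // early termination: the notebook already covers the question
		}
		queue = append(queue, suggestNeighbors(ctx, notebook)...)
	}
	fmt.Printf("finalizing over %d facts\n", len(notebook))
}
```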
Key properties:
- Structured fact accumulation. Knowledge is stored as a list of discrete atomic facts with source URLs, not a free-text blob. This prevents important details from being summarized away and makes the final answer more traceable.
- Dynamic exploration. The graph expands based on what the agent learns — neighbor queries are generated from newly discovered facts, allowing the agent to follow chains of reasoning that weren't predictable upfront.
- Deep reading. When a `FetchProvider` is configured, the agent can follow promising URLs and extract facts from full page content, not just search snippets. Ad and tracker URLs are automatically filtered.
- Early termination. The answer check runs after each node, so the agent stops as soon as it has enough information — it won't waste calls exploring further if the notebook already covers the question.
- Higher LLM cost. Each node requires multiple LLM calls (extraction, answer check, neighbor selection), so this strategy uses significantly more LLM calls than scratchpad. A typical run with `MaxSteps: 6` might make 15–25 LLM calls.
- Better for complex questions. Multi-entity, multi-hop, and causal reasoning questions benefit from the structured plan and fact-by-fact accumulation.
When to choose graph-reader:
- Multi-hop questions that require chaining facts ("Who founded the company that acquired the maker of the drug used to treat X?").
- Questions involving multiple entities that each need separate research.
- When answer quality matters more than speed or cost.
- When you have a `FetchProvider` and want the agent to read full web pages for deeper evidence.
- When working with models that have 16k+ context windows (the notebook of facts can grow larger than a scratchpad summary).
```go
agent := laconic.New(
	laconic.WithPlannerModel(myLLM),
	laconic.WithSynthesizerModel(myLLM),
	laconic.WithSearchProvider(search.NewBrave(apiKey)),
	laconic.WithFetchProvider(fetch.NewHTTP()),
	laconic.WithStrategyName("graph-reader"),
	laconic.WithGraphReaderConfig(laconic.GraphReaderConfig{MaxSteps: 8}),
)
```

The `GraphReaderConfig` struct also lets you assign different LLM providers to each role if desired:
```go
laconic.WithGraphReaderConfig(laconic.GraphReaderConfig{
	Planner:   strongModel, // generates the rational plan and initial queries
	Extractor: cheapModel,  // extracts atomic facts from search results / pages
	Neighbor:  cheapModel,  // suggests next queries to explore
	Finalizer: strongModel, // writes the final answer
	MaxSteps:  10,
})
```

| | Scratchpad | Graph Reader |
|---|---|---|
| State format | Free-text `Knowledge` summary | Notebook of atomic facts with source URLs |
| Exploration | Linear (one query at a time) | Graph-based (breadth-first with dynamic neighbors) |
| Context growth | Flat (summary is overwritten each step) | Grows with fact count (but stays structured) |
| LLM calls per run | ~2–4 (plan + synthesize + finalize) | ~15–25 (plan + extract × N + check × N + neighbors × N + finalize) |
| Deep page reading | No | Yes (via `FetchProvider`) |
| Early termination | Planner decides when to answer | Answer check evaluates notebook sufficiency |
| Best for | Simple factual questions, tight budgets | Multi-hop reasoning, complex research |
| Min context window | 4k tokens | 16k+ tokens recommended |
| Requires `FetchProvider` | No | No, but strongly recommended |
You can register your own strategy:
```go
agent := laconic.New(
	laconic.WithStrategyFactory("my-strategy", func(a *laconic.Agent) (laconic.Strategy, error) {
		return &myStrategy{}, nil
	}),
	laconic.WithStrategyName("my-strategy"),
)
```

A `Strategy` must implement `Name() string` and `Answer(ctx, question) (Result, error)`.
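A minimal skeleton might look like this (a sketch; it assumes the strategy returns the same `laconic.Result` struct documented below):

```go
// myStrategy is a stub Strategy; a real one would plan, search, and
// compress state before producing its Result.
type myStrategy struct{}

func (s *myStrategy) Name() string { return "my-strategy" }

func (s *myStrategy) Answer(ctx context.Context, question string) (laconic.Result, error) {
	return laconic.Result{Answer: "not yet researched: " + question}, nil
}
```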
- `LLMProvider` — your adapter for any language model. Single method: `Generate(ctx, systemPrompt, userPrompt) (LLMResponse, error)`. The `LLMResponse` struct carries both the generated `Text` and a `Cost` (in dollars) for the call.
- `SearchProvider` — plug in any search backend. Single method: `Search(ctx, query) ([]SearchResult, error)`.
- `FetchProvider` — optional URL fetcher for reading full web pages. Single method: `Fetch(ctx, url) (string, error)`.
- `Strategy` — pluggable research loop. Methods: `Name() string`, `Answer(ctx, question) (Result, error)`.
`Agent.Answer` returns a `Result` struct:
```go
type Result struct {
	Answer    string  // the final answer text
	Cost      float64 // total accumulated cost in dollars
	Knowledge string  // collected knowledge (scratchpad text or JSON notebook)
}
```

The `Knowledge` field captures the internal state accumulated during research:
- Scratchpad strategy: a free-text summary produced by the synthesizer.
- Graph Reader strategy: a JSON array of atomic facts (`[]graph.AtomicFact`).
You can pass this value back to a subsequent `Answer` call via `WithKnowledge` to support follow-up questions (see below).
After an initial research session, you can answer follow-up questions without losing the knowledge that was already gathered:
```go
// Initial research
result, err := agent.Answer(ctx, "What is the population of Tokyo?")

// Follow-up — prior knowledge is pre-loaded into the strategy's state
followUp, err := agent.Answer(ctx,
	"How does that compare to Osaka?",
	laconic.WithKnowledge(result.Knowledge),
)
```

When prior knowledge is supplied:
- Scratchpad pre-populates its `Knowledge` field, so the planner can see existing facts and decide whether to search for more.
- Graph Reader pre-populates its notebook with the atomic facts, so exploration starts from an informed state.
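To carry knowledge across processes (as the CLI's `-knowledge` flag does), you can simply persist the string and read it back. A sketch, with the file name and error handling illustrative (`os` and `log` imports assumed):

```go
// Save the collected knowledge after the first session…
if err := os.WriteFile("knowledge.txt", []byte(result.Knowledge), 0o644); err != nil {
	log.Fatal(err)
}

// …and load it back in a later run to skip redundant searches.
prior, err := os.ReadFile("knowledge.txt")
if err != nil {
	log.Fatal(err)
}
followUp, err := agent.Answer(ctx, "How does that compare to Osaka?",
	laconic.WithKnowledge(string(prior)))
```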
Create with `laconic.New(opts...)`, then call `agent.Answer(ctx, question, answerOpts...)`, which returns a `Result`.
| Option | Description |
|---|---|
| `WithPlannerModel(m)` | LLM used for routing/planning decisions |
| `WithSynthesizerModel(m)` | LLM used for compressing search results |
| `WithFinalizerModel(m)` | LLM used to produce the final answer (defaults to the synthesizer) |
| `WithSearchProvider(s)` | Search backend implementation |
| `WithFetchProvider(f)` | URL fetcher for full-page reading (optional) |
| `WithMaxIterations(n)` | Max loop iterations for the scratchpad strategy (default: 5) |
| `WithStrategyName(name)` | Select a strategy by name: "scratchpad" or "graph-reader" |
| `WithStrategy(s)` | Inject a custom `Strategy` instance directly |
| `WithStrategyFactory(name, fn)` | Register a custom strategy factory |
| `WithGraphReaderConfig(cfg)` | Configure the graph-reader strategy (MaxSteps, per-role LLMs) |
| `WithSearchCost(cost)` | Cost in dollars charged per search call (default: 0) |
| `WithDebug(bool)` | Log all LLM prompts and responses to stdout |
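For example, the dual-model setup from the feature list maps onto the three model options like this (`strongModel` and `cheapModel` are placeholders for your own `LLMProvider` implementations):

```go
agent := laconic.New(
	laconic.WithPlannerModel(strongModel),    // routing/planning decisions
	laconic.WithSynthesizerModel(cheapModel), // compressing search results
	laconic.WithFinalizerModel(strongModel),  // user-facing answer
	laconic.WithSearchProvider(search.NewDuckDuckGo()),
)
```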
These options are passed to individual `Answer` calls rather than to `New`:

| Option | Description |
|---|---|
| `WithKnowledge(k)` | Supply prior knowledge from a previous `Result.Knowledge` value |
| Provider | API key required | Notes |
|---|---|---|
| DuckDuckGo | No | Free; scrapes the lite HTML interface |
| Brave | Yes (`X-Subscription-Token`) | Fast, structured JSON API |
| Tavily | Yes | Supports basic and advanced depth modes |
```go
search.NewDuckDuckGo()
search.NewBrave("your-api-key")
search.NewTavily("your-api-key", "advanced")
```

Bring your own provider by implementing `SearchProvider`.
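For instance, a canned provider for offline testing might look like this. This is a sketch: the `laconic.SearchResult` type and its field names are assumptions based on the title/URL/snippet shape described earlier.

```go
// fixtureSearch returns canned results, handy for offline tests.
type fixtureSearch struct{}

func (fixtureSearch) Search(ctx context.Context, query string) ([]laconic.SearchResult, error) {
	return []laconic.SearchResult{{
		Title:   "Example result for " + query,
		URL:     "https://example.com",
		Snippet: "A canned snippet.",
	}}, nil
}
```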
- Scratchpad keeps `OriginalQuestion`, `Knowledge`, `History`, and `IterationCount` small and bounded.
- Graph Reader maintains a `Notebook` of `AtomicFact` entries and a queue of `Node` queries with visited-set tracking.
- Planner decides the next action with a compact prompt.
- Synthesizer / Extractor compresses raw search results into the state representation.
- Finalizer writes the user-facing answer from the accumulated knowledge.
- `<think>` block stripping — models like Qwen3 that emit `<think>...</think>` reasoning blocks are handled transparently.
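The stripping itself can be as simple as a regex pass. A sketch of the idea; laconic's actual implementation may differ:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// thinkRE matches <think>…</think> blocks; (?s) lets . match newlines.
var thinkRE = regexp.MustCompile(`(?s)<think>.*?</think>`)

func stripThink(s string) string {
	return strings.TrimSpace(thinkRE.ReplaceAllString(s, ""))
}

func main() {
	fmt.Println(stripThink("<think>chain of thought…</think>The sky is blue."))
}
```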
See detailed design in docs/architecture.md and prompt shapes in docs/prompts.md.
```bash
go test ./...
```

Tests use fully offline stubs; no API calls are made.
- Add caching layer to avoid duplicate searches.
- Extend planner to accept tool choices beyond search.
- Provide optional streaming-friendly interfaces.
MIT. See LICENSE.