
saige

Super AI Graph Ecosystem
A unified Go SDK for streaming AI agents, knowledge graphs, and RAG pipelines.

Install · Report Bug · Go Docs


Showcase

Basic agent demo

Features

  • Streaming-first agent loop with 15 typed delta events and parallel tool execution
  • Conversation tree with branching, checkpoints, rewind, and RLHF feedback
  • Sub-agent delegation — child agents as tools, deltas forwarded with attribution
  • Human-in-the-loop markers — gate tool execution pending approval
  • Knowledge graph construction — LLM-powered entity extraction, fuzzy dedup, temporal tracking
  • Multi-retriever RAG — vector + BM25 + graph retrieval fused via Reciprocal Rank Fusion
  • Reranking — MMR diversity and cross-encoder scoring built in
  • 4 LLM providers (Ollama, OpenAI, Anthropic, Google) behind one Provider interface
  • Provider resilience — retry + fallback composition out of the box
  • Structured output — constrain LLM responses to JSON schema

Why one SDK?

Agent orchestration, knowledge graphs, and RAG pipelines are deeply interconnected — RAG benefits from graph retrieval, agents need both for grounded responses, and all three share providers and embedders. saige unifies them under shared Provider, Embedder, and Tool interfaces, eliminating the wiring complexity of combining separate libraries.

Quick Start

go get github.com/urmzd/saige

Build an Agent

import (
    "github.com/urmzd/saige/agent"
    "github.com/urmzd/saige/agent/types"
    "github.com/urmzd/saige/agent/provider/ollama"
)

client := ollama.NewClient("http://localhost:11434", "qwen2.5", "nomic-embed-text")
a := agent.NewAgent(agent.AgentConfig{
    Name:         "assistant",
    SystemPrompt: "You are a helpful assistant.",
    Provider:     ollama.NewAdapter(client),
    Tools:        types.NewToolRegistry(myTool),
})

stream := a.Invoke(ctx, []types.Message{types.NewUserMessage("Hello!")})
for delta := range stream.Deltas() {
    switch d := delta.(type) {
    case types.TextContentDelta:
        fmt.Print(d.Content)
    }
}

Build a Knowledge Graph

import (
    "github.com/urmzd/saige/knowledge"
    "github.com/urmzd/saige/knowledge/types"
    "github.com/urmzd/saige/agent/provider/ollama"
)

client := ollama.NewClient("http://localhost:11434", "qwen2.5", "nomic-embed-text")
graph, _ := knowledge.NewGraph(ctx,
    knowledge.WithSurrealDB("ws://localhost:8000", "default", "knowledge", "root", "root"),
    knowledge.WithExtractor(knowledge.NewOllamaExtractor(client)),
    knowledge.WithEmbedder(knowledge.NewOllamaEmbedder(client)),
)
defer graph.Close(ctx)

graph.IngestEpisode(ctx, &types.EpisodeInput{
    Name: "meeting-notes",
    Body: "Alice presented the Q4 roadmap. Bob raised concerns about the timeline.",
})

results, _ := graph.SearchFacts(ctx, "Who presented the roadmap?")

Build a RAG Pipeline

import (
    "github.com/urmzd/saige/rag"
    "github.com/urmzd/saige/rag/types"
    "github.com/urmzd/saige/rag/memstore"
)

pipe, _ := rag.NewPipeline(
    rag.WithStore(memstore.New()),
    rag.WithContentExtractor(myExtractor),
    rag.WithEmbedders(myEmbedderRegistry),
    rag.WithRecursiveChunker(512, 50),
    rag.WithBM25(nil),
    rag.WithMMR(0.7),
)
defer pipe.Close(ctx)

pipe.Ingest(ctx, &types.RawDocument{
    SourceURI: "https://example.com/paper.pdf",
    Data:      pdfBytes,
})

result, _ := pipe.Search(ctx, "attention mechanism", types.WithLimit(5))
fmt.Println(result.AssembledContext.Prompt) // context with citations

agent — AI Agent Framework

Streaming-first agent loop with parallel tool execution, sub-agent delegation, human-in-the-loop markers, conversation tree persistence, and multi-provider resilience.

Provider Interface

Implement one method to integrate any LLM backend:

type Provider interface {
    ChatStream(ctx context.Context, messages []Message, tools []ToolDef) (<-chan Delta, error)
}

Built-in providers:

| Provider  | Package                  | Structured Output | Content Negotiation       | Embedder |
| --------- | ------------------------ | ----------------- | ------------------------- | -------- |
| Ollama    | agent/provider/ollama    | yes               | JPEG, PNG                 | yes      |
| OpenAI    | agent/provider/openai    | yes               | JPEG, PNG, GIF, WebP, PDF | yes      |
| Anthropic | agent/provider/anthropic | yes               | JPEG, PNG, GIF, WebP, PDF | no       |
| Google    | agent/provider/google    | yes               | JPEG, PNG, GIF, WebP, PDF | yes      |

Messages

Three roles. Tool results are content blocks, not a separate role.

| Type             | Role      | Content Types                                              |
| ---------------- | --------- | ---------------------------------------------------------- |
| SystemMessage    | system    | TextContent, ToolResultContent, ConfigContent              |
| UserMessage      | user      | TextContent, ToolResultContent, ConfigContent, FileContent |
| AssistantMessage | assistant | TextContent, ToolUseContent                                |

Deltas

15 concrete types across six categories — LLM-side, execution-side, marker, feedback, metadata, and terminal:

| Type                  | Category  | Purpose                             |
| --------------------- | --------- | ----------------------------------- |
| TextStartDelta        | LLM       | Text block opened                   |
| TextContentDelta      | LLM       | Text chunk                          |
| TextEndDelta          | LLM       | Text block closed                   |
| ToolCallStartDelta    | LLM       | Tool call generation started        |
| ToolCallArgumentDelta | LLM       | JSON argument chunk                 |
| ToolCallEndDelta      | LLM       | Tool call complete                  |
| ToolExecStartDelta    | Execution | Tool began executing                |
| ToolExecDelta         | Execution | Streaming delta from tool/sub-agent |
| ToolExecEndDelta      | Execution | Tool finished                       |
| MarkerDelta           | Marker    | Tool gated pending approval         |
| FeedbackDelta         | Feedback  | RLHF rating recorded on a node      |
| UsageDelta            | Metadata  | Token usage + wall-clock timing     |
| ErrorDelta            | Terminal  | Provider or tool error              |
| DoneDelta             | Terminal  | Stream complete                     |

Tools

tool := &types.ToolFunc{
    Def: types.ToolDef{
        Name:        "greet",
        Description: "Greet a person",
        Parameters: types.ParameterSchema{
            Type:     "object",
            Required: []string{"name"},
            Properties: map[string]types.PropertyDef{
                "name": {Type: "string", Description: "Person's name"},
            },
        },
    },
    Fn: func(ctx context.Context, args map[string]any) (string, error) {
        return fmt.Sprintf("Hello, %s!", args["name"]), nil
    },
}

When the LLM requests multiple tool calls, all tools execute concurrently.
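The concurrent dispatch can be sketched with plain goroutines. This is an illustrative fan-out pattern, not the SDK's internal code:

```go
package main

import (
	"fmt"
	"sync"
)

// runAll executes every tool call in its own goroutine and collects
// results by index, so output order matches request order regardless
// of completion order.
func runAll(calls []func() string) []string {
	results := make([]string, len(calls))
	var wg sync.WaitGroup
	for i, call := range calls {
		wg.Add(1)
		go func(i int, call func() string) {
			defer wg.Done()
			results[i] = call()
		}(i, call)
	}
	wg.Wait()
	return results
}

func main() {
	out := runAll([]func() string{
		func() string { return "weather: sunny" },
		func() string { return "time: 12:00" },
	})
	fmt.Println(out) // [weather: sunny time: 12:00]
}
```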

Sub-Agents

Sub-agents are registered as tools and execute within parallel tool dispatch. Their deltas are forwarded through the parent's stream:

a := agent.NewAgent(agent.AgentConfig{
    Provider: adapter,
    SubAgents: []agent.SubAgentDef{
        {
            Name:         "researcher",
            Description:  "Searches the web for information",
            SystemPrompt: "You are a research assistant.",
            Provider:     adapter,
            Tools:        types.NewToolRegistry(searchTool),
        },
    },
})

Markers (Human-in-the-Loop)

Gate tool execution pending consumer approval:

safeTool := types.WithMarkers(myTool,
    types.Marker{Kind: "human_approval", Message: "This modifies production data."},
)

// Consumer resolves (d is the MarkerDelta received from the stream):
stream.ResolveMarker(d.ToolCallID, approved, nil)

Structured Output

Constrain LLM responses to a JSON schema:

schema := types.SchemaFrom[MyResponse]()
a := agent.NewAgent(agent.AgentConfig{
    Provider:       adapter,
    ResponseSchema: schema,
})

Provider Resilience

import (
    "github.com/urmzd/saige/agent/provider/retry"
    "github.com/urmzd/saige/agent/provider/fallback"
)

provider := fallback.New(
    retry.New(primary, retry.DefaultConfig()),
    retry.New(backup, retry.DefaultConfig()),
)

Compaction

Data-driven context management:

| Strategy             | Behavior                                  |
| -------------------- | ----------------------------------------- |
| CompactNone          | No compaction                             |
| CompactSlidingWindow | Keep system prompt + last N messages      |
| CompactSummarize     | Summarize older messages via the provider |
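The sliding-window strategy reduces to a few lines; this sketch uses a stand-in message type and is illustrative, not the SDK's implementation:

```go
package main

import "fmt"

// Msg is a stand-in for the SDK's message type.
type Msg struct{ Role, Text string }

// slidingWindow keeps a leading system prompt plus the last n messages,
// dropping everything in between — the behavior CompactSlidingWindow
// describes.
func slidingWindow(msgs []Msg, n int) []Msg {
	if len(msgs) == 0 {
		return msgs
	}
	out := []Msg{}
	rest := msgs
	if msgs[0].Role == "system" {
		out = append(out, msgs[0])
		rest = msgs[1:]
	}
	if len(rest) > n {
		rest = rest[len(rest)-n:]
	}
	return append(out, rest...)
}

func main() {
	msgs := []Msg{{"system", "You are helpful."}, {"user", "q1"}, {"assistant", "a1"}, {"user", "q2"}}
	for _, m := range slidingWindow(msgs, 2) {
		fmt.Println(m.Role + ": " + m.Text)
	}
}
```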

Conversation Tree

Persistent branching conversation graph with checkpoints, rewind, and archive:

tr := a.Tree()
tr.Branch(nodeID, "experiment", msg)
tr.Checkpoint(branchID, "before-refactor")
tr.Rewind(checkpointID)

Feedback (RLHF)

Attach positive/negative ratings and comments to any node in the conversation tree. Feedback is stored as permanent leaf nodes branching off the target — never sent to the LLM, available for post-analysis and training.

// Rate an assistant response.
tip, _ := a.Tree().Tip(a.Tree().Active())
a.Feedback(tip.ID, types.RatingPositive, "Clear and helpful")
a.Feedback(tip.ID, types.RatingNegative, "Too verbose")

// Collect all feedback across the tree.
for _, entry := range a.FeedbackSummary() {
    fmt.Printf("node=%s rating=%d comment=%q\n",
        entry.TargetNodeID, entry.Rating, entry.Comment)
}

Feedback nodes have NodeFeedback state — they cannot have children added, forming dead-end branches that don't interfere with the conversation flow. During Replay, feedback emits FeedbackDelta for consumers that track ratings.

File Pipeline

Automatic URI resolution and content negotiation for multi-modal input:

a := agent.NewAgent(agent.AgentConfig{
    Provider: adapter,
    Resolvers: map[string]types.Resolver{
        "file": myFileResolver,
        "s3":   myS3Resolver,
    },
    Extractors: map[types.MediaType]types.Extractor{
        types.MediaPDF: myPDFExtractor,
    },
})

TUI

Two display modes for streaming agent progress:

import "github.com/urmzd/saige/agent/tui"

// Non-interactive (works in pipes/CI)
result := tui.StreamVerbose(header, stream.Deltas(), os.Stdout)

// Interactive (bubbletea)
model := tui.NewStreamModel(header, stream.Deltas())
tea.NewProgram(model).Run()

Testing

import "github.com/urmzd/saige/agent/agenttest"

provider := &agenttest.ScriptedProvider{
    Responses: [][]types.Delta{
        agenttest.ToolCallResponse("id-1", "greet", map[string]any{"name": "Alice"}),
        agenttest.TextResponse("Hello, Alice!"),
    },
}

knowledge — Knowledge Graph SDK

Build and query knowledge graphs with LLM-powered entity extraction, fuzzy deduplication, and hybrid search.

Graph Interface

type Graph interface {
    ApplyOntology(ctx, ontology) error
    IngestEpisode(ctx, episode) (*IngestResult, error)
    GetEntity(ctx, uuid) (*Entity, error)
    SearchFacts(ctx, query, opts...) (*SearchFactsResult, error)
    GetGraph(ctx) (*GraphData, error)
    GetNode(ctx, uuid, depth) (*NodeDetail, error)
    GetFactProvenance(ctx, factID) ([]Episode, error)
    Close(ctx) error
}

Core Types

| Type     | Purpose                                                  |
| -------- | -------------------------------------------------------- |
| Entity   | Node — UUID, Name, Type, Summary, Embedding              |
| Relation | Edge — Source/Target UUID, Type, Fact, ValidAt/InvalidAt |
| Fact     | Relation with resolved source/target entities            |
| Episode  | Text input with Name, Body, Source, GroupID, Metadata    |
| Ontology | Schema constraints — EntityTypes, RelationTypes          |

Hybrid Search

Combines vector similarity (HNSW) and full-text (BM25) via Reciprocal Rank Fusion:

results, _ := graph.SearchFacts(ctx, "Who works at Acme?",
    types.WithLimit(10),
    types.WithGroupID("project-alpha"),
)
for _, fact := range knowledge.FactsToStrings(results.Facts) {
    fmt.Println(fact) // "Alice -> Acme Corp: works at"
}

Deduplication

  • Exact match by (name, type) pair
  • Fuzzy match via Levenshtein distance (threshold 0.8)
  • Relation dedup by text similarity (threshold 0.92)
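Fuzzy matching boils down to normalized edit distance. A self-contained sketch of the 0.8-threshold check (how saige normalizes is an assumption here):

```go
package main

import "fmt"

func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// levenshtein returns the edit distance between a and b.
func levenshtein(a, b string) int {
	ar, br := []rune(a), []rune(b)
	prev := make([]int, len(br)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ar); i++ {
		cur := make([]int, len(br)+1)
		cur[0] = i
		for j := 1; j <= len(br); j++ {
			cost := 1
			if ar[i-1] == br[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(br)]
}

// similarity normalizes the distance to [0, 1]; names scoring at or
// above the 0.8 threshold would be treated as the same entity.
func similarity(a, b string) float64 {
	longest := len([]rune(a))
	if l := len([]rune(b)); l > longest {
		longest = l
	}
	if longest == 0 {
		return 1
	}
	return 1 - float64(levenshtein(a, b))/float64(longest)
}

func main() {
	fmt.Printf("%.2f\n", similarity("Acme Corp", "Acme Corp.")) // 0.90 — duplicate
}
```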

Graph Traversal

detail, _ := graph.GetNode(ctx, entityUUID, 2) // BFS to depth 2
sub := knowledge.Subgraph(detail)              // extract visualization data

SurrealDB Backend

Automatic schema provisioning with HNSW vector index (768D cosine), BM25 fulltext indexes, unique constraints, and temporal tracking.


rag — RAG Pipeline SDK

Multi-modal document ingestion with pluggable chunking, retrieval, reranking, and context assembly.

Data Model

Document (fingerprint for dedup, metadata, source URI)
  └── Section[] (ordered by index, optional heading)
        └── ContentVariant[] (text, image, table, audio — each with bytes, embedding, MIME)

Every ContentVariant has a .Text field that is always populated, enabling uniform search and entity extraction.

Pipeline Interface

type Pipeline interface {
    Ingest(ctx, raw) (*IngestResult, error)
    Search(ctx, query, opts...) (*SearchPipelineResult, error)
    Lookup(ctx, variantUUID) (*SearchHit, error)
    Update(ctx, documentUUID, raw) (*IngestResult, error)
    Delete(ctx, documentUUID) error
    Reconstruct(ctx, documentUUID) (*Document, error)
    Close(ctx) error
}

Chunking

| Strategy  | Description                                                      |
| --------- | ---------------------------------------------------------------- |
| Recursive | Tries separators (\n\n, \n, ". ", " ") with configurable overlap |
| Semantic  | Splits where embedding similarity drops below threshold          |

rag.WithRecursiveChunker(512, 50)       // maxSize, overlap
rag.WithSemanticChunker(0.1, 100, 1000) // threshold, minSize, maxSize
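Minus the separator cascade, recursive chunking reduces to fixed windows with trailing overlap. An illustrative sketch (not the library's code):

```go
package main

import "fmt"

// chunk splits text into pieces of at most maxSize runes, carrying
// overlap runes of trailing context into the next piece — the core of
// WithRecursiveChunker(maxSize, overlap) without separator handling.
func chunk(text string, maxSize, overlap int) []string {
	runes := []rune(text)
	step := maxSize - overlap
	if step <= 0 {
		step = maxSize // guard against overlap >= maxSize
	}
	var out []string
	for start := 0; start < len(runes); start += step {
		end := start + maxSize
		if end > len(runes) {
			end = len(runes)
		}
		out = append(out, string(runes[start:end]))
		if end == len(runes) {
			break
		}
	}
	return out
}

func main() {
	for _, c := range chunk("abcdefghij", 4, 2) {
		fmt.Println(c) // abcd, cdef, efgh, ghij
	}
}
```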

Retrieval

| Retriever | Description                                                                |
| --------- | -------------------------------------------------------------------------- |
| Vector    | Embed query, cosine similarity search                                      |
| BM25      | In-memory inverted index with configurable K1/B                            |
| Graph     | Knowledge graph facts resolved to document variants via episode provenance |
| Parent    | Wraps any retriever, expands hits to full parent section context           |

Multiple retrievers are combined via Reciprocal Rank Fusion.

rag.WithBM25(nil)          // default K1=1.2, B=0.75
rag.WithParentContext()    // expand to parent sections
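RRF scores each document by summing 1/(k + rank) over every list that contains it. The sketch below is self-contained and uses the conventional k = 60; whether saige uses the same constant is an assumption:

```go
package main

import (
	"fmt"
	"sort"
)

// rrf fuses several ranked lists of document IDs using Reciprocal Rank
// Fusion: score(d) = sum over lists of 1/(k + rank(d)), ranks 1-based.
func rrf(lists [][]string, k float64) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for rank, id := range list {
			scores[id] += 1 / (k + float64(rank+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return scores[ids[i]] > scores[ids[j]] })
	return ids
}

func main() {
	vector := []string{"a", "b", "c"} // vector retriever ranking
	bm25 := []string{"b", "c", "d"}   // BM25 ranking
	fmt.Println(rrf([][]string{vector, bm25}, 60)) // [b c a d]
}
```

Documents appearing in several lists accumulate score, so b and c outrank a even though a tops the vector list.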

Reranking

| Reranker      | Description                                                   |
| ------------- | ------------------------------------------------------------- |
| MMR           | Maximal Marginal Relevance — balances relevance and diversity |
| Cross-Encoder | Pair-wise scoring via custom Scorer interface                 |

rag.WithMMR(0.7)               // lambda=0.7
rag.WithCrossEncoder(myScorer) // custom scorer
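MMR greedily picks the candidate maximizing lambda*rel(query, d) minus (1-lambda)*max-similarity-to-already-selected. An illustrative, self-contained implementation (not the library's code):

```go
package main

import (
	"fmt"
	"math"
)

type doc struct {
	id  string
	vec []float64
}

func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// mmr selects k docs: lambda=1 is pure relevance, lambda=0 pure diversity.
func mmr(query []float64, docs []doc, lambda float64, k int) []string {
	var selected []doc
	remaining := append([]doc{}, docs...)
	for len(selected) < k && len(remaining) > 0 {
		best, bestScore := 0, math.Inf(-1)
		for i, d := range remaining {
			rel := cosine(query, d.vec)
			div := 0.0 // max similarity to anything already picked
			for _, s := range selected {
				if sim := cosine(d.vec, s.vec); sim > div {
					div = sim
				}
			}
			if score := lambda*rel - (1-lambda)*div; score > bestScore {
				best, bestScore = i, score
			}
		}
		selected = append(selected, remaining[best])
		remaining = append(remaining[:best], remaining[best+1:]...)
	}
	ids := make([]string, len(selected))
	for i, d := range selected {
		ids[i] = d.id
	}
	return ids
}

func main() {
	query := []float64{1, 0}
	docs := []doc{
		{"a", []float64{1, 0}},
		{"b", []float64{0.99, 0.1}}, // near-duplicate of a
		{"c", []float64{0.5, 0.5}},  // less relevant but diverse
	}
	fmt.Println(mmr(query, docs, 1.0, 2)) // pure relevance: [a b]
	fmt.Println(mmr(query, docs, 0.3, 2)) // diversity-weighted: [a c]
}
```

Lowering lambda trades the near-duplicate b for the more diverse c.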

Context Assembly

Built-in citation support:

// Default: numbered citations with source URIs
// Compressing: LLM-based extraction of relevant sentences
rag.WithCompression(myLLM)

Query Transformation

HyDE (Hypothetical Document Embeddings) — generates hypothetical documents via LLM for better retrieval:

rag.WithHyDE(myLLM, 3) // generate 3 hypothetical docs

Evaluation Metrics

import "github.com/urmzd/saige/rag/rageval"

precision := rageval.ContextPrecision(results, relevantUUIDs)
recall := rageval.ContextRecall(results, relevantUUIDs)
faithfulness, _ := rageval.Faithfulness(ctx, llm, query, answer, context)
relevancy, _ := rageval.AnswerRelevancy(ctx, embedder, query, answer)
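In their simplest set-overlap form, the first two metrics are the fraction of retrieved chunks that are relevant (precision) and the fraction of relevant chunks retrieved (recall). The library's definitions may additionally weight by rank; this is a sketch of the basic shape:

```go
package main

import "fmt"

// precisionRecall computes set-overlap context precision and recall
// over chunk IDs.
func precisionRecall(retrieved, relevant []string) (p, r float64) {
	rel := map[string]bool{}
	for _, id := range relevant {
		rel[id] = true
	}
	hits := 0
	for _, id := range retrieved {
		if rel[id] {
			hits++
		}
	}
	if len(retrieved) > 0 {
		p = float64(hits) / float64(len(retrieved))
	}
	if len(relevant) > 0 {
		r = float64(hits) / float64(len(relevant))
	}
	return p, r
}

func main() {
	p, r := precisionRecall([]string{"a", "b", "c", "d"}, []string{"a", "c", "e"})
	fmt.Printf("precision=%.2f recall=%.2f\n", p, r) // precision=0.50 recall=0.67
}
```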

Agent Tool Bindings

5 tools for integrating RAG into agent workflows:

import "github.com/urmzd/saige/rag/adktool"

tools := adktool.NewTools(pipeline)
// rag_search, rag_lookup, rag_update, rag_delete, rag_reconstruct

Examples

| Example         | Path                                 | Description                         |
| --------------- | ------------------------------------ | ----------------------------------- |
| Basic Agent     | examples/agent/basic/                | Single tool with Ollama             |
| Sub-agents      | examples/agent/subagents/            | Parent delegating to researcher     |
| Resilient       | examples/agent/resilient/            | Retry + fallback composition        |
| Streaming       | examples/agent/streaming/            | All delta types with ANSI output    |
| Multimodal      | examples/agent/multimodal/           | File pipeline with file:// resolver |
| TUI             | examples/agent/tui/                  | Interactive and verbose modes       |
| Runner          | examples/agent/runner/               | Multi-turn conversation loop        |
| Concurrent      | examples/agent/concurrent-subagents/ | Parallel sub-agent execution        |
| Knowledge Graph | examples/knowledge/basic/            | Build and query a knowledge graph   |
| RAG             | examples/rag/arxiv/                  | Full pipeline with arXiv papers     |

go run ./examples/agent/basic/
go run ./examples/knowledge/basic/
go run ./examples/rag/arxiv/

Agent Skill

npx skills add urmzd/saige

Architecture

graph TB
    subgraph agent["agent/ -- AI Agent Framework"]
        agenttypes["agent/types/<br/>Provider, Tool, Delta,<br/>Message, Node, WAL"]
        agentloop["agent/<br/>Agent loop, streaming,<br/>sub-agents"]
        providers["agent/provider/<br/>ollama, openai,<br/>anthropic, google"]
        resilience["agent/provider/<br/>retry, fallback"]
        tree["agent/tree/<br/>Conversation graph"]
        tui["agent/tui/<br/>Terminal UI"]
        agenttest["agent/agenttest/<br/>Test utilities"]
    end

    subgraph kg["knowledge/ -- Knowledge Graph"]
        kgtypes["knowledge/types/<br/>Graph, Store, Extractor"]
        engine["knowledge/internal/engine/<br/>Extraction, dedup"]
        surrealdb["knowledge/surrealdb/<br/>SurrealDB backend"]
    end

    subgraph rag["rag/ -- RAG Pipeline"]
        ragtypes["rag/types/<br/>Pipeline, Store, Retriever"]
        pipeline["rag/internal/pipeline/<br/>Ingest, search, RRF"]
        retrievers["rag/vector, bm25,<br/>parent, graph retrievers"]
        rerankers["rag/reranker/<br/>MMR, cross-encoder"]
        chunkers["rag/chunker/<br/>Recursive, semantic"]
        adktool["rag/adktool/<br/>Agent tool bindings"]
    end

    agentloop --> agenttypes
    providers --> agenttypes
    resilience --> providers
    tree --> agenttypes
    tui --> agentloop

    engine --> kgtypes
    surrealdb --> kgtypes

    pipeline --> ragtypes
    retrievers --> ragtypes
    rerankers --> ragtypes
    chunkers --> ragtypes

    adktool --> ragtypes
    adktool -.->|integrates| agenttypes
    retrievers -.->|graphretriever| kgtypes
| Package              | Files                                                                                             | Purpose                                                  |
| -------------------- | ------------------------------------------------------------------------------------------------- | -------------------------------------------------------- |
| agent/               | agent.go, stream.go, subagent.go, aggregator.go, runner.go                                         | Agent loop, streaming, sub-agent delegation              |
| agent/types/         | message.go, delta.go, content.go, provider.go, tool.go, errors.go, marker.go, compactor.go, node.go | Sealed types, interfaces, error classification, feedback |
| agent/tree/          | tree.go, flatten.go, compact.go, diff.go                                                           | Branching conversation tree with feedback leaf nodes     |
| agent/provider/      | ollama/, openai/, anthropic/, google/, retry/, fallback/                                           | LLM adapters and resilience wrappers                     |
| agent/tui/           | stream.go, styles.go, runner.go                                                                    | Bubbletea + verbose streaming UI                         |
| agent/agenttest/     | agenttest.go                                                                                       | ScriptedProvider, MockTool, assertions                   |
| knowledge/           | config.go, query.go, ollama.go                                                                     | Knowledge graph public API                               |
| knowledge/types/     | types.go                                                                                           | Core knowledge graph types and interfaces                |
| knowledge/surrealdb/ | store.go, schema.go, records.go                                                                    | SurrealDB store implementation                           |
| knowledge/internal/  | engine/, extraction/, fuzzy/                                                                       | Engine orchestration, LLM extraction, dedup              |
| rag/                 | config.go, version.go                                                                              | RAG pipeline configuration                               |
| rag/types/           | types.go                                                                                           | Core RAG types and interfaces                            |
| rag/internal/        | pipeline/pipeline.go                                                                               | Pipeline engine (ingest, search, RRF)                    |
| rag/chunker/         | chunker.go, semantic.go                                                                            | Recursive and semantic chunking                          |
| rag/bm25retriever/   | retriever.go                                                                                       | In-memory BM25 lexical search                            |
| rag/vectorretriever/ | retriever.go                                                                                       | Vector similarity search                                 |
| rag/graphretriever/  | retriever.go                                                                                       | Knowledge graph retrieval                                |
| rag/parentretriever/ | retriever.go                                                                                       | Parent context expansion                                 |
| rag/reranker/        | mmr.go, crossencoder.go                                                                            | MMR + cross-encoder reranking                            |
| rag/hyde/            | transformer.go                                                                                     | HyDE query expansion                                     |
| rag/contextassembler/ | compressing.go                                                                                    | LLM-based context compression                            |
| rag/rageval/         | eval.go                                                                                            | Evaluation metrics                                       |
| rag/adktool/         | tools.go                                                                                           | Agent tool bindings                                      |
| rag/memstore/        | store.go                                                                                           | In-memory store for testing                              |

License

Apache 2.0 — see LICENSE.
