
🇬🇧 English · 🇷🇺 Русский

memorialiste

La mémorialiste visits your repository, reads what changed since its last visit, writes the missing chapters of your project's story, and leaves a merge request behind.

A one-shot CLI tool that keeps documentation up-to-date with source code changes. Each run computes a git diff since the last documentation update, calls an OpenAI-compatible LLM to rewrite the affected docs, and opens a Merge/Pull Request.

How it works (and why the manifest matters)

memorialiste does NOT regenerate docs from scratch every time. Instead it keeps each doc file in sync with a specific slice of the source tree, and only redoes the work when that slice has changed.

The link between docs and code lives in a single file: docs/.docstructure.yaml. It lists every doc file you want managed by the tool, and tells it which source paths each doc cares about:

docs:
  - path: docs/user/guide.md
    audience: end users
    covers: [cmd/, cliconfig/]
    description: User-facing CLI guide.

  - path: docs/architecture.md
    audience: developers
    covers: [context/, generate/, output/, platform/]
    description: "Internal architecture: packages, abstractions, data flow."

Each run, for every entry, memorialiste:

  1. Reads the generated_at watermark from the doc file's frontmatter (the commit SHA the doc was last regenerated against).
  2. Computes a git diff filtered to that entry's covers paths only.
  3. If the diff is empty, skips this doc entirely — no LLM call, no wasted tokens.
  4. Otherwise feeds the diff + the current doc body to the LLM, writes the refreshed body back with the bumped watermark.
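
Step 2 for the user-guide entry above is conceptually just a path-filtered git diff (abc1234 standing in for the stored watermark SHA):

git diff abc1234..HEAD -- cmd/ cliconfig/

If that prints nothing, step 3 skips the doc without spending a single token.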

This is why the manifest is non-negotiable: without per-doc covers, every doc would see every change, the LLM would burn tokens deciding what's relevant, and the user guide would get rewritten whenever you edit internal plumbing.

The audience field is also used to name the auto-created branch (docs/memorialiste-developers, docs/memorialiste-end-users, …) so MR lists stay readable.

Installation

docker pull idconstruct/memorialiste:latest

Pin a specific version for reproducibility:

docker pull idconstruct/memorialiste:v0.3.1

Usage

GitLab CI

update-docs:
  image: idconstruct/memorialiste:latest
  variables:
    MEMORIALISTE_AST_CONTEXT: "true"
  script:
    - memorialiste
      --provider-url "$OLLAMA_URL"
      --model "qwen3-coder:30b"
      --platform gitlab
      --platform-token "$GITLAB_TOKEN"
      --project-id "$CI_PROJECT_ID"
      --dry-run=false
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

GitHub Actions

- name: Update docs
  run: |
    docker run --rm --network=host \
      -v ${{ github.workspace }}:/repo \
      -e MEMORIALISTE_PLATFORM_TOKEN=${{ secrets.GITHUB_TOKEN }} \
      idconstruct/memorialiste:latest \
      --repo /repo \
      --provider-url "$OLLAMA_URL" \
      --model qwen3-coder:30b \
      --platform github \
      --project-id "${{ github.repository }}" \
      --dry-run=false \
      --ast-context

Local dry-run

docker run --rm --network=host --user $(id -u):$(id -g) \
  -v $(pwd):/repo \
  idconstruct/memorialiste:latest \
  --repo /repo \
  --provider-url http://localhost:11434 \
  --model qwen3-coder:30b \
  --ast-context

Using Claude, Gemini, GPT-4 and other models

memorialiste talks to LLMs exclusively via the OpenAI-compatible /v1/chat/completions API. There is no native Anthropic / Google / OpenAI SDK. To use any non-Ollama model, run an OpenAI-compatible proxy that translates requests to the target provider's native API. memorialiste itself needs zero changes — you point --provider-url at the proxy and adjust --model.

Self-hosted: LiteLLM

LiteLLM supports ~100 models (Claude, Gemini, Bedrock, Vertex AI, etc.) and runs as a Docker sidecar.

# docker-compose.yml
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    ports: ["4000:4000"]
    environment:
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      OPENAI_API_KEY:    ${OPENAI_API_KEY}
Then point memorialiste at the proxy:

memorialiste --provider-url http://litellm:4000 --model claude-3-5-sonnet-20241022

Self-hosted: one-api

one-api is an aggregator with a web UI that exposes the same OpenAI-compatible surface. Point --provider-url at its base URL.

Cloud: OpenRouter

OpenRouter routes to Claude, GPT-4, Gemini, and many others via a single OpenAI-compatible endpoint.

memorialiste \
  --provider-url https://openrouter.ai/api/v1 \
  --api-key "$OPENROUTER_API_KEY" \
  --model anthropic/claude-3.5-sonnet

The --api-key value is sent as Authorization: Bearer <key> to the provider. Combine with --model-params to tune temperature, top_p, etc.
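
A combined example (the sampling values here are illustrative, not recommendations):

memorialiste \
  --provider-url https://openrouter.ai/api/v1 \
  --api-key "$OPENROUTER_API_KEY" \
  --model anthropic/claude-3.5-sonnet \
  --model-params '{"temperature":0.2,"top_p":0.9}'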

CLI Flags & Environment Variables

All flags can be set via environment variables (uppercase snake_case with MEMORIALISTE_ prefix). Flags take precedence over env vars.
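
For example, the following two invocations are equivalent; if both forms are given, the flag wins:

MEMORIALISTE_MODEL=qwen3-coder:30b memorialiste
memorialiste --model qwen3-coder:30b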

| Flag | Env var | Default | Description |
|------|---------|---------|-------------|
| --provider-url | MEMORIALISTE_PROVIDER_URL | http://localhost:11434 | OpenAI-compatible base URL |
| --model | MEMORIALISTE_MODEL | qwen3-coder:30b | Model tag |
| --model-params | MEMORIALISTE_MODEL_PARAMS | "" | Extra model params JSON (e.g. {"temperature":0.2}) |
| --system-prompt | MEMORIALISTE_SYSTEM_PROMPT | built-in | System prompt literal OR @path/to/file |
| --prompt | MEMORIALISTE_PROMPT | "" | Additional user prompt appended after the diff |
| --language | MEMORIALISTE_LANGUAGE | english | Output language; substituted into the {language} placeholder |
| --api-key | MEMORIALISTE_API_KEY | "" | Bearer token for the LLM provider |
| --doc-structure | MEMORIALISTE_DOC_STRUCTURE | docs/.docstructure.yaml | Path to the doc structure manifest |
| --repo | MEMORIALISTE_REPO | . | Local git repository root |
| --token-budget | MEMORIALISTE_TOKEN_BUDGET | 12000 | Max diff tokens before summarisation kicks in |
| --dry-run | MEMORIALISTE_DRY_RUN | true | Write files locally; skip branch+commit+platform |
| --branch-prefix | MEMORIALISTE_BRANCH_PREFIX | docs/memorialiste- | Branch name prefix |
| --ast-context | MEMORIALISTE_AST_CONTEXT | false | Enable AST-enriched diff context via grep-ast |
| --code-search | MEMORIALISTE_CODE_SEARCH | false | Expose the AST search_code tool to the LLM (function calling) |
| --code-search-max-turns | MEMORIALISTE_CODE_SEARCH_MAX_TURNS | 10 | Max tool-call turns before aborting |
| --repo-meta | MEMORIALISTE_REPO_META | basic | Repo metadata level: basic or extended |
| --watermarks-file | MEMORIALISTE_WATERMARKS_FILE | "" | Sidecar YAML file storing generated_at watermarks; when empty, watermarks live in doc frontmatter |
| --llm-timeout | MEMORIALISTE_LLM_TIMEOUT | 5m | Per-request timeout for LLM provider HTTP calls (e.g. 5m, 30s, 10m30s); overridable per-doc via manifest llm_timeout |
| --platform-timeout | MEMORIALISTE_PLATFORM_TIMEOUT | 60s | Per-request timeout for GitLab/GitHub HTTP calls and git push |
| --ast-parse-timeout | MEMORIALISTE_AST_PARSE_TIMEOUT | 5s | Per-file timeout for Go AST parsing inside the search_code tool |
| --platform | MEMORIALISTE_PLATFORM | gitlab | gitlab or github |
| --platform-url | MEMORIALISTE_PLATFORM_URL | platform default | Base URL for self-hosted instances |
| --platform-token | MEMORIALISTE_PLATFORM_TOKEN | required (non-dry-run) | Platform access token |
| --project-id | MEMORIALISTE_PROJECT_ID | required (non-dry-run) | GitLab project ID or owner/repo |
| --base-branch | MEMORIALISTE_BASE_BRANCH | main | Target branch for the opened MR/PR |
| --version | | | Print version and exit |
| --help | | | Show grouped help |

Watermark Format

Every generated doc file carries YAML frontmatter:

---
generated_at: abc1234def5
---

# Your Doc Title
...

The tool reads generated_at to compute the diff since the last run. A file without frontmatter is treated as never generated (full-repo diff scoped to the entry's covers paths).
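
To inspect state by hand, assuming the manifest shown earlier: read the watermark from the file head, then preview the pending diff for that entry's covers paths:

head -3 docs/architecture.md
git diff abc1234def5..HEAD --stat -- context/ generate/ output/ platform/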

Sidecar watermarks (clean Markdown)

To keep generated Markdown free of frontmatter, set watermarks_file in the manifest (globally under defaults: or per-entry):

defaults:
  watermarks_file: .memorialiste-watermarks.yaml
docs:
  - path: docs/architecture.md
    covers: [internal/]

In sidecar mode each doc file is written verbatim, and each file's generated_at SHA is stored in the sidecar YAML:

- path: docs/architecture.md
  generated_at: abc1234def5

Migration between modes is bidirectional and lossless, and takes a single transition run: if a doc has frontmatter but the sidecar lacks a record for it, the frontmatter is used; if the doc has no frontmatter but the sidecar holds a record for it, that record is used. The next write places the watermark in the canonical location declared by the manifest.

Per-Doc Overrides

Entries (and an optional defaults: block) may override the following fields, which are otherwise taken from CLI flags or env vars:

model, model_params, language, system_prompt, prompt, ast_context, code_search, code_search_max_turns, repo_meta, token_budget, watermarks_file.

Precedence (lowest → highest): built-in kong defaults < manifest defaults block < manifest per-doc entry < env var (MEMORIALISTE_*) < explicit CLI flag.
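
A worked example, assuming the manifest's defaults block sets model: qwen3:14b and the per-doc entry sets model: qwen3-coder:30b:

MEMORIALISTE_MODEL=gpt-oss:120b memorialiste                     # env var beats both manifest levels
MEMORIALISTE_MODEL=gpt-oss:120b memorialiste --model qwen3:14b   # the explicit flag beats everything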

Repository Metadata

The LLM receives a compact metadata block prepended to its prompt so it can write accurate version numbers:

=== Repository metadata ===
Latest tag: v0.3.1
HEAD: 53ebb4b...
Short SHA: 53ebb4b
=== End metadata ===

--repo-meta=extended adds the remote URL (token-redacted), the current branch, and the last 5 tags with dates, which is useful for CHANGELOG / release-notes documents.

AST-Enriched Context

--ast-context runs every changed file through grep-ast's TreeContext renderer, so the model sees enclosing function signatures and surrounding code structure instead of raw +/- lines. This significantly improves output quality for code-heavy docs.

AST Code Search

--code-search exposes a search_code function-calling tool to the LLM. Mid-generation the model may ask for any Go declaration in the repo by regex name match; the tool returns the matched function, method, type, const, or var bodies with file paths and line ranges. Useful when the diff alone lacks context (e.g. a doc covering one package references symbols defined in another).

Bounded by --code-search-max-turns (default 10) and a per-file 5s parse timeout. The provider must implement OpenAI-style function calling and emit proper tool_calls (not stringified JSON in content). Verified working on local Ollama: qwen3:14b, qwen3.6:35b, gpt-oss:120b. Models that return finish_reason: stop with a JSON blob in content (e.g. qwen2.5-coder:7b, and sometimes qwen3-coder:30b with large contexts) do not follow the API correctly; memorialiste prints a "WARNING — the model did not call any tools" line and proceeds with diff-only output. If the provider rejects a tools-shaped request entirely, memorialiste fails fast with an actionable error suggesting --code-search=false.

Tip — when to combine with --ast-context: AST context already embeds the enclosing function/method around every changed line, so tool-capable models often skip search_code entirely when AST is on. Use --code-search ALONE when you want the model to pull in declarations referenced by the diff but defined far away from it; use both flags together for the most thorough context (the model picks what it needs).
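
Per that tip, a maximally thorough invocation enables both (a sketch; the model and URL assume a local Ollama from the earlier examples):

memorialiste \
  --provider-url http://localhost:11434 \
  --model qwen3:14b \
  --ast-context \
  --code-search \
  --code-search-max-turns 10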

Architecture Diagrams

The built-in system prompt encourages the LLM to emit Mermaid diagrams (```mermaid fenced blocks) when the diff touches architecture, data flow, or component relationships. GitLab and GitHub render Mermaid natively in Markdown previews. No rendering toolchain required.

Runtime Dependencies

The Docker image bundles:

| Tool | Version | Purpose |
|------|---------|---------|
| grep-ast | 0.5.0 | AST-enriched diff context (--ast-context) |
| tree-sitter | 0.20.4 | Required by grep-ast |
| tree-sitter-languages | 1.10.2 | Language grammars for grep-ast |

These are only invoked when --ast-context is enabled.

Examples

See examples/ for ready-to-run scenarios:

| Scenario | What it shows |
|----------|---------------|
| 01-user-guide | Plain end-user guide; built-in prompt; minimal config |
| 02-architecture | Developer-facing architecture overview with AST + Mermaid |
| 03-developer-onboarding | Custom system prompt for contributor onboarding |
| 04-ai-readable | Dense LLM-readable project context (think CLAUDE.md) |
| 05-russian-docs | --language russian (works for any language) |
| 06-changelog | CHANGELOG via --repo-meta=extended |
| 07-codesearch | --code-search; the model pulls Go declarations via function calling |
| ci-gitlab | Drop-in .gitlab-ci.yml |
| ci-github | Drop-in GitHub Actions workflow |

Every doc-scenario folder contains an executable run.sh that you can invoke locally against a running Ollama.
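
For example, to try the architecture scenario against a local Ollama on its default port:

cd examples/02-architecture
./run.sh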

Library Usage

memorialiste is also a Go library: use the manifest, context, generate, output, and platform packages directly. See the package godoc.

import (
    "context"
    "fmt"

    "github.com/inhuman/memorialiste/manifest"
    mctx "github.com/inhuman/memorialiste/context"
)

m, _ := manifest.Parse("docs/.docstructure.yaml")
dc, _ := mctx.Assemble(context.Background(), m.Docs[0], mctx.Options{
    RepoPath:    ".",
    ASTContext:  true,
    TokenBudget: 12000,
})
fmt.Println(dc.Diff)
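
To pull the library into your own module (path taken from the imports above):

go get github.com/inhuman/memorialiste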

Contributing

Bug reports, feature requests, and pull requests are welcome.

  • Bug? Use the bug report template — capture the exact CLI invocation, env vars, and log lines.
  • Feature idea? Use the feature request template — describe the scenario, not just the wish.
  • Want to send a PR? Read CONTRIBUTING.md first; it covers the dev loop, project conventions, commit-message style, and the pre-release local Docker smoke test that every release tag must pass.

Quick dev loop:

git clone https://github.com/inhuman/memorialiste.git
cd memorialiste
go test ./...
go vet ./...
docker build -t memorialiste:dev --build-arg VERSION=dev .
