Releases: SomeOddCodeGuy/WilmerAI
v0.6 - Multi-user improvements, more memory and consistency improvements, and lots of bug fixes
v0.6 - March 2026
Major New Features
- ContextCompactor Workflow Node — New node type that summarizes conversation messages into two rolling summaries (Old + Oldest) using token-based windowing. Separate from the memory system; designed for recency-aware conversation compaction. Uses XML-style tags and is configurable via a settings file.
- Automatic Memory Condensation — Optional condensation layer for file-based memories. After enough new memories accumulate (configurable threshold), the oldest batch is LLM-summarized into a single condensed entry, reducing file bloat over long conversations.
- Per-Message Image Association — Major refactor replacing synthetic `{"role": "images"}` messages with a per-message `"images"` key. Images now stay associated with their originating message from ingestion through to LLM dispatch. Includes OpenAI multimodal content parsing on ingestion.
- Claude API Image Support — Full image support for the Claude handler. Supports base64, data URIs, and HTTP URLs. Uses PIL/Pillow for format detection (optional; falls back to JPEG). Images are placed before text, per Anthropic's recommendation.
- Per-User Encryption — When an API key is provided via `Authorization: Bearer`, files are stored in isolated per-key directories. Optional Fernet encryption (AES-128-CBC + HMAC-SHA256, PBKDF2 key derivation) can be enabled per user. Transparent plaintext-to-encrypted migration. Includes a re-keying script.
- Multi-User Support — A single WilmerAI instance can now serve multiple users via repeated `--User` flags. Full per-user isolation: per-user config reads, request-scoped user identification, per-user log directories, aggregated models/tags endpoints.
- WSGI Concurrency Limiting Middleware — New `--concurrency` (default: 1) and `--concurrency-timeout` (default: 900s) CLI flags on all entry points. Requests exceeding the limit queue until a slot opens or the timeout expires (503). Implemented at the WSGI layer so the semaphore is held across streaming responses.
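The concurrency limiter has to live at the WSGI layer because a wrapper around the view function would release its slot as soon as the view returns its generator, not when streaming actually finishes. A minimal sketch of the idea (class and variable names here are illustrative, not WilmerAI's actual code):

```python
import threading

class ConcurrencyLimitMiddleware:
    """Cap concurrent requests; queue excess callers until a slot frees or a timeout hits."""

    def __init__(self, app, limit=1, timeout=900):
        self.app = app
        self.semaphore = threading.BoundedSemaphore(limit)
        self.timeout = timeout

    def __call__(self, environ, start_response):
        # Block here until a slot opens; give up with a 503 after the timeout.
        if not self.semaphore.acquire(timeout=self.timeout):
            start_response("503 Service Unavailable",
                           [("Content-Type", "text/plain")])
            return [b"Concurrency limit reached; timed out waiting for a slot."]
        try:
            result = self.app(environ, start_response)
        except Exception:
            self.semaphore.release()
            raise
        # Wrap the response so the slot is released only when the server
        # closes the iterable, i.e. after the full stream has been sent.
        return _ClosingIterator(result, self.semaphore.release)


class _ClosingIterator:
    """WSGI servers call close() on the response iterable when done; hook that."""

    def __init__(self, iterable, on_close):
        self._iterable = iterable
        self._on_close = on_close

    def __iter__(self):
        return iter(self._iterable)

    def close(self):
        try:
            close = getattr(self._iterable, "close", None)
            if close is not None:
                close()
        finally:
            self._on_close()
```

Because the semaphore is released from `close()` rather than when the view returns, a long streaming response holds its slot for its entire duration, which is the behavior the release notes describe.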
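For the per-user encryption, the general pattern is to derive the Fernet key from the bearer token with PBKDF2 and to name each user's directory by a hash of the token so the raw key never appears on disk. A sketch using only stdlib primitives (function names, salt handling, and the iteration count are illustrative, not WilmerAI's actual implementation):

```python
import base64
import hashlib
import os

def derive_fernet_key(api_key: str, salt: bytes, iterations: int = 480_000) -> bytes:
    """PBKDF2-HMAC-SHA256 -> 32 raw bytes -> urlsafe base64, which is the
    key format that cryptography.fernet.Fernet expects."""
    raw = hashlib.pbkdf2_hmac("sha256", api_key.encode("utf-8"), salt,
                              iterations, dklen=32)
    return base64.urlsafe_b64encode(raw)

def user_directory(base_dir: str, api_key: str) -> str:
    """Isolated per-key directory, named by a hash so the token itself
    never shows up in a path on disk."""
    digest = hashlib.sha256(api_key.encode("utf-8")).hexdigest()[:32]
    return os.path.join(base_dir, digest)

# With the `cryptography` package installed, the derived key plugs straight in:
#   from cryptography.fernet import Fernet
#   f = Fernet(derive_fernet_key("my-api-key", salt=b"per-user-salt"))
#   ciphertext = f.encrypt(b"memory text")
```

Fernet internally provides the AES-128-CBC + HMAC-SHA256 construction the release notes mention; only the key derivation and directory isolation need to be supplied around it.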
Bug Fixes
- SillyTavern Streaming Hang — Fixed streaming hang when using SillyTavern as a front end.
- Open WebUI Streaming Error — Restored JSON heartbeat format (was changed to bare newline, causing JSONDecodeError in Open WebUI's NDJSON parser).
- Memory Generation Stalling — Fixed memory generation never triggering after the first run, due to an empty-message hash collision when the front end injects an Author's Note containing only a `[DiscussionId]` tag.
- GetCurrentMemoryFromFile Returning Wrong Data — Was sharing a code path with `GetCurrentSummaryFromFile` and returning the rolling chat summary instead of memory chunks. Now correctly returns memory chunks.
- Image Lookback Default Regression — Restored default lookback window from 5 back to 10 (was silently halved).
- Multi-Word Prefix Detection in Streaming — Fixed `StreamingResponseHandler` failing to strip multi-word response prefixes (e.g., "AI: ").
- Data URI Stripping Before LLM Dispatch — Hardened image key stripping to cover all image formats when `llm_takes_images` is False.
Hardening and Security
- Dependency Pinning — All dependencies pinned to exact versions (`==`) to mitigate supply-chain attacks. Updated several packages, including `requests` (CVE fix) and `cryptography` (reverted to 46.0.5, which predates the supply-chain-attack window).
- Thread Safety — Per-discussion locks in timestamp service, context compactor, and RAG tool. Thread-safe globals via `threading.local()`. Lock dictionaries capped at 500 with LRU eviction. Atomic file writes (temp + rename).
- Sensitive Logging / Prompt Redaction — New `sensitive_logging_utils` module. All log statements exposing user content converted to redactable versions. Redaction activates when encryption is enabled or `redactLogOutput: true` is set.
- JSON Parsing Hardening — Incoming API handlers now use `get_json(force=True, silent=True)`, returning 400 instead of an unhandled 500 on invalid JSON.
- Configurable Categorization Retries — Removed hardcoded 4-retry loop; now configurable via `maxCategorizationAttempts` (default: 1).
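The capped lock dictionary and atomic-write patterns mentioned above look roughly like this (a self-contained sketch with illustrative names, not the project's actual module):

```python
import os
import tempfile
import threading
from collections import OrderedDict

_MAX_LOCKS = 500
_locks = OrderedDict()            # discussion_id -> Lock, in LRU order
_registry_lock = threading.Lock()  # guards the dictionary itself

def get_discussion_lock(discussion_id: str) -> threading.Lock:
    """Return the per-discussion lock, evicting the least recently
    used entry once the dictionary grows past the cap."""
    with _registry_lock:
        if discussion_id in _locks:
            _locks.move_to_end(discussion_id)  # mark as most recently used
        else:
            _locks[discussion_id] = threading.Lock()
            if len(_locks) > _MAX_LOCKS:
                _locks.popitem(last=False)     # evict the oldest entry
        return _locks[discussion_id]

def atomic_write(path: str, data: str) -> None:
    """Write to a temp file in the same directory, then rename over the
    target, so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(data)
        os.replace(tmp_path, path)  # atomic on both POSIX and Windows
    except Exception:
        os.unlink(tmp_path)
        raise
```

The temp file must live in the same directory as the target: `os.replace` is only atomic within a single filesystem.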
Code Quality
- Optimized variable generation — Conversation-slice variables only computed when referenced in the prompt.
- Lazy-load `time_context_summary` — Skips file I/O when the variable isn't referenced.
v0.5 - Better message variables for prompts, some new nodes, and memory fixes
Summary
NOTE: This introduces new variables to help deprecate variables like "chat_user_prompt_last_twenty". I'm not getting rid of those, for backwards compatibility purposes, but going forward we don't need them as much.
New Workflow Nodes
- JsonExtractor node: extracts fields from JSON in LLM responses without an additional LLM call
- TagTextExtractor node: extracts content between XML/HTML-style tags without an additional LLM call
Configurable Prompt Variables
- nMessagesToIncludeInVariable: node property to control how many messages are included in chat/templated prompt variables
- estimatedTokensToIncludeInVariable: token-budget-based message selection, accumulates recent messages up to a token limit
- minMessagesInVariable + maxEstimatedTokensInVariable: combo mode pulling a minimum message count then filling up to a token budget
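The combo mode above can be sketched as follows (the parameter names mirror the config keys but the function itself is illustrative, and the token estimator here is a simple stand-in):

```python
def select_messages(messages, min_messages=0, max_estimated_tokens=None,
                    estimate=lambda m: int(len(m.split()) * 1.35)):
    """Walk backwards from the newest message, always taking at least
    `min_messages`, then adding older messages until the token budget
    is exhausted. Returns messages in original (oldest-first) order."""
    selected = []
    used = 0
    for i, message in enumerate(reversed(messages)):
        cost = estimate(message)
        if i < min_messages:
            selected.append(message)  # guaranteed minimum, budget ignored
            used += cost
            continue
        if max_estimated_tokens is not None and used + cost > max_estimated_tokens:
            break
        selected.append(message)
        used += cost
    return list(reversed(selected))
```

With `min_messages=0` this degenerates to pure `estimatedTokensToIncludeInVariable` behavior, and with no budget it behaves like `nMessagesToIncludeInVariable`.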
Token Estimation
- Recalibrated rough_estimate_token_length word ratio (1.538 -> 1.35 tokens/word)
- Added configurable safety_margin parameter (default 1.10)
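Using the numbers above, the estimator amounts to something like this (the signature is a guess; only the 1.35 ratio and the 1.10 default margin come from the release notes):

```python
def rough_estimate_token_length(text: str, tokens_per_word: float = 1.35,
                                safety_margin: float = 1.10) -> int:
    """Estimate token count from word count, padded upward by a safety
    margin so budget-based selection errs toward including fewer messages
    rather than overflowing the context window."""
    return int(len(text.split()) * tokens_per_word * safety_margin)
```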
Memory System Fixes
- Fixed file_exists check that was permanently disabling message-threshold triggers for new conversations
- Fixed off-by-one in trigger comparisons (> to >=)
- Added HTTP session cleanup via close() to prevent keep-alive connections from blocking llama.cpp slots
- Split timeouts into (connect, read) tuples
- Added diagnostic logging for memory trigger decisions
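The session-cleanup and timeout changes follow a standard `requests` pattern; a sketch (the helper names and timeout values are illustrative, not WilmerAI's):

```python
import requests

CONNECT_TIMEOUT = 10   # seconds to establish the TCP connection
READ_TIMEOUT = 600     # seconds allowed between bytes of the response

def post_to_llm(session: requests.Session, url: str, payload: dict):
    # A (connect, read) tuple lets a slow generation run for minutes
    # without also waiting minutes just to detect an unreachable host.
    return session.post(url, json=payload,
                        timeout=(CONNECT_TIMEOUT, READ_TIMEOUT))

def run_once(url: str, payload: dict):
    session = requests.Session()
    try:
        return post_to_llm(session, url, payload)
    finally:
        # Explicitly closing the session releases keep-alive connections,
        # so single-slot backends like llama.cpp aren't blocked by idle sockets.
        session.close()
```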
Code Quality
- Fixed bare except clauses to except Exception in cancellation paths
- Added prompt-aware info logging for configurable variable slicing
Example Workflow Configs
- Updated all example workflow JSON files to use new configurable variable syntax
v0.4.1 - Small hotfix for memories
What's Changed
- Corrected an issue with the memory system caused by the recent change removing the image-specific handlers, by @SomeOddCodeGuy in #82
v0.4 - Workflow collections, bug fixes, test UI, and some simplification
What's Changed
- Fix oldest message chunk being silently discarded in memory generation
- Fix incorrect new message count causing duplicate processing of memorized messages
- Fix pytest.ini test path case sensitivity
Features:
- Add shared workflow collections and workflow selection via API model field (/v1/models and /api/tags endpoints)
- Add workflow node execution summary logging with timing info
- Add workflowConfigsSubDirectoryOverride for shared workflow folders
- Add sharedWorkflowsSubDirectoryOverride for custom shared folder names
- Add {Discussion_Id} and {YYYY_MM_DD} variables for file paths
- Add variable substitution support for maxResponseSizeInTokens
- Add web-based setup wizard (setup_wizard_web.py) (this is a WIP and may be temporary/replaced)
- Add vector memory resumability with per-chunk hash logging
Refactors:
- Consolidated image handlers into standard handlers (remove ~700 lines)
- Standardize preset/workflow naming convention (hyphenated)
- Archive legacy workflows to _archive subdirectories
- Add pre-configured shared workflow folders
Simplification:
- Updated preset names to match endpoint names. This makes more sense: you can more easily use presets to make sure each endpoint gets the appropriate settings.
- The _example_general_workflow is the one-stop shop for example productivity workflows, and thanks to the custom workflow system it's easy to spin off more. You can just drop new folders into _shared within workflows and suddenly have new workflows available to you as models. I'll make a video about this later.
- Dropped the image-specific handlers. Finally. Those were something I did early on and I just kept putting off dealing with them, but they always annoyed me. Regular handlers now have the image frameworks built in, where supported.
Tests:
- Update tests for corrected memory hash behavior
- Added tests for new workflow override features
v0.3.1
What's Changed
- Updating urllib3 to correct a dependabot issue by @SomeOddCodeGuy in #79
Full Changelog: v0.3.0...v0.3.1
v0.3.0 - API swapped, Claude Support added, other fixes
- Added support for the Claude llm_api
- Replaced the Flask-exposed runnable API with Eventlet for macOS/Linux and Waitress for Windows
- Fixed the unit tests not running in Windows properly
- Corrected two places where a trailing slash (at the end of the llm_api URL, and at the end of the ConfigDirectory folder name) caused a break
- Added an attempt at proper cancellation, where pressing "stop" in Open WebUI or other front ends will appropriately end a workflow and cascade down to the LLM
- Some LLM APIs work with this, some don't. This should appropriately stop Wilmer and its workflows, but an LLM API in the middle of processing a prompt may not be compelled to stop.
- Added the ability to replace Endpoints and Presets with variables
- Limited to hardcoded variables at top of workflow, or agentXInputs from parent workflows
v0.2.1 - New nodes, bug fixes, new docs, and first recursive workflow
What's Changed
- Added a new LLM-assisted workflow generation document folder. This is still a work in progress, but I have successfully generated a few workflows with it. It's a start in the direction I want to take Wilmer: having its setup and workflow generation be something an LLM can automate easily.
- Fixed streaming on the static response node
- Update partial article wiki node to return N number of results
- Bugfix for thinking-tag cleaning. We had a situation where an LLM (Magistral 2509) was accounting for thinking tags but not generating any, which resulted in completely empty responses going into agentXOutput, as the whole response was being deleted.
- Added ArithmeticProcessor node
- Added Conditional node
- Added StringConcatenator Node
- Updated Conditional Workflows to allow a content passthrough on default instead of having to go into a workflow
- Added POC for recursive workflow, doing a simple coding workflow as an example. There's a wikipedia workflow coming next, but I want to test it a little more before pushing it out.
Full Changelog: v0.2...v0.2.1
v0.2 - 92% Unit Test Code Coverage, and Bug Fixes
What's Changed
- Unit Tests, bug fixes and documentation moving by @SomeOddCodeGuy in #75
Full Changelog: v0.1.8.2...v0.2
v0.1.8.2 - Quickguide and Documentation Update
Just updating some docs and a few tweaks to some configs
v0.1.8.1 - urllib3 version bump
Updated urllib3 to 2.5.0 to satisfy 2 dependabot issues and clear out security notifications.