Minimal Hermes config:
memory:
  provider: continuity

Hermes passes provider-specific configuration to ContinuityProvider.initialize() as continuity_config. The intended YAML shape is:
memory:
  provider: continuity
  continuity:
    max_records: 5
    max_chars: 1800
    min_score: 0.1
    include_sources: true
    include_scores: false
    save_turn_observations: true
    consolidate_on_session_end: true
    consolidate_min_turns: 3
    session_strategy: per-repo
    project_scope_boost: 1.0
    stale_after_days: 180
    observation_max_chars: 4000
    vector:
      enabled: false
      backend: none
      embedding_model: null
      batch_size: 32
      normalize: true
    reranker:
      enabled: false
      backend: none
      model: null
      min_score: 0.15
      weight: 2.0
      top_k: 20
    association_graph:
      enabled: true
      edge_creation_threshold: 0.25
      out_degree_cap: 20
      hop_decay: 0.6
      activation_threshold: 0.3
      max_hops: 2
    retrieval_gates:
      min_base_relevance: 0.35
      min_vector_relevance: 0.05
      min_vector_only_relevance: 2.75
      weak_vector_only: true
      facet_compatibility: true
      memory_meta: true
      low_information: true
      project_status_anchor: true
      personal_project: true
      activation_anchor: true
      strong_personal_anchor_reranker_bypass: true
    pinned_system_prompt:
      enabled: true
      include_records: true
      max_chars: 1200
      max_records: 5
      kinds: [user_preference, project_convention]
      min_importance: 0.75
    dream:
      enabled: false
      schedule: "03:30"
      timezone: "UTC"
      mode: "report"
      max_archives_per_run: 20
      run_on_initialize: false
      run_on_turn: false

The recommended production setup is continuity-exclusive: configure Hermes with
memory.provider: continuity and treat continuity as the primary memory source.
Stock Hermes memory files can be imported or mirrored for migration, but they
should not remain a competing always-on prompt context. Tier 0 pinned continuity
records replace the tiny trusted memory kernel; ordinary continuity retrieval
handles the long tail per turn.
Keep Tier 0 small and auditable. It should contain durable high-trust facts such as stable user preferences and important project conventions, not broad logs or large status dumps.
max_records
Maximum number of selected continuity records to consider for prefetch output.
Default: 5.

max_chars
Hard character budget for injected prefetch output.
Default: 1800. Values below 200 are clamped to 200.
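The clamping rule for max_chars can be sketched as follows. This is only an illustration: clamp_max_chars is a hypothetical helper name, and the provider's real parsing code is not shown in this document.

```python
MIN_PREFETCH_CHARS = 200    # floor stated in this document
DEFAULT_MAX_CHARS = 1800    # documented default


def clamp_max_chars(value=None):
    """Return the effective prefetch budget (hypothetical helper)."""
    if value is None:
        return DEFAULT_MAX_CHARS
    # Anything below the floor is raised to the floor; larger values pass through.
    return max(MIN_PREFETCH_CHARS, int(value))
```

With this sketch, a configured value of 50 would yield an effective budget of 200.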
min_score
Minimum retrieval score required before a record can be included in prefetch.
Default: 0.1.

include_sources
Whether prefetch output includes source refs.
Default: true.

include_scores
Whether prefetch output includes retrieval scores.
Default: false. Enable temporarily when debugging retrieval quality.

save_turn_observations
Whether sync_turn() stores sanitized per-turn observations for later consolidation.
Default: true.

consolidate_on_session_end
Whether on_session_end() extracts explicit durable memories from marker phrases.
Default: true.
Current markers:
Remember: ...
Lesson learned: ...
Decision: ...
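As a rough illustration of marker-phrase extraction, the sketch below scans text line by line for the three markers listed above. The regular expression, the case-insensitive matching, and the function name extract_marked_memories are assumptions made for this example, not the provider's actual implementation.

```python
import re

# The three marker phrases come from this document; the pattern is an assumption.
MARKER_RE = re.compile(
    r"^\s*(Remember|Lesson learned|Decision):\s*(.+?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_marked_memories(text):
    """Return (marker, statement) pairs for each line that starts with a marker."""
    return [(m.group(1).lower(), m.group(2)) for m in MARKER_RE.finditer(text)]
```

Lines without a marker are ignored, so ordinary conversation text produces no candidates.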
consolidate_min_turns
Minimum number of user turns before session-end extraction runs.
Default: 3.

session_strategy
Accepted values:
- per-repo
- per-directory
- global
- per-session
Default: per-repo.

project_scope_boost
Score boost for records scoped to the current project.
Default: 1.0.

observation_max_chars
Maximum stored characters per user/assistant observation field.
Default: 4000.

stale_after_days
Age threshold, in days, used by continuity_prune_stale for conservative low-importance candidate detection.
Default: 180.
This is not an automatic destructive retention policy. Normal retrieval excludes records with status != active and records whose expires_at is in the past. continuity_prune_stale dry-runs by default and only mutates records when called with dry_run: false.
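The retrieval-time exclusion described above can be pictured with a small predicate. The dictionary layout and the name is_retrievable are illustrative assumptions; only the status and expires_at field names come from this document.

```python
from datetime import datetime, timezone


def is_retrievable(record, now=None):
    """True when a record is active and has not expired (illustrative layout)."""
    now = now or datetime.now(timezone.utc)
    if record.get("status") != "active":
        return False
    # Assumed: expires_at is a timezone-aware datetime, or None for "never expires".
    expires_at = record.get("expires_at")
    return expires_at is None or expires_at > now
```

Nothing in this check deletes or modifies a record; it only decides whether the record is visible to normal retrieval.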
vector.enabled
Enables optional vector/hybrid retrieval.
Default: false.

vector.backend
Supported aliases:
- none
- sqlite-hash
- hash
- sqlite-json
- sentence-transformers
Unknown values are clamped to none.
sqlite-hash is dependency-free and stores deterministic hashed vectors in SQLite. sentence-transformers uses local neural embeddings and requires the embeddings extra or equivalent packages.
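To give a feel for what "deterministic hashed vectors" means, here is a generic feature-hashing sketch that uses only the standard library. It is not the sqlite-hash backend's actual algorithm: the tokenizer, the 64-dimension size, the SHA-256 choice, and both function names are assumptions for illustration.

```python
import hashlib
import math
import re


def hashed_vector(text, dims=64):
    """Map text to a fixed-size vector by hashing each token into a bucket."""
    vec = [0.0] * dims
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dims   # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0          # signed hashing trick
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    # L2-normalize so a dot product behaves like cosine similarity.
    return [v / norm for v in vec] if norm else vec


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))
```

The same text always produces the same vector, so no model download or random state is involved. The trade-off is that hashed vectors capture token overlap rather than meaning, which is where a neural backend such as sentence-transformers differs.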
vector.embedding_model
Model label or sentence-transformers model name.
Examples:
- hash-v1
- BAAI/bge-small-en-v1.5

vector.batch_size
Batch size for embedding/indexing operations.
Default: 32.

vector.normalize
Whether embedding vectors should be normalized when the backend supports it.
Default: true.

reranker.enabled
Enables query-record reranking after candidate retrieval.
Default: false.

reranker.backend
Supported aliases:
- none
- sentence-transformers
- cross-encoder

reranker.model
Cross-encoder model name. Common local default:
BAAI/bge-reranker-base

reranker.min_score
Minimum reranker score for direct automatic selection.
Default: 0.15.

reranker.weight
Score weight added from reranker results.
Default: 2.0.

reranker.top_k
Maximum candidate count to rerank.
Default: 20.
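One plausible way the three reranker settings interact is sketched below. The combination formula (base score plus weight times reranker score), the handling of candidates beyond top_k, and the function name apply_reranker are all assumptions; this document does not specify them. The rejection label below_min_reranker_score is the one that appears in debug output.

```python
def apply_reranker(candidates, rerank_scores, min_score=0.15, weight=2.0, top_k=20):
    """Split candidates into selected, rejected, and not-reranked groups.

    candidates: list of (record_id, base_score) pairs.
    rerank_scores: dict of record_id -> reranker score (hypothetical input shape).
    """
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
    selected, rejected = [], []
    for record_id, base_score in ranked[:top_k]:      # only the head is reranked
        rerank = rerank_scores.get(record_id, 0.0)
        if rerank < min_score:
            rejected.append((record_id, "below_min_reranker_score"))
        else:
            # Assumed combination: the weighted reranker score is added to the base score.
            selected.append((record_id, base_score + weight * rerank))
    selected.sort(key=lambda item: item[1], reverse=True)
    not_reranked = [record_id for record_id, _ in ranked[top_k:]]
    return selected, rejected, not_reranked
```

Under these assumptions, raising weight lets the reranker reorder candidates more aggressively, while raising min_score makes it reject more of them outright.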
association_graph.enabled
Enables association-graph retrieval. The association graph is additive: it can rescue implicit context related to strong direct or activation-only seeds, but retrieved graph neighbors are lower-confidence context.
Default: true.

association_graph.edge_creation_threshold
Minimum deterministic overlap score for creating an association edge between records.
Default: 0.25.

association_graph.out_degree_cap
Maximum neighbor edges considered per source record during graph traversal.
Default: 20.

association_graph.hop_decay
Activation decay applied per hop.
Default: 0.6.

association_graph.activation_threshold
Minimum activation required for an associated neighbor to be eligible for injection.
Default: 0.3.

association_graph.max_hops
Maximum traversal depth.
Default: 2.
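The traversal settings can be read together as a bounded spreading-activation pass. The sketch below is a generic version of that technique, not continuity's actual code: multiplying activation by the edge weight, keeping the strongest edges under the cap, and the function name spread_activation are assumptions.

```python
def spread_activation(seeds, edges, hop_decay=0.6, activation_threshold=0.3,
                      max_hops=2, out_degree_cap=20):
    """Propagate activation from seed records to their graph neighbors.

    seeds: dict of record_id -> starting activation.
    edges: dict of record_id -> list of (neighbor_id, edge_weight) pairs.
    Returns a dict of non-seed record_id -> best activation reached.
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(max_hops):
        next_frontier = {}
        for source, source_activation in frontier.items():
            # Keep only the strongest edges, up to the out-degree cap.
            neighbors = sorted(edges.get(source, []), key=lambda e: e[1], reverse=True)
            for neighbor, weight in neighbors[:out_degree_cap]:
                value = source_activation * weight * hop_decay
                if value < activation_threshold:
                    continue  # too weak to inject or to keep spreading
                if value > activation.get(neighbor, 0.0):
                    activation[neighbor] = value
                    next_frontier[neighbor] = value
        frontier = next_frontier
    return {rid: act for rid, act in activation.items() if rid not in seeds}
```

With the defaults, a seed at activation 1.0 reaches a full-weight neighbor at 0.6 and that neighbor's full-weight neighbor at 0.36; a third hop is never attempted because max_hops is 2, and it would fall below the 0.3 threshold anyway.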
The retrieval_gates block controls conservative filters that run after candidate generation and before final context injection. Defaults are intentionally precision-first: it is better for continuity to return no memory than to inject a plausible but wrong memory.
Use continuity_debug_retrieval(query="...", limit=5) before changing these values. Look at rejected[].rejected_reason, rejected[].tuning_advice, selected[].reasons, scores, and activation_trace.cut_neighbors[].reason; tune one setting at a time and then rerun regression/demo queries.
Example debug excerpt:
{
  "id": "r_soft_debug_match",
  "rejected_reason": "below_min_base_relevance",
  "base_relevance": 1.35,
  "tuning_advice": {
    "config": "retrieval_gates.min_base_relevance",
    "try": "If this record should match, lower this threshold slightly or add stronger tags/entities. If weak one-word matches are noisy, raise it."
  }
}

That means the record had some lexical/metadata evidence, but not enough to pass the direct retrieval gate. For a true positive, add better tags/entities or lower retrieval_gates.min_base_relevance slightly; for a noisy one-word match, raise it.
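When a debug result contains many rejected records, it can help to tally them before touching any setting. The sketch below assumes the debug result has already been obtained as a Python dict with a top-level rejected list shaped like the excerpt; how the tool is invoked from code is not covered here, and summarize_rejections is a hypothetical helper name.

```python
from collections import Counter


def summarize_rejections(debug_result):
    """Count rejection reasons and collect the config keys the advice points at."""
    rejected = debug_result.get("rejected", [])
    reasons = Counter(item.get("rejected_reason", "unknown") for item in rejected)
    configs = sorted({
        item["tuning_advice"]["config"]
        for item in rejected
        if item.get("tuning_advice", {}).get("config")
    })
    return reasons, configs
```

A single dominant reason pointing at a single config key is a good sign that one small change is worth testing; a mix of reasons usually means the records need better tags/entities first.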
retrieval_gates.min_base_relevance
Minimum lexical/metadata evidence required before a candidate can survive direct retrieval. This uses FTS, vector score, tag matches, entity matches, and text matches before scope, confidence, importance, reranker, or graph boosts.
Default: 0.35.
Plain English:
- If relevant records are rejected as below_min_base_relevance, try lowering this slightly, for example 0.35 -> 0.25.
- If vague queries pull barely related records because of one weak word match, raise it, for example 0.35 -> 0.5.
- Do not use this to fix reranker misses; tune reranker.min_score or strong_personal_anchor_reranker_bypass instead.
retrieval_gates.min_vector_relevance
Minimum raw vector similarity before a vector hit is considered at all.
Default: 0.05.
Plain English:
- If paraphrases never even appear as candidates in debug output, lower this slightly.
- If lots of semantically mushy vector hits appear in debug output, raise it.
- This only matters when vectors are enabled.
retrieval_gates.min_vector_only_relevance
Minimum weighted vector score required when a candidate has no FTS/tag/entity/text anchor.
Default: 2.75.
Plain English:
- If vector-only paraphrases are useful but show weak_vector_only_without_anchor, lower this a little.
- If unrelated memories appear because neural similarity is too generous, raise this.
- Prefer adding better tags/entities to important records before lowering this globally.
retrieval_gates.weak_vector_only
Rejects vector-only candidates that fall below min_vector_only_relevance.
Default: true.
Plain English:
- Disable only for experiments or very small, clean corpora.
- If you disable this and see noisy memories, turn it back on and tune min_vector_only_relevance instead.
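The interplay between min_base_relevance, min_vector_only_relevance, and weak_vector_only can be sketched as a single gate function. The candidate field names (fts_score, tag_matches, and so on), the order of the checks, and the name gate_candidate are assumptions for illustration; only the thresholds and the two rejection labels come from this document.

```python
DEFAULT_GATES = {
    "min_base_relevance": 0.35,
    "min_vector_only_relevance": 2.75,
    "weak_vector_only": True,
}

# Hypothetical evidence fields that count as a non-vector anchor.
ANCHOR_FIELDS = ("fts_score", "tag_matches", "entity_matches", "text_matches")


def gate_candidate(candidate, gates=DEFAULT_GATES):
    """Return a rejection reason, or None when the candidate passes both gates."""
    has_anchor = any(candidate.get(field) for field in ANCHOR_FIELDS)
    weighted_vector = candidate.get("weighted_vector", 0.0)
    if (gates["weak_vector_only"] and not has_anchor
            and weighted_vector < gates["min_vector_only_relevance"]):
        return "weak_vector_only_without_anchor"
    if candidate.get("base_relevance", 0.0) < gates["min_base_relevance"]:
        return "below_min_base_relevance"
    return None
```

The sketch shows why adding tags/entities is often the better fix: a candidate with any anchor skips the vector-only check entirely, without loosening the threshold for every other record.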
retrieval_gates.facet_compatibility
Prevents obvious domain collisions such as software “framework” questions retrieving Framework laptop/desktop memories.
Default: true.
Plain English:
- If debug output says facet_incompatible_candidate for a record that really is relevant, either improve the record tags/entities or temporarily disable this to confirm the gate is the cause.
- If polysemous words cross domains, keep this enabled.
retrieval_gates.memory_meta
Keeps continuity/memory-system records out of ordinary non-memory queries, and keeps generic memory words from matching unrelated records.
Default: true.
Plain English:
- If normal questions retrieve continuity implementation/status records, keep this enabled.
- If an actual memory-system query rejects relevant memory records as memory_system_record_without_meta_query, make the query more explicit (continuity, retrieval, memory provider) before disabling this.
retrieval_gates.low_information
Suppresses retrieval for vague follow-ups with no durable anchor, such as “ok”, “how about now”, or “can you test and see if it is better now”.
Default: true.
Plain English:
- If vague follow-ups inject stale project context, keep this enabled.
- If short but meaningful local commands in your workflow are being rejected as low_information_direct_query, add a stronger project/entity word to the query, or disable this only after checking debug output.
retrieval_gates.project_status_anchor
Requires active project-state records to match a non-generic query anchor before selection.
Default: true.
Plain English:
- If old project status appears for generic “what now?” style queries, keep this enabled.
- If a project-status record is relevant but rejected as weak_project_status_candidate, add specific tags/entities, or disable this gate only for a trusted, scoped deployment.
retrieval_gates.personal_project
Prevents first-person/personal queries from drifting into unrelated project status, and prevents broad graph edges from crossing between personal facts and project records without anchors.
Default: true.
Plain English:
- If “what do I prefer?” retrieves random project roadmap records, keep this enabled.
- If a personal query intentionally needs project context, make the query include the project/entity name before disabling this.
retrieval_gates.activation_anchor
Requires rejected reranker candidates to have a strong non-generic query anchor before they can seed association-graph traversal.
Default: true.
Plain English:
- If association graph rescue is too quiet and debug shows weak_anchor_activation_seed for a relevant candidate, improve tags/entities first; then consider disabling this for experiments.
- If graph traversal fans out through old broad associations, keep this enabled.
retrieval_gates.strong_personal_anchor_reranker_bypass
Lets strongly tagged/entity-matched personal records survive a low reranker score. This compensates for generic cross-encoders that often under-score short personal setup/preference facts.
Default: true.
Plain English:
- If personal facts like vehicle/setup/preferences are rejected as below_min_reranker_score despite strong tag/entity matches, keep this enabled.
- If personal records bypass the reranker too aggressively, disable this or improve tags/entities so only truly strong anchors match.

Tuning workflow:
1. Run continuity_debug_retrieval for a failing positive query and a nearby negative/noise query.
2. Identify the rejection reason or noisy selected reason.
3. Change one value by a small amount.
4. Rerun the same positive and negative queries.
5. Run ./scripts/demo-retrieval.sh --quiet and the retrieval regression tests before keeping the change.
Recommended command after tuning:
uv run pytest tests/test_retrieval.py tests/test_retrieval_public_regressions.py tests/test_association_graph.py -q
./scripts/demo-retrieval.sh --quiet

The provider always returns a compact continuity guidance block from system_prompt_block() when initialized. Pinned records are enabled by default as the Tier 0 memory layer. For small/local models that need continuity operating guidance but should not receive any always-on memory summaries, set pinned_system_prompt.enabled: true and pinned_system_prompt.include_records: false.
pinned_system_prompt.enabled
Whether the pinned system-prompt feature is enabled. The compact continuity operating guidance remains available when the provider is initialized; record summaries are controlled separately by include_records.
Default: true.

pinned_system_prompt.include_records
Whether high-importance continuity records are included after the guidance block. Set this to false for guidance-only mode: the model learns how to use continuity, but no durable memory summaries are injected into the system prompt.
Default: true.

pinned_system_prompt.max_chars
Character budget for pinned records only, not the provider guidance block.
Default: 1200.

pinned_system_prompt.max_records
Maximum number of records to pin.
Default: 5.

pinned_system_prompt.kinds
Record kinds eligible for pinning.
Default: [user_preference, project_convention].

pinned_system_prompt.min_importance
Minimum record importance required for pinning.
Default: 0.75.
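Taken together, the pinning settings act as a filter followed by a budgeted selection. The sketch below illustrates that reading; the record field names (kind, importance, summary), the importance-descending order, and the skip-if-too-large behavior are assumptions, and select_pinned is a hypothetical name.

```python
def select_pinned(records, kinds=("user_preference", "project_convention"),
                  min_importance=0.75, max_records=5, max_chars=1200):
    """Pick Tier 0 records that fit the kind, importance, count, and size limits."""
    eligible = [
        r for r in records
        if r["kind"] in kinds and r["importance"] >= min_importance
    ]
    eligible.sort(key=lambda r: r["importance"], reverse=True)
    pinned, used = [], 0
    for record in eligible:
        if len(pinned) >= max_records:
            break
        cost = len(record["summary"])
        if used + cost > max_chars:
            continue  # skip records that would overflow the character budget
        pinned.append(record)
        used += cost
    return pinned
```

Because the character budget is small, a single long summary can crowd out several short ones, which is one more reason to keep Tier 0 records brief.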
The dream cycle is offline/report-oriented memory maintenance. It checks stale candidates, recomputed quality-warning inventory, vector index parity, and association-graph edge inventory. Report mode does not mutate records; archive modes only soft-archive and never hard-delete.
dream.enabled
Whether scheduled dream checks are enabled. Scheduling is evaluated during provider initialization and/or turn sync when the matching run_on_* option is enabled; Hermes does not run plugin code on an independent wall-clock timer by itself.
Default: false.

dream.schedule
Local HH:MM time after which a scheduled dream can run once per day.
Default: 03:30.

dream.timezone
IANA timezone for dream.schedule.
Default: UTC.
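The "after HH:MM, once per day" rule can be sketched as a check performed whenever the provider gets a chance to run. How continuity actually records the last run is not described here, so the last_run_date parameter and the name dream_is_due are assumptions.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo


def dream_is_due(now_utc, schedule="03:30", tz="UTC", last_run_date=None):
    """True when local time has passed the schedule and no run happened today.

    now_utc: timezone-aware datetime.
    last_run_date: local date of the previous run, or None (hypothetical bookkeeping).
    """
    # Use the built-in UTC object for the default to avoid a tz database lookup.
    zone = timezone.utc if tz == "UTC" else ZoneInfo(tz)
    local_now = now_utc.astimezone(zone)
    hour, minute = (int(part) for part in schedule.split(":"))
    if local_now.time() < time(hour, minute):
        return False
    return last_run_date != local_now.date()
```

Since the check only happens when provider code runs, a session that starts at 09:00 would trigger that day's dream at 09:00, not at 03:30.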
dream.mode
Accepted values:
- report
- archive-safe
- archive-all-candidates
Start with report. Archive modes soft-archive deterministic stale candidates and still never hard-delete.
Default: report.

dream.run_on_initialize, dream.run_on_turn
Whether the provider should check the schedule during initialization or turn sync.
Default: false for both.
Report-only example:
memory:
  provider: continuity
  continuity:
    dream:
      enabled: true
      schedule: "03:30"
      timezone: "UTC"
      mode: "report"
      run_on_turn: true
      run_on_initialize: true

Continuity stores data at:
$HERMES_HOME/continuity/continuity.db
Main tables:
- context_records
- context_records_fts
- session_observations
- retrieval_events
- context_record_vectors
- context_record_skill_links
- context_record_association_edges
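For a quick look at what is actually in the database, the sketch below resolves the path from HERMES_HOME and lists tables through a read-only SQLite connection. The helper names are hypothetical, and raising an error when HERMES_HOME is unset is an assumption; Hermes may apply its own default location.

```python
import os
import sqlite3
from contextlib import closing
from pathlib import Path


def continuity_db_path(hermes_home=None):
    """Build $HERMES_HOME/continuity/continuity.db (no default home is assumed)."""
    home = hermes_home or os.environ.get("HERMES_HOME")
    if not home:
        raise RuntimeError("HERMES_HOME is not set")
    return Path(home) / "continuity" / "continuity.db"


def list_tables(db_path):
    """Return table names, opening the database read-only so nothing is modified."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        return [name for (name,) in rows]
```

If context_records_fts is an FTS virtual table, the listing will probably also include SQLite's internal shadow tables for it, so expect more names than the main tables above.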