Magnus Hedemark edited this page Jun 10, 2026 · 1 revision

Lore

Eras

The history of hermes-cashew can be divided into several distinct phases, spanning from April 19, 2026 to the present day.

Scaffolding and discovery (April 19–20, 2026)

The repository was created on April 19, 2026 with an initial commit containing LICENSE and README.md. The first two days focused on project scaffolding:

  • April 19 — CLAUDE.md created establishing the plugin architecture, dual-load-path layout, and testing conventions.
  • April 19 — pyproject.toml added with hatchling backend and cashew-brain git+SHA dependency. Dual plugin.yaml manifests created (root + nested).
  • April 19 — CashewMemoryProvider stub added with root re-export shim. Test infrastructure established with conftest.py, offline guards, and agent.memory_provider ABC stub.
  • April 19 — CI pipeline (tests.yml) added.
  • April 20 — Config schema helper module (config.py) added. Provider lifecycle methods (initialize, shutdown, get_config_schema, save_config) implemented.
  • April 20 — Sync worker lifecycle implemented: bounded queue.Queue(maxsize=16) + single non-daemon worker thread with sentinel shutdown pattern.
  • April 20 — Tool schemas registered (cashew_query), handle_tool_call routing implemented, prefetch() method added with eager ContextRetriever lifecycle. MemoryManager stub and E2E tests created.
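
The sync worker pattern described above (a bounded queue drained by one non-daemon thread that stops on a sentinel) can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the class name SyncWorker, the handler callback, and the drop-when-full policy are all assumptions made for the example.

```python
import queue
import threading

_SENTINEL = object()  # hypothetical shutdown marker, not from the plugin


class SyncWorker:
    """Bounded queue drained by one worker thread; a sentinel stops it."""

    def __init__(self, handler, maxsize=16):
        self._queue = queue.Queue(maxsize=maxsize)
        self._handler = handler
        # Non-daemon, matching the April 20 description of the worker.
        self._thread = threading.Thread(target=self._run, daemon=False)
        self._thread.start()

    def submit(self, turn):
        """Enqueue without blocking; return False when the queue is full."""
        try:
            self._queue.put_nowait(turn)
            return True
        except queue.Full:
            return False  # assumed policy: drop rather than block the caller

    def _run(self):
        while True:
            item = self._queue.get()
            if item is _SENTINEL:
                break
            self._handler(item)

    def shutdown(self):
        """Queue the sentinel behind pending items, then wait for the worker."""
        self._queue.put(_SENTINEL)
        self._thread.join()
```

Because the sentinel is queued behind any pending items, everything submitted before shutdown() is still processed in order before the thread exits.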

Upstream integration (April 21, 2026)

  • April 21 — Phase 8 schema management: _ensure_db_schema restructured with per-table transactions, _create_vec_embeddings helper with guarded extension loading, _add_missing_columns with idempotent ALTER TABLE.
  • April 21 — Versions v0.1.0 through v0.1.5: auto-create Cashew DB schema, fix derivation_edges.timestamp schema migration, accept session_id in sync_turn(), handle missing confidence column in edge migration.
  • April 21 — v0.2.0 released: 31-key config with env overrides and domain helpers, sqlite-vec support, PyPI dependency.
  • April 21 — PyPI trusted-publishing release workflow added.
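
An idempotent ALTER TABLE migration of the kind mentioned above usually checks which columns already exist before adding any. The sketch below shows one way to do that with the standard sqlite3 module. The function name and the table layout used in the usage note are assumptions, and table and column names are interpolated directly, so this is only suitable for trusted, hard-coded identifiers.

```python
import sqlite3


def add_missing_columns(conn, table, wanted):
    """Add each column in `wanted` ({name: declaration}) that is absent.

    Safe to run repeatedly: existing columns are left untouched.
    """
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, decl in wanted.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {decl}")
            added.append(name)
    conn.commit()
    return added
```

Calling add_missing_columns(conn, "derivation_edges", {"timestamp": "TEXT"}) twice adds the column on the first call and does nothing on the second.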

Stop Vibe-Coding, Start Integrating (May 12, 2026) — v0.3.0

This release was the defining moment in the project's history. The team caught themselves reimplementing large swaths of cashew-brain instead of wrapping it. In response, more than 150 lines of custom retrieval code were gutted in favor of delegating to upstream's retrieve_recursive_bfs(). The commit that changed everything:

  • Removed _retrieve_with_vec, _retrieve_bfs, _retrieve_keyword, _score_nodes, _apply_filters, _vec_available, _get_query_embedding.
  • Fixed a critical bug where vec_embeddings used SQLite rowid to look up thought_nodes.id (SHA hashes) — semantic search always returned 0 results.
  • Switched cashew-brain from git+SHA pin to >=1.1.0,<2.0.0 on PyPI.
  • Added _keyword_search SQL LIKE fallback and _migrate_vec_embeddings auto-migration.
  • Phase 11 E2E tests: full lifecycle and 4-thread concurrent DB stress test.
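
A SQL LIKE keyword fallback of the sort listed above might look like the following sketch. It is an illustration only: the content column, the every-word-must-match rule, and the default limit are assumptions, and the real _keyword_search may differ.

```python
import sqlite3


def keyword_search(conn, query, limit=5):
    """Return (id, content) rows whose content contains every query word."""
    words = query.split()
    if not words:
        return []
    # One LIKE clause per word; backslash is declared as the escape character.
    where = " AND ".join("content LIKE ? ESCAPE '\\'" for _ in words)
    params = []
    for word in words:
        # Escape LIKE wildcards so user text is matched literally.
        escaped = (
            word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
        )
        params.append(f"%{escaped}%")
    sql = f"SELECT id, content FROM thought_nodes WHERE {where} LIMIT ?"
    return conn.execute(sql, params + [limit]).fetchall()
```

Binding the words as parameters keeps the query safe from injection, and escaping % and _ prevents a query such as "100%" from matching every row.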

LLM integration via auxiliary.memory (May 12, 2026) — v0.4.0

The plugin learned to talk to LLMs. The llm_aux_role config key (set to "memory") wired upstream LLM features through Hermes' own auxiliary.<role> config convention. All 9 open issues were reconciled against the thin-adapter architecture. Open source conventions were established: CONTRIBUTING.md, issue templates, PR template, DCO sign-off.

Privacy controls (May 12, 2026) — v0.5.0

exclude_tags filtering added to all retrieval paths, allowing users to tag sensitive nodes and filter them out of query results. The final open issue from the original milestone plan was resolved.
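
As a rough illustration of tag-based exclusion, the sketch below drops any node that carries at least one excluded tag. The node shape (a dict with an optional tags list), the function name, and the case-insensitive comparison are assumptions for the example, not details confirmed by the plugin.

```python
def filter_excluded(nodes, exclude_tags):
    """Drop any node carrying at least one excluded tag."""
    blocked = {tag.strip().lower() for tag in exclude_tags}
    return [
        node
        for node in nodes
        # Keep the node only if its tag set and the blocked set are disjoint.
        if not blocked & {tag.lower() for tag in node.get("tags", [])}
    ]
```

Applying the same filter on every retrieval path is what keeps a tagged node from leaking through a fallback such as keyword search.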

Think cycles and sleep cycles (May 12, 2026) — v0.6.0/v0.7.0

  • v0.6.0 (published as v0.7.0) — Think cycles: think_interval config key (default 10) controlling how many sync turns pass between upstream think_cycle() calls.
  • v0.7.0 — Sleep cycles on shutdown: upstream run_sleep_cycle() called when provider shuts down, if LLM is wired and sleep_cycles is True.
  • v0.7.1 (May 12) — Sleep cycle moved from shutdown() to on_session_end(), the correct lifecycle hook per Hermes memory provider docs.
  • v0.7.2 (May 12) — Spec compliance: worker thread changed from daemon=False to daemon=True. README added to plugin directory.
  • v0.7.3 (May 12) — sqlite-vec macOS fix: switched from conn.load_extension("vec0") to sqlite_vec.load(conn).
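
The think_interval behaviour (run a think cycle once every N sync turns) can be sketched with a simple counter. The class and method names here are hypothetical, and treating a non-positive interval as "disabled" is an assumption.

```python
class ThinkScheduler:
    """Call `think` once every `think_interval` sync turns."""

    def __init__(self, think, think_interval=10):
        self._think = think
        self._interval = think_interval
        self._turns = 0

    def on_sync_turn(self):
        """Count one turn; return True when a think cycle was triggered."""
        if self._interval <= 0:
            return False  # assumed: zero or negative disables think cycles
        self._turns += 1
        if self._turns % self._interval == 0:
            self._think()
            return True
        return False
```

With the default of 10, the callback fires on turns 10, 20, 30, and so on.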

Sleep cycle removed (May 13, 2026) — v0.7.4

The upstream run_sleep_cycle() proved too heavyweight for synchronous lifecycle hooks. With 6K+ nodes and 59K+ edges, it computed a full N×N embedding similarity matrix (36M+ comparisons) with Bron–Kerbosch clique detection — taking hours and accumulating overlapping instances. The sleep cycle was removed from both on_session_end() and shutdown(), and sleep_cycles became a no-op.
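
The 36M figure follows directly from squaring the node count. The short calculation below reproduces it and, for comparison, shows the number of unique unordered pairs; the helper name is made up for illustration.

```python
def similarity_comparisons(n):
    """Return (entries in a full N x N matrix, unique unordered pairs)."""
    return n * n, n * (n - 1) // 2


full, unique = similarity_comparisons(6000)
print(f"{full:,} matrix entries, {unique:,} unique pairs")
```

Either way the work grows quadratically, so doubling the node count roughly quadruples the cost.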

Refactored sleep cycle (May 14, 2026) — v0.8.0

A sleep cycle rewritten from the ground up replaced the upstream O(N²) implementation. The new sleep_refactor.py used a nine-phase pipeline with vectorized cross-linking (numpy + batched DB writes), processing 7,100 nodes in ~4 seconds where upstream took hours. The sleep cycle was re-enabled in on_session_end(), work-capped at 2,000 nodes per cycle.

Plugin namespace and embedding fixes (May 14, 2026) — v0.8.1/v0.8.2

  • v0.8.1 — Embedding gap closure: _embed_orphans() was inserting into the wrong column (model_name instead of model), so orphaned nodes were never embedded and the gap never closed.
  • v0.8.2 — Hermes hermes_plugins synthetic namespace fix: root __init__.py flat-entry detection missed Hermes 0.8.8+'s new namespace convention.

First-load bootstrap (May 15, 2026) — v0.9.0

cashew.json is auto-generated with defaults on first load. llm_aux_role defaults to "memory" so LLM extraction works out of the box. is_available() returns True when deps are present, even without a config file. The on_pre_compress hook was added for forest-level insight extraction (topic shifts, framing changes, implicit decisions).
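
A first-load bootstrap of this kind can be sketched as "write defaults if the file is missing, otherwise merge the file over the defaults". The function name and the one-key DEFAULTS dict are illustrative assumptions; the real config has many more keys.

```python
import json
from pathlib import Path

# Illustrative subset only; the real default set is larger.
DEFAULTS = {"llm_aux_role": "memory"}


def load_or_create_config(path):
    """Write defaults on first load; afterwards merge file values over them."""
    path = Path(path)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(DEFAULTS, indent=2))
        return dict(DEFAULTS)
    # Keys missing from an older file still fall back to the defaults.
    return {**DEFAULTS, **json.loads(path.read_text())}
```

Merging rather than replacing means a config file written by an older version keeps working when new keys are introduced.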

Cross-source linking, queue prefetch, and background dreams (May 17, 2026) — v0.10.0

  • Cross-source linking: pairs sharing the same source_file are skipped during cross-linking, reducing BFS graph noise.
  • Edge cap (MAX_EDGES_PER_CYCLE = 100K): prevents runaway cross-linking on dense node batches.
  • Out-degree selection: sleep cycle prioritizes nodes with fewest existing edges, rebalancing the graph over time.
  • queue_prefetch: background daemon thread ingests assistant responses and pre-warms Cashew context for the next turn.
  • Background dream dispatch: Phase 8 (dream generation) and Phase 9 (orphan embedding) run in a daemon thread instead of blocking the caller. /new session latency dropped from ~118s to ~20s.
  • Sync queue drain removed from on_session_end() — the sync worker is non-daemon and keeps running across session boundaries with WAL mode handling concurrent DB writes.
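
The three cross-linking rules above (skip same-source pairs, cap the number of edges, and favour nodes with the fewest existing edges) can be combined in a small planning function. This is a plain-Python sketch under stated assumptions: the node dicts, the out_degree mapping, the similarity callback, and the default threshold are all hypothetical, and it uses a simple pairwise loop rather than the vectorized numpy approach the plugin is described as using.

```python
from itertools import combinations


def plan_cross_links(nodes, out_degree, similarity, threshold=0.70,
                     max_nodes=2000, max_edges=100_000):
    """Pick candidate edges: fewest-edge nodes first, no same-file pairs."""
    # Out-degree selection: sparsely connected nodes get priority.
    chosen = sorted(nodes, key=lambda n: out_degree.get(n["id"], 0))[:max_nodes]
    edges = []
    for a, b in combinations(chosen, 2):
        if a["source_file"] == b["source_file"]:
            continue  # same-source pairs are skipped
        if similarity(a, b) >= threshold:
            edges.append((a["id"], b["id"]))
            if len(edges) >= max_edges:
                break  # edge cap reached for this cycle
    return edges
```

Sorting by out-degree before applying the per-cycle node cap is what lets poorly connected nodes catch up over successive cycles.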

GC grace period and threshold tuning (May 22, 2026) — v0.10.1

  • GC grace period (GC_GRACE_DAYS = 7): garbage collection now skips nodes created within the last 7 days, preventing systematic memory loss for single-reference knowledge (reported by @boriken72).
  • Decayed-node filter in _keyword_search(): keyword fallback now excludes decayed nodes.
  • Default CROSS_LINK_THRESHOLD raised from 0.70 to 0.78.
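
A grace period like GC_GRACE_DAYS amounts to a cutoff timestamp: anything created after the cutoff is exempt from collection. The sketch below assumes each node carries a timezone-aware created_at datetime, which is an illustrative choice rather than the plugin's actual storage format.

```python
from datetime import datetime, timedelta, timezone

GC_GRACE_DAYS = 7


def gc_candidates(nodes, now=None):
    """Return nodes old enough to be considered for garbage collection."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=GC_GRACE_DAYS)
    # Nodes newer than the cutoff are protected by the grace period.
    return [node for node in nodes if node["created_at"] <= cutoff]
```

Passing now explicitly makes the cutoff deterministic in tests.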

Cron-based scheduling and dynamic embeddings (May 27, 2026) — v0.10.2

  • Sleep cycle migrated from on_session_end() to Hermes cron job — the plugin registers a no_agent cron job at initialize() running on a configurable schedule (default: every 12 hours). /new returns instantly.
  • is_available() contract restored: no longer probes hermes_constants.get_hermes_home() — returns ContextRetriever is not None when _hermes_home is unknown.
  • PRAGMA busy_timeout=5000 added to both SQLite connections in sleep_refactor.py to prevent database is locked failures.
  • fcntl.flock(LOCK_EX | LOCK_NB) advisory lock prevents concurrent sleep cycles across Hermes processes.
  • vec_embeddings dimension resolved dynamically: replaced hardcoded float[384] with dynamic detection from the configured embedding model, enabling thenlper/gte-large (1024-dim) support.
  • Embedding model propagated to upstream: configured embedding_model passed to end_session() and embed_nodes() calls.
  • Repo-root cruft removed (.planning/ directory and None file).
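
The two concurrency guards in this release, a non-blocking advisory file lock and a SQLite busy timeout, can be sketched with the standard library as shown below. The function names and the lock file path are assumptions, and fcntl is only available on Unix-like systems.

```python
import fcntl
import os
import sqlite3


def try_acquire_sleep_lock(lock_path):
    """Return an fd holding an exclusive lock, or None if already held."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd  # closing the fd releases the lock


def open_db(db_path):
    """Open SQLite so a locked database is retried for up to 5 seconds."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA busy_timeout=5000")
    return conn
```

A caller that gets None back can simply skip this sleep cycle, since another process is already running one.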

Ongoing refinements (June 2026)

  • June 1 — Config fix: resolve provider base_url from _PROVIDER_BASE_URLS map.
  • June 5 — Sleep cycle cron job fix: ensure it runs despite lifecycle churn.
  • June 7 — sqlite-vec made a standard (not optional) dependency. Root __init__.py made a proper re-export shim. Stale state self-healing on startup with interpreter-exit race handling.
  • June 8 — Stale test for sqlite-vec unavailability removed.

Longest-standing features

These features have been in the codebase since the earliest versions:

| Feature | Since | Description |
| --- | --- | --- |
| CashewMemoryProvider class | v0.1.0 (April 19) | Core provider implementation; survived multiple rewrites |
| Dual layout (__init__.py root re-export + nested module) | v0.1.0 (April 19) | Flat-entry and nested loader paths |
| plugin.yaml manifests (root + nested) | v0.1.0 (April 19) | Hermes plugin registration |
| Config schema with env overrides | v0.2.0 (April 21) | 31-key config originally, now 37 keys |
| ContextRetriever-based retrieval | v0.1.0 (April 20) | Cashew's two-step retrieval API |
| Sync worker with bounded queue | v0.1.0 (April 20) | Non-blocking sync_turn() pattern |
| Profile isolation via hermes_home | v0.1.0 (April 20) | All paths scoped under hermes_home |

Deprecated and removed features

| Feature | Removed in | Replacement |
| --- | --- | --- |
| Custom _ensure_db_schema | v0.3.0 (May 12) | Upstream core.db.ensure_schema() |
| Custom retrieval pipeline (_retrieve_with_vec, _retrieve_bfs, _retrieve_keyword, _score_nodes, etc.) | v0.3.0 (May 12) | Upstream retrieve_recursive_bfs() |
| confidence_threshold config key | v0.3.0 (May 12) | Removed (upstream dropped the column) |
| Upstream run_sleep_cycle() (O(N²)) | v0.8.0 (May 14) | Refactored sleep_refactor.py |
| Sync queue drain in on_session_end() | v0.10.0 (May 17) | WAL mode handles concurrent writes |
| Sleep cycle in on_session_end() | v0.10.2 (May 27) | Hermes cron job (no_agent cron) |
| Hardcoded float[384] embedding dimension | v0.10.2 (May 27) | Dynamic dimension detection |
| Hardcoded all-MiniLM-L6-v2 embedding model | v0.10.2 (May 27) | Configurable embedding_model |

Major rewrites

v0.3.0 — "Stop Vibe-Coding, Start Integrating" (May 12, 2026)

The most consequential rewrite. More than 150 lines of custom retrieval code were gutted and replaced with thin delegation to upstream cashew-brain. The project pivoted from building its own retrieval pipeline to being a thin Hermes-to-Cashew adapter.

v0.8.0 — Sleep cycle refactored (May 14, 2026)

A ground-up rewrite of the sleep cycle replaced upstream's O(N²) implementation (hours at 7K nodes) with a vectorized, numpy-based nine-phase pipeline (~4 seconds at 7K nodes). The rewrite introduced sleep_refactor.py as the new core module.

v0.10.0 — Background dream dispatch (May 17, 2026)

Dream generation and orphan embedding moved to a daemon thread, cutting /new session latency from ~118s to ~20s.
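
Moving slow phases onto a daemon thread so the caller returns immediately can be sketched as follows. The helper name, the thread name, and the choice to log a failure and continue with the next phase are assumptions for illustration.

```python
import threading


def dispatch_in_background(*phases):
    """Run the given callables in order on a daemon thread; return it."""

    def run():
        for phase in phases:
            try:
                phase()
            except Exception as exc:
                # Assumed policy: report the failure and keep going.
                print(f"background phase failed: {exc!r}")

    thread = threading.Thread(target=run, name="dream-dispatch", daemon=True)
    thread.start()
    return thread
```

A daemon thread does not keep the interpreter alive, so work still in flight at exit can be cut short; that trade-off buys the lower session latency.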

v0.10.2 — Sleep cycle migration to cron (May 27, 2026)

The sleep cycle was moved out of session lifecycle hooks entirely and into a Hermes cron job running every 12 hours. /new became effectively instant.

Growth trajectory

| Date | Version | Milestone | Est. codebase size |
| --- | --- | --- | --- |
| April 19 | v0.1.0 | Initial scaffolding, stub provider, tests | ~500 lines |
| April 20 | v0.1.0 | Sync worker, tool schemas, prefetch, config | ~1,500 lines |
| April 21 | v0.2.0 | Schema management, sqlite-vec, 31-key config | ~3,000 lines |
| May 12 | v0.3.0 | Upstream integration (gutting custom retrieval) | ~2,800 lines (down) |
| May 12 | v0.4.0–v0.7.4 | LLM integration, privacy, think/sleep cycles | ~4,000 lines |
| May 14 | v0.8.0 | Refactored sleep cycle (sleep_refactor.py) | ~5,500 lines |
| May 15 | v0.9.0 | First-load bootstrap, on_pre_compress | ~6,500 lines |
| May 17 | v0.10.0 | Cross-source linking, queue_prefetch, background dreams | ~8,000 lines |
| May 27 | v0.10.2 | Cron-based scheduling, dynamic embeddings | ~9,000 lines |
| June 10 | current | Ongoing refinements | 9,507 lines |

The codebase grew from ~500 lines to over 9,500 lines in just 52 days, with one significant contraction at v0.3.0 when 150+ lines of custom retrieval were removed in favor of upstream delegation. The test suite grew alongside it and now accounts for ~5,700 lines (about 60% of the Python codebase).
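
The 52-day span and the 60% test share can be checked with a few lines of arithmetic, using the dates and line counts quoted on this page.

```python
from datetime import date

# Repository creation (April 19, 2026) to the page's edit date (June 10, 2026).
span_days = (date(2026, 6, 10) - date(2026, 4, 19)).days

# Approximate test-suite lines divided by total Python lines.
test_share = 5700 / 9507

print(span_days, f"{test_share:.0%}")
```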
