All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
-
Multi-client MCP config-pack generator (#659). Added contextweaver mcp generate-configs to render client recipe files (copilot_mcp.json, cursor_mcp.json, claude_desktop_config.json, claude_code_mcp.json) from one canonical mcp serve --config input. The command reuses mcp serve config validation, supports target selection, fails closed on unknown/invalid target values, blocks overwriting unless --force is given, emits target-specific compatibility warnings, and produces deterministic JSON artifacts suitable for committing. Added CLI tests for generation behavior and fixture-shape pinning.
-
Supply-chain & security CI hardening (#443, #689, #690, #691, #692, #468, #552). A coordinated security-posture pass under the supply-chain hardening umbrella (#443):
  - CodeQL code scanning (.github/workflows/codeql.yml) with the security-extended query pack, on PR, main, and a weekly schedule (#689).
  - pip-audit dependency scanning (.github/workflows/pip-audit.yml): gating on the core runtime dependency set, report-only for the heavier dev extra (#689).
  - OpenSSF Scorecard analysis (.github/workflows/ossf-scorecard.yml) with results published to code scanning and a README badge; the OpenSSF Best Practices badge application is tracked as a manual step (#552).
  - Dependabot (.github/dependabot.yml) weekly pip and github-actions updates, grouped to limit noise (#443).
  - Release-integrity gate in publish.yml (#468): a verify job asserts the release tag matches the pyproject.toml version, runs the test suite, and runs twine check on the built distribution before the publish job runs.
  - Build-provenance attestations for released artifacts via actions/attest-build-provenance (#690).
  - security-policy-check gate (scripts/check_security_policy.py, wired into make ci and ci.yml): fails when SECURITY.md's supported-version table drifts from the package version or links a missing doc. Refreshed the supported series to 0.16.x (#691).
  - Security tooling runbook (docs/security_tooling.md) documenting the triage SLA, ownership, and the false-positive exception process (#692).
-
scripts/check_readme_version.py gained a --print-version flag so the release-integrity gate reads the package version through the same single source of truth as the drift guard.
-
SSE transport for MCP gateway and proxy (#694). McpGatewayServer.run_sse() and McpProxyServer.run_sse() bind the existing server surfaces onto an HTTP/SSE endpoint using the MCP SDK's SseServerTransport and uvicorn. The mcp serve CLI gains --transport stdio|sse, --host, and --port (config-file compatible). Default remains stdio for backward compatibility; SSE is opt-in. SSE enables the MCP SDK's DNS-rebinding protection (off by default in the SDK), scoping the Host/Origin allowlist to the bound --host/--port. A transport & compatibility matrix is added to docs/integration_mcp.md documenting tested client versions (Claude Desktop, Claude Code, VS Code Copilot, Cursor) and their supported transports. No new dependencies: starlette and uvicorn are already pulled in by the MCP SDK.
-
Memory consolidation engine (#498, #679, #680, #681, #682, #683). New contextweaver.context.consolidate(...) distills episodic memory into durable, deduplicated, provenance-stamped facts. The deterministic core clusters similar episodes (cluster_episodes, #679), promotes clusters that meet ConsolidationPolicy thresholds (min_occurrences/min_sessions) into PromotedFact records carrying full source provenance and the maximum source sensitivity (promote_clusters, #680), and reports entries past the decay horizon without deleting them; the stores are append-only (decay_episodes/decay_facts, #681). An optional, fail-closed call_fn may refine a fact's canonical text, rejecting any completion that introduces ungrounded tokens or a negation absent from the source notes (#682). consolidate(..., apply=True) upserts the promoted facts with content-addressed IDs, so re-running over an unchanged store is a no-op (idempotent). Results are returned as a ConsolidationReport (serialisable via to_dict/from_dict). New public surface in contextweaver.context: consolidate, cluster_episodes, promote_clusters, decay_episodes, decay_facts, ConsolidationPolicy, ConsolidationReport, PromotedFact, EpisodeCluster. A new contextweaver consolidate CLI subcommand runs the pipeline over JSON-serialised stores. Quality is measurable offline via contextweaver.eval.evaluate_consolidation → ConsolidationEvalReport (precision / coverage + dedup ratio, #683). Pure stdlib; no new dependency.
-
Package metadata drift guard (#473). The existing readme-version-check now also verifies that Python version classifiers in pyproject.toml match the gating CI matrix, preventing PyPI metadata from lagging the tested support range. Package metadata now advertises Python 3.13 support, removes the long-expired no-op [cli] extra, and drops reserved [ann]/[graph] extras that installed dependencies without activating any runtime code.
-
Routing-scale index cache + profiler (#543, #624, #685, #684, #686). New contextweaver.routing.RoutingIndexCache + CachedRetriever persist and reuse the fitted first-stage retriever index (the dominant cost of the first route() call on a large catalog), keyed by a deterministic corpus fingerprint. The cache has an in-process LRU layer (reuse across Router instances in one process, folding in the cross-call reuse of #543) and an optional on-disk layer (RoutingIndexCache(directory=...), deterministic JSON written atomically) that survives process restarts (#624). Opt in via Router(graph, items=items, retriever=CachedRetriever(TfIdfRetriever(), cache)); warm loads are byte-identical to a cold fit, so routing quality and determinism are unchanged. The cache never raises into the routing path: a corrupt or version-incompatible payload is treated as a miss and re-fitted. Added benchmarks/routing_scale.py + make benchmark-routing-scale (non-gating), which profiles routing up to 10k tools and writes the bottleneck report at docs/benchmarks/routing-scale.md (#684), and a routing-quality guardrail suite (tests/test_routing_quality_guardrails.py) pinning the recall floor and cache transparency over the gold set (#686). No new dependency. See docs/benchmarks/routing-scale.md.
-
HTTP sidecar: language-agnostic route/compact API (#427, #674/#675/#676/#677/#678). New contextweaver serve-api exposes the deterministic router and the context firewall over a small, versioned HTTP/JSON API so non-Python agents can use them without embedding Python: POST /v1/route (tool routing), POST /v1/compact (tool-result compaction), and an unauthenticated GET /v1/health liveness probe. Built on the Python standard library (http.server) with no new dependency, reusing the same sync Router and compact_tool_result facade as the in-process API. New public surface in contextweaver.adapters: SidecarApp + SidecarConfig (transport-free dispatch with optional bearer-token auth, per-client rate limiting, body-size cap, and typed SidecarError responses), serve_api/make_sidecar_server, and the RouteRequest/RouteResponse/CompactRequest/CompactResponse contract dataclasses (SIDECAR_API_VERSION). Versioned JSON Schemas + example payloads ship under schemas/sidecar/v1/; example clients live in examples/sidecar_demo.py (Python, also runs under make example) and examples/sidecar/client.ts (TypeScript, dependency-free). A non-gating make sidecar-smoke CI step drives the transport in-process. See docs/sidecar.md.
-
contextweaver verify subcommand (#657). New non-gateway verification mode giving library-first adopters a fast, deterministic, network-free smoke test of core functionality. Checks import path, ContextManager instantiation, a minimal context build, token counting, and routing. Outputs a Rich table for humans (--json for CI/automation) with a clear pass/fail exit code and actionable fix hints. Documented in docs/quickstart.md.
-
Puppetmaster integration pattern (#416). New docs/integration_puppetmaster.md shows how contextweaver consumes Puppetmaster-style job artifacts, worker summaries, logs, and follow-up reads without dumping raw artifacts into model context. Covers artifact summary ingestion, drilldown via handles/selectors, route/answer phase budgeting over job history, and explicit boundaries (in: context consumer; out: job supervisor / worker orchestrator).
-
Gateway resource & prompt runtime (#669 / #670). New PrimitiveGatewayRuntime (+ the PrimitiveUpstream protocol) extends the gateway's bounded-choice routing and context-firewall treatment from tools to MCP resources and prompts (#555). Resources/prompts are modelled as SelectableItems (kind="resource"/"prompt") so they reuse the routing Catalog/Router/ChoiceCard machinery; each kind routes in its own index while sharing one ContextManager (artifact store + firewall + tool_view) with the tool runtime. New converters mcp_resource_to_selectable/mcp_prompt_to_selectable and read/get envelope wrappers live in contextweaver.adapters.mcp_primitives; declared prompt arguments become an args_schema so prompt_get validates inputs like tool_execute. The SelectableItem/ChoiceCard kind set now includes resource and prompt. Four new gateway meta-tools, resource_browse/resource_read/prompt_browse/prompt_get (contextweaver.adapters.mcp_gateway_primitives), expose the bounded-choice surface, and McpGatewayServer advertises and dispatches them over stdio when constructed with a primitive_runtime=.
-
Unified cross-primitive identity & collision policy (#671). New contextweaver.routing.primitive_id is the single source of truth for identifying MCP tools, resources, and prompts in one shared Catalog (groundwork for routing resources/prompts through the gateway, #555). Tools keep their bare canonical tool_id; resources and prompts get disjoint-by-construction ids via a reserved kind:: prefix (resource::fs:readme#ab12cd34, prompt::gh:summarize#deadbeef). Stable per-kind shape hashes (compute_resource_hash8 over the URI; compute_prompt_hash8 over name + sorted argument names) and a deterministic ~N collision policy (resolve_collisions) round out the surface. Documented in docs/gateway_spec.md §9.
-
Resources/prompts reachable end-to-end via the gateway (#669 / #670 / #672 / #673). Three concrete PrimitiveUpstream adapters now ship in contextweaver.adapters.mcp_primitive_upstream: StubPrimitiveUpstream (in-process, for tests/CLI/air-gapped CI), McpClientPrimitiveUpstream (wraps a connected MCP ClientSession), and MultiplexPrimitiveUpstream (multi-server fan-out). They mirror the tool mcp_upstream trio; per the protocol contract they raise transport errors for the runtime to classify. contextweaver mcp serve --gateway now exposes the four resource/prompt meta-tools when the catalog is a snapshot object declaring resources/prompts alongside tools (tools-only catalogs stay unchanged), sharing the tool runtime's ContextManager. PrimitiveGatewayRuntime gains resource_ids()/prompt_ids() accessors mirroring ProxyRuntime.list_tool_ids(). The mixed-primitive context-shaping benchmark is runnable via make benchmark-primitives, and docs/gateway_spec.md §9.4–§9.5 document the request flows and the serve/catalog wiring. Malformed snapshot-catalog primitive entries (non-dict, or missing the required uri/name identity field) are now skipped with a warning instead of being silently dropped, so a mistyped resource/prompt entry surfaces in the serve logs.
-
Stable error codes + remediation hints (#635). Every ContextWeaverError subclass now carries a frozen, machine-readable code (e.g. CW_CONFIG) so programs can branch on failures without string-matching, plus an optional hint (with a class-level default_hint fallback). str(exc) renders [code] message (hint: …), so CLI error output surfaces both automatically. Codes are golden-listed in tests/test_exceptions.py (a rename or a code-less new exception fails CI).
-
Error reference page (#637). New docs/errors.md documents every exception (stable code, raising modules, common causes, and the fix) with a code index table; added to the mkdocs nav, cross-linked from the troubleshooting guide, and included in llms.txt/llms-full.txt.
-
Runtime deprecation machinery (#517). New internal contextweaver._deprecation module (warn_deprecated(...), a @deprecated decorator, and a single registry surfaced via active_deprecations()) emits DeprecationWarnings with consistent, actionable wording ("deprecated since X, removal in Y, use Z instead"). Every message starts with contextweaver deprecation: so CI can escalate the project's own deprecations to errors (new filterwarnings entry in pyproject.toml) without touching third-party warnings. Documented in docs/stability.md and the new Upgrading page, with a "Deprecating an API" workflow in docs/agent-context/workflows.md.
-
Upgrade guide (#616). New docs/upgrading.md states the 0.x versioning and deprecation policy and carries the live inventory of active deprecations plus per-release "action required" notes.
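
The [code] message (hint: …) rendering described in the stable-error-codes entry above can be illustrated with a small stand-alone sketch. The class names, the CW_UNKNOWN fallback code, and the hint text below are hypothetical stand-ins, not the package's actual exception classes; only the CW_CONFIG code and the rendering shape come from this changelog.

```python
class SketchError(Exception):
    """Hypothetical base class standing in for a coded error hierarchy."""

    code = "CW_UNKNOWN"  # hypothetical fallback code, not taken from the changelog
    default_hint = ""

    def __init__(self, message, hint=None):
        super().__init__(message)
        self.message = message
        # An explicit hint wins; otherwise fall back to the class-level default.
        self.hint = hint if hint is not None else self.default_hint

    def __str__(self):
        rendered = f"[{self.code}] {self.message}"
        if self.hint:
            rendered += f" (hint: {self.hint})"
        return rendered


class SketchConfigError(SketchError):
    code = "CW_CONFIG"
    default_hint = "check the config file for typos"  # hypothetical hint text


def exit_status(exc):
    """Branch on the stable code instead of matching message text."""
    return 2 if exc.code == "CW_CONFIG" else 1


print(SketchConfigError("unknown key 'prot'"))
```

Because callers compare against the code attribute, the human-readable message can be reworded without breaking programs that branch on failures.
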
- Contributor workflow & build-tooling hardening (#705, #706, #709, #710, #711, #712).
  - Makefile targets now invoke $(PYTHON) (default python3, overridable via make <target> PYTHON=...) so the documented commands run on environments that ship only python3 (#712).
  - New make floor-deps and make tool-smoke targets (bundled as make ci-full) reproduce locally the two gating CI jobs make ci cannot mirror (lowest-direct dependency-floor resolution and the wheel / entry-point smoke); only the macOS tool-run-smoke cell stays CI-only (#710).
  - A .gitattributes marks CHANGELOG.md, llms.txt, and llms-full.txt as merge=union so concurrent PRs stop hand-resolving conflicts in these append-only / generated files; the drift-check gate still verifies the committed output on main (#709).
  - docs/agent-context/labels.md is rewritten to match the live label taxonomy (priority:, complexity:, area/, type: families), and the docs_improvement/integration_request issue templates now apply the canonical area/docs/integrations labels instead of the stale colon-prefixed forms (#711).
  - contextweaver verify pins heuristic_counter() in its manager and build checks, so the network-free guarantee no longer depends on ContextManager's default estimator (#705); the CLI failure path (non-zero exit + fix-hint rendering) is now covered by tests (#706).
-
Pre-1.0 legacy compatibility shims (#642). The following now emit a DeprecationWarning (behavior unchanged; nothing removed yet; see the Upgrading inventory for replacements and the 1.0 removal milestone):
  - RouteResult.debug_trace → use RouteResult.trace.
  - RouteTrace.to_legacy_dicts() → use the structured RouteTrace fields.
  - the Router(scorer=...) constructor argument → use retriever= or scorer_backend=.

The contextweaver.ToolCard / contextweaver.types.ToolCard alias (→ use SelectableItem), ChoiceGraph.build_meta, and the pre-#190 ArtifactRef write path are recorded as documentation-only deprecations in the upgrade guide. ToolCard stays a plain alias because the only modules it could warn from (pure-data types.py, re-export-only __init__.py) are barred from side effects by hard invariants; the others remain on internal serialization paths.
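
To make the prefix-based escalation concrete, here is a minimal sketch using only the standard warnings module. The helper names (warn_deprecated_sketch, raises_under_escalation) and the "X" since-version are hypothetical; the real warn_deprecated signature is not given in this changelog, and only the message prefix and wording pattern are taken from it.

```python
import re
import warnings

PREFIX = "contextweaver deprecation:"


def warn_deprecated_sketch(name, since, removal, replacement):
    """Hypothetical helper: emit a prefixed DeprecationWarning."""
    warnings.warn(
        f"{PREFIX} {name} is deprecated since {since}, removal in {removal}, "
        f"use {replacement} instead",
        DeprecationWarning,
        stacklevel=2,
    )


def raises_under_escalation(fn):
    """Run fn with only prefixed deprecations escalated to errors."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # everything else stays quiet
        # The message pattern is matched against the start of the warning
        # text, so only the project's own deprecations become exceptions.
        warnings.filterwarnings(
            "error", message=re.escape(PREFIX), category=DeprecationWarning
        )
        try:
            fn()
        except DeprecationWarning:
            return True
    return False


def own_call():
    warn_deprecated_sketch("RouteResult.debug_trace", "X", "1.0", "RouteResult.trace")


def third_party_call():
    warnings.warn("other_lib.old_call is deprecated", DeprecationWarning)


print(raises_under_escalation(own_call), raises_under_escalation(third_party_call))
```

The later filterwarnings call takes precedence over the earlier blanket filter, which is why the prefixed warning is raised while the third-party one is ignored.
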
- Binary MCP resource reads are no longer corrupted (#671 review). mcp_resource_read_to_envelope now base64-decodes a resource part's blob back to its original bytes before persisting it, instead of storing the base64 text bytes, so tool_view drilldown on real binary resources stays byte-accurate. Malformed (non-base64) blobs fall back to their raw bytes.
- *_browse rejects invalid top_k cleanly (#671 review). PrimitiveIndex.browse now validates top_k and returns a structured GatewayError(ARGS_INVALID) for non-integer or non-positive values, instead of letting a bad type reach make_choice_cards and raise TypeError across the meta-tool boundary.
- Clarified collision-policy determinism & ~N id status (#671 review). resolve_collisions docs and docs/gateway_spec.md §9 now state the assignment is deterministic for a given catalog order (index-based, not order-independent), and that the ~N-suffixed form is an opaque catalog key outside the §1.1 grammar (it does not round-trip through parse_tool_id). Collision tests now use canonical 8-hex-char ids.
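
The blob-decoding fix in the first item above can be sketched as follows. This is an illustration, not the body of mcp_resource_read_to_envelope: the function name decode_blob_sketch is hypothetical, and treating "raw bytes" as the UTF-8 encoding of the blob text is an assumption.

```python
import base64
import binascii


def decode_blob_sketch(blob: str) -> bytes:
    """Decode a base64 blob to its original bytes, falling back to raw bytes."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(blob, validate=True)
    except (binascii.Error, ValueError):
        # Assumption: a malformed blob is kept as the UTF-8 bytes of its text.
        return blob.encode("utf-8")


png_header = b"\x89PNG\r\n"
encoded = base64.b64encode(png_header).decode("ascii")
print(decode_blob_sketch(encoded) == png_header)
```

Storing the decoded bytes rather than the base64 text is what keeps a later byte-range drilldown aligned with the original binary content.
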
-
Structured route→select contract and shortlist composition controls (#515, #479, #516, #509). A focused hardening of the boundary where a model picks a tool from a routed shortlist:
  - Constrained-selection schemas (#515). RouteResult.selection_schema(...) (and contextweaver.selection_schema) renders the routed candidate IDs as a JSON-Schema enum, with json_schema/openai/anthropic provider variants, so a model can be forced to pick only a routed tool_id at generation time.
  - Validated selection contract (#479). RouteResult.validate_selection(...) (and contextweaver.validate_selection) returns a typed SelectionValidation (accepted/repaired/rejected) for a returned ID, with deterministic repair (whitespace → case-fold → unique prefix; ambiguous matches are rejected, never guessed). RouteResult.to_routing_decision now validates the selection, stores the resolved canonical ID, and records the outcome under metadata["contextweaver"]["selection"].
  - First-class, capping-immune safety field (#516). ChoiceCard gains a safety field (""/"read_only"/"destructive") derived from the item's safety tags, and the §2.1 five-tag cap now reserves destructive/read-only tags first so a safety marker can no longer be alphabetically evicted from the model-facing surface.
  - Shortlist composition controls (#509). Router.route(...) accepts pin_ids (always-include items that occupy the first slots regardless of relevance) and namespace_quota (a per-namespace cap on non-pinned items), via routing.filters.compose_shortlist. Unset, composition is byte-identical to the previous top_k truncation.
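
As a rough illustration of the route→select boundary described above, the sketch below renders candidate IDs as a JSON-Schema enum and applies a repair chain in the stated order (whitespace, then case-fold, then unique prefix). The function names, the tuple return shape, and the sample tool IDs are hypothetical; the package's SelectionValidation type and exact matching rules are not reproduced here.

```python
from typing import Optional


def selection_schema_sketch(candidates: list) -> dict:
    """Constrain a model's answer to one of the routed candidate IDs."""
    return {"type": "string", "enum": list(candidates)}


def validate_selection_sketch(returned: str, candidates: list) -> tuple:
    """Return (status, canonical_id) with status accepted/repaired/rejected."""
    if returned in candidates:
        return ("accepted", returned)

    # Step 1: strip surrounding whitespace.
    stripped = returned.strip()
    if stripped in candidates:
        return ("repaired", stripped)

    # Step 2: case-insensitive exact match; ambiguity is rejected.
    folded = stripped.casefold()
    exact = [c for c in candidates if c.casefold() == folded]
    if len(exact) == 1:
        return ("repaired", exact[0])
    if len(exact) > 1:
        return ("rejected", None)

    # Step 3: unique prefix; an empty or ambiguous prefix is never guessed.
    canonical: Optional[str] = None
    prefixed = [c for c in candidates if folded and c.casefold().startswith(folded)]
    if len(prefixed) == 1:
        canonical = prefixed[0]
    return ("repaired", canonical) if canonical else ("rejected", None)


ids = ["github:create_issue", "github:close_issue", "fs:read_file"]
print(validate_selection_sketch("  FS:Read_File\n", ids))
```

Rejecting ambiguous matches keeps the repair deterministic: the same returned string against the same shortlist always resolves to the same ID or to a rejection.
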
-
Source-to-catalog adapters: OpenAPI, Agent Skills, and Microsoft Agent Framework (#546, #545, #430). Three new adapters built on the shared conversion toolkit (adapters/_framework_common.py), extending routing to capability sources beyond agent-framework tools:
  - OpenAPI adapter (#546). adapters.openapi converts an OpenAPI 3.0/3.1 document (dict, JSON, or YAML) into a SelectableItem catalog, one item per operation (openapi_operation_to_selectable, openapi_spec_to_catalog, load_openapi_catalog). parameters + requestBody compose into a single args_schema; local $refs resolve (external refs raise); HTTP methods map to read-only / destructive safety tags mirroring the MCP adapter. contextweaver routes; it never makes the HTTP call. No extra required.
  - Agent Skills adapter (#545). adapters.agent_skills loads SKILL.md skill directories into the catalog as kind="skill" items using only their frontmatter (skill_to_selectable, load_skills_catalog); SkillBodySource hydrates the full Markdown body and bundled resources lazily on selection, mirroring routing.hydration.SchemaSource. No extra required.
  - Microsoft Agent Framework adapter (#430). adapters.agent_framework converts AIFunction tools to a catalog and thread ChatMessages to ContextItems with function-call → result parentage (agent_framework_tools_to_catalog, from_agent_framework_thread); [agent-framework] extra for live loading.
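
One plausible way to compose an operation's parameters and requestBody into a single argument schema, as the OpenAPI item above describes, is sketched below. Everything here is an assumption made for illustration: the function names, placing the JSON body under a "body" property, and the specific method-to-tag mapping are not taken from adapters.openapi.

```python
READ_ONLY_METHODS = {"get", "head", "options"}  # assumed mapping
DESTRUCTIVE_METHODS = {"delete"}  # assumed mapping


def operation_args_schema_sketch(operation: dict) -> dict:
    """Merge parameters and a JSON requestBody into one object schema."""
    properties = {}
    required = []
    for param in operation.get("parameters", []):
        properties[param["name"]] = param.get("schema", {})
        if param.get("required"):
            required.append(param["name"])

    body = operation.get("requestBody") or {}
    body_schema = body.get("content", {}).get("application/json", {}).get("schema")
    if body_schema is not None:
        # Assumption: the request body is exposed as a single "body" argument.
        properties["body"] = body_schema
        if body.get("required"):
            required.append("body")

    return {"type": "object", "properties": properties, "required": sorted(required)}


def safety_tag_sketch(method: str) -> str:
    """Map an HTTP method to a safety tag; other methods get no tag here."""
    lowered = method.lower()
    if lowered in READ_ONLY_METHODS:
        return "read_only"
    if lowered in DESTRUCTIVE_METHODS:
        return "destructive"
    return ""


operation = {
    "parameters": [
        {"name": "repo", "in": "path", "required": True, "schema": {"type": "string"}},
        {"name": "dry", "in": "query", "schema": {"type": "boolean"}},
    ],
    "requestBody": {
        "required": True,
        "content": {"application/json": {"schema": {"type": "object"}}},
    },
}
print(operation_args_schema_sketch(operation)["required"])
```

Sorting the required list keeps the generated schema deterministic regardless of the order in which parameters appear in the document.
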
-
Framework adapter expansion + shared conversion toolkit (#454, #502, #501, #547, #401). A coherent pass over the adapters/ tool-catalog layer:
  - Shared conversion toolkit (#454). New private adapters/_framework_common.py centralises the mechanics the framework adapters previously each re-implemented: infer_namespace, strip_namespace_prefix, coerce_schema_dict, collect_tags, require_name_description. The CrewAI, Agno, smolagents, Pydantic AI, and ChainWeaver adapters now delegate to it with byte-identical behavior, so a convention change is one edit instead of up to five.
  - LangChain adapter (#502). adapters.langchain converts BaseTool instances (or the plain-dict shape) into a SelectableItem catalog (langchain_tool_to_selectable, langchain_tools_to_catalog, load_langchain_catalog); [langchain] extra for live loading.
  - OpenAI Agents SDK adapter (#501). adapters.openai_agents converts function tools to a catalog and run items to ContextItems with tool-call → tool-output parentage (openai_agents_tools_to_catalog, from_openai_agents_run); [openai-agents] extra.
  - Google ADK adapter (#547). adapters.google_adk converts ADK tools to a catalog and Session.events to ContextItems with function_call → function_response parentage (google_adk_tools_to_catalog, from_google_adk_session); [google-adk] extra.
  - Integration table honesty (#401). The README Framework Integrations tables gain a Code adapter column distinguishing frameworks with an importable adapter (and its extra) from guide-only entries.
-
Context-engine tuning knobs: rendering, kinds, scoring, and overflow (#410, #411, #487, #510). A coherent pass over the context build pipeline's selection / scoring / rendering / budget surface, all opt-in with byte-identical defaults:
  - Caller-owned rendering (#410). ContextManager.build(...)/build_sync(...) accept a renderer: Callable[[list[ContextItem]], str] hook. When supplied, the caller owns the entire prompt layout (the section renderer, header, footer, and episodic/fact assembly are skipped) while budget-aware selection and pack.stats still run. A ready-made contextweaver.context.passthrough_renderer joins items by raw text.
  - Retrieval/RAG kind + presentation override (#411). New ItemKind.retrieved_doc gives retrieved/RAG payloads a first-class home distinct from authored doc_snippets. A per-item metadata["section"] override decouples a prompt section label from the filtering kind, so presentation can change without changing per-phase filtering.
  - Phase-aware scoring weights + kind priority (#487). ScoringConfig gains kind_priority (override the built-in item-kind priority table, validated to [0, 1]) and phase_overrides (per-Phase weight configs; resolution: phase override → base config → built-ins, resolved one level deep; a per-phase override that itself defines phase_overrides is rejected with ConfigError). explain=True surfaces the resolved weights via ContextBuildExplanation.resolved_weights.
  - Budget-overflow policy (#510). ContextPolicy.overflow_action ("drop" default / "warn" / "raise") plus an optional overflow_raise_kinds scope turn silent budget drops into a logged warning or a BudgetOverflowError (carrying the would-be BuildStats), so a dropped mandatory item surfaces as a debuggable error instead of bad output.
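
The weight-resolution order stated above (phase override → base config → built-ins, one level deep) can be sketched with plain dictionaries. The weight names, their values, the per-key fallback, and the use of ValueError in place of ConfigError are all assumptions for illustration; the real ScoringConfig fields are not listed in this changelog.

```python
# Hypothetical built-in weights; the real table is not given in the changelog.
BUILTIN_WEIGHTS = {"recency": 0.3, "relevance": 0.5, "kind": 0.2}


def resolve_weights_sketch(base: dict, phase_overrides: dict, phase: str) -> dict:
    """Resolve weights per key: phase override, then base config, then built-ins."""
    override = phase_overrides.get(phase, {})
    if "phase_overrides" in override:
        # One level deep only: a nested override table is rejected.
        raise ValueError(f"phase override for {phase!r} may not define phase_overrides")

    resolved = dict(BUILTIN_WEIGHTS)  # lowest precedence
    resolved.update(base)  # base config beats built-ins
    resolved.update(override)  # phase override beats both
    return resolved


base_config = {"recency": 0.4}
overrides = {"answer": {"relevance": 0.9}}
print(resolve_weights_sketch(base_config, overrides, "answer"))
```

A phase with no override simply resolves to the base config layered over the built-ins, which is what keeps an unconfigured setup identical to the defaults.
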
-
CI gate consolidation and expansion (#522, #518, #456, #474, #526, #539). A coherent pass over the repo's generated-artifact / convention gating infrastructure:
  - Unified drift harness (#522). A shared golden-file helper (scripts/_golden.py) now backs every generated-artifact check, and a single make drift-check / scripts/drift_check.py registry runs them all (schemas, scorecards, recorded demos, llms.txt, the context-rot SVG, and the new public-API manifest). Adding the next generated artifact costs one registry entry instead of a fresh copy of the render/compare/exit logic. Every registered generator returns a uniform exit code on a missing input (the gateway-scorecard generator no longer raises SystemExit), so the harness aggregates a missing artifact consistently instead of aborting the whole run.
  - Public-API manifest (#518). api/public_api.txt is a committed, signature-level snapshot of the public surface, regenerated by make api and gated by make api-check (inside make drift-check), so every public API addition, removal, or signature change is an explicit, reviewable diff.
  - Module-size gate (#456). make module-size-check mechanically enforces the documented ≤300-line convention: new non-exempt modules must stay under the limit, and pre-existing oversized modules are frozen at a grandfathered baseline (scripts/module_size_baseline.json) that may shrink but not grow.
  - Doc-snippet execution (#526). make doc-snippets-check extracts and runs the Python blocks in README.md and a curated docs allowlist, so the first code an adopter copies is guaranteed to run against the current API. Illustrative blocks opt out with a <!-- snippet: skip --> marker.
  - examples/ + scripts/ type-checked (#539). make type now runs mypy src/ examples/ scripts/, extending strict typing to the most-copied code and the gating CI scripts.
  - make ci ⇄ CI alignment + workflow hygiene (#474). make ci now runs the consolidated drift gate, module-size, doc-snippet, and README-version checks, so a local pass mirrors the gating CI checks. CI gains workflow timeout-minutes, a PR concurrency group, and a docs-build job that gates mkdocs build on PRs (network-only weaver-conformance stays CI-only).
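
The render/compare/exit pattern behind a golden-file drift check, with a registry that aggregates results instead of aborting on the first problem, might look roughly like this. It is a sketch under assumptions: the function names, the registry shape, and the 0/1 exit codes are hypothetical and do not describe scripts/_golden.py or scripts/drift_check.py.

```python
import tempfile
from pathlib import Path


def check_golden_sketch(rendered: str, golden_path: Path) -> int:
    """Return 0 when the committed file matches, 1 on drift or a missing file."""
    if not golden_path.exists():
        return 1  # a missing artifact is reported, not raised
    committed = golden_path.read_text(encoding="utf-8")
    return 0 if committed == rendered else 1


def run_registry_sketch(registry: dict) -> tuple:
    """Run every registered check and aggregate; never stop at the first failure."""
    failures = [
        name
        for name, (render, golden_path) in registry.items()
        if check_golden_sketch(render(), golden_path) != 0
    ]
    return (1 if failures else 0, failures)


with tempfile.TemporaryDirectory() as tmp:
    fresh = Path(tmp) / "fresh.txt"
    fresh.write_text("v2\n", encoding="utf-8")
    stale = Path(tmp) / "stale.txt"
    stale.write_text("v1\n", encoding="utf-8")
    demo_registry = {
        "fresh": (lambda: "v2\n", fresh),
        "stale": (lambda: "v2\n", stale),
        "missing": (lambda: "v2\n", Path(tmp) / "absent.txt"),
    }
    demo_result = run_registry_sketch(demo_registry)

print(demo_result)
```

With this shape, registering another generated artifact means adding one dictionary entry rather than another copy of the compare-and-exit logic.
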
-
README roadmap drift guard (#531). The README now single-sources the framework integration table, marks the current roadmap row with the package version, and extends readme-version-check so stale roadmap current markers fail CI instead of drifting silently.
-
Gateway tool_execute dispatch hardening (#529, #512, #483, #482, #507). The gateway/proxy dispatch path gains four opt-in, deterministic controls, all inert by default so an unconfigured runtime behaves exactly as before:
  - Retry/backoff (#529). ProxyRuntime(retry_policy=RetryPolicy(...)) retries transient upstream failures (timeouts, connection errors) with bounded exponential backoff + optional jitter. Tool-level error results and non-retryable codes are never retried; the injected retry_sleep keeps the schedule testable.
  - Read-only response cache (#512). ProxyRuntime(result_cache=ToolResultCache(...)) memoises identical tool_execute calls for tools the upstream marks read-only (operator opt-in via an optional allow-list). TTL- and size-bounded (LRU), argument-order-insensitive keys, errors never cached, invalidated on catalog refresh. Read-only eligibility derives from the unverified upstream readOnlyHint, so the docstring and gateway_spec.md §4.5 now warn that read_only: true without an allow list trusts each upstream's self-declaration and recommend pairing it with an allow list for safety-critical tools.
  - Dry run (#483). tool_execute(..., dry_run=true) runs hydration, validation, and quota checks, then returns a DryRunReport (resolved tool_id, upstream name, validation outcome, unverified annotations, check list) without invoking upstream or writing artifacts. Invalid args still return ARGS_INVALID; dry runs never consume quota.
  - Rate limiting / quotas (#482). ProxyRuntime(rate_limiter=RateLimiter(...)) enforces per-session and per-minute invocation limits per meta-tool and per tool_id, returning a structured RATE_LIMITED error (with retry_after) on breach without dispatching upstream.
  - Catalog-refresh consistency (#507). Documented and regression-tested that refresh_catalog rebuilds all catalog-derived state (name index, validators, cache, graph) within one synchronous call, so a renamed/removed tool's stale tool_id yields a clean HYDRATE_FAILED (never a dispatch to the wrong upstream tool), and cross-upstream duplicate raw names collapse to the first.

All four controls are loadable from mcp serve --config via the retry, rate_limits, and cache blocks (validated at startup). New public symbols: RetryPolicy, RateLimit, RateLimitPolicy, RateLimiter, ToolResultCache, DryRunReport, call_with_retry (in contextweaver.adapters). See docs/gateway_spec.md §4.5–§4.6.
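
The retry behaviour described above (bounded exponential backoff, optional jitter, an injected sleep so the schedule is testable) can be sketched with the standard library alone. The names RetryPolicySketch and call_with_retry_sketch, their fields, and the default values are hypothetical; the real RetryPolicy and call_with_retry signatures are not documented in this changelog.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicySketch:
    """Hypothetical policy fields; not the package's RetryPolicy."""

    max_attempts: int = 3
    base_delay: float = 0.1
    max_delay: float = 2.0
    jitter: float = 0.0  # fraction of the delay added at random


def call_with_retry_sketch(fn, policy, sleep, rng=random.random,
                           retryable=(TimeoutError, ConnectionError)):
    """Call fn, retrying only transient errors with bounded backoff."""
    attempt = 0
    while True:
        attempt += 1
        try:
            return fn()
        except retryable:
            if attempt >= policy.max_attempts:
                raise  # budget exhausted: surface the last transient error
            # Exponential growth, capped so the wait stays bounded.
            delay = min(policy.max_delay, policy.base_delay * 2 ** (attempt - 1))
            delay += delay * policy.jitter * rng()
            sleep(delay)  # injected, so tests can record instead of waiting


recorded = []
outcomes = iter([TimeoutError("slow"), ConnectionError("reset"), "ok"])


def flaky():
    item = next(outcomes)
    if isinstance(item, Exception):
        raise item
    return item


result = call_with_retry_sketch(flaky, RetryPolicySketch(), recorded.append)
print(result, recorded)
```

Passing recorded.append as the sleep function is the testing trick the entry alludes to: the delays are captured for assertions and no real time elapses.
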
-
Persistent gateway sessions: mcp serve --state-dir (#511). contextweaver mcp serve accepts --state-dir DIR (and a state_dir config key) to wire the gateway's ContextManager with file-backed stores: {DIR}/events.sqlite3 (SqliteEventLog) and {DIR}/artifacts/ (JsonFileArtifactStore). Restarting against the same directory rehydrates prior event history and keeps previously issued artifact handles resolvable via tool_view; an unwritable directory fails fast with a clear startup error. Without the flag the gateway keeps its zero-config in-memory behaviour. Fixes a latent store-resolution bug where an empty persistent backend (which is falsy because it defines __len__) was silently replaced by an in-memory default; ContextManager now resolves stores with explicit is None checks.
-
Remote store backends: Redis & S3 (#426). New RedisEventLog and RedisArtifactStore (behind pip install 'contextweaver[redis]') and S3ArtifactStore (contextweaver[s3], works with AWS S3 / MinIO / R2 / GCS interop) give multi-process and long-lived gateways durable event/artifact storage beyond one process or disk. All three import their client library lazily (importing contextweaver.store never requires the extra) and are run through the #520 conformance kit (against fakeredis and moto in CI, no service container required). RedisArtifactStore supports an optional per-artifact TTL and namespace isolation; S3ArtifactStore supports a key prefix and a custom endpoint_url.
-
Stdlib SQLite episodic & fact stores (#496). New SqliteEpisodicStore and SqliteFactStore (contextweaver.store) give long-lived agents durable episodic/fact memory with zero external services, built on the same _sqlite_base scaffolding as SqliteEventLog. They are schema-versioned, re-instantiable against an existing file, and can share one database file with the event log (each store type tracks its own migrations under a distinct version table). SqliteEpisodicStore.search delegates ranking to a transient in-memory store, and SqliteFactStore keeps fact_id ordering, so swapping either backend for its in-memory counterpart leaves context-build output byte-identical. (apply_migrations/schema_version gained an optional version_table argument to support the shared-file layout.)
-
Async store protocol variants (#495). New AsyncEventLog, AsyncArtifactStore, AsyncEpisodicStore, and AsyncFactStore protocols (contextweaver.store.async_protocols) mirror the sync surface so network-backed backends can avoid blocking the async-first context pipeline. to_async(store) wraps any sync backend as the matching async protocol via asyncio.to_thread (each bridge serializes concurrent awaits on itself with a per-bridge lock, since the in-memory backends are not thread-safe); to_sync(async_store, loop) does the inverse. ContextManager now accepts async store backends (via StoreBundle) and keeps the event loop responsive during await build(...) and await build_call_prompt(...) by offloading the synchronous pipeline body to a worker thread while the async store I/O runs on a private loop thread; the loop thread is released automatically when the manager is garbage-collected (via weakref.finalize, so no new public close() method is added to ContextManager). Concurrent build calls on one manager serialize on an internal lock so the offloaded pipeline runs never race on the thread-unsafe in-memory stores. Async conformance checks (check_async_*_conformance) ship in contextweaver.store.testing. (Thread-affine backends such as SqliteEventLog are not valid to_async targets; their async story is a future native aiosqlite backend.)
-
Store-protocol conformance kit (#520). New framework-agnostic contextweaver.store.testing module (check_event_log_conformance, check_artifact_store_conformance, check_episodic_store_conformance, check_fact_store_conformance): each check takes a factory for an empty backend and asserts the round-trip, ordering, and not-found contract the Context Engine relies on, including that ArtifactStore.put() stamps a sha256 content_hash on the returned ref, now documented as a protocol contract because the firewall's idempotency short-circuit (#190) depends on it. It imports no test framework, so it ships in the core wheel and runs under pytest, unittest, or a plain script. The bundled in-memory, JSON-file, and SQLite backends are all run through it.
-
JsonFileArtifactStore durability hardening (#497). Writes are now atomic (temp file + os.replace), so a crash mid-write never leaves a truncated artifact; list_refs() reads an in-memory handle→ref index built once on construction instead of rescanning the directory on every call (only self-consistent metadata+data pairs are indexed, so the index never lists a handle get() cannot serve); and optional max_bytes/max_artifacts constructor limits bound disk growth, raising the new ArtifactStoreQuotaError when a write would breach them. put/delete/list_refs are serialised by an internal lock, making a single instance safe to share across threads in one process.
-
ArtifactStoreQuotaErrorexception (subclass ofContextWeaverError), exported from the package root. -
Documented store thread-safety contract (#458) in
docs/agent-context/architecture.md, with concurrency tests covering atomic overwrites, concurrent distinct-handle reads/writes, and concurrent gatewaytool_viewdrilldown.
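
The atomic-write behaviour noted for `JsonFileArtifactStore` (temp file + `os.replace`) follows a common stdlib pattern. A minimal sketch of that pattern, not the store's actual code; `atomic_write_bytes` is a hypothetical helper name:

```python
import os
import tempfile


def atomic_write_bytes(path: str, data: bytes) -> None:
    """Write data to path so a reader never observes a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives next to the target so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the new file in, overwriting any existing target.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A crash before `os.replace` leaves the previous file untouched; only a stray `.tmp` file may remain.
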
- Artifact stores now persist a `content_hash` (#466). Both `InMemoryArtifactStore.put` and `JsonFileArtifactStore.put` compute and store the sha256 of the content on the returned `ArtifactRef`. This makes the firewall's re-processing idempotency short-circuit (#190) survive a process restart when the ref is reloaded from disk. The firewall no longer recomputes the hash separately.
- `JsonFileArtifactStore` percent-encodes handles into filenames (#466). Handles containing characters that are legal in a handle but hostile in a filename, notably `:` (the firewall's `artifact:result:…` shape, which opens an NTFS alternate data stream on Windows), are now stored portably. On-disk filenames change accordingly (`handle.data` → `enc(handle).data`).
- `InMemoryArtifactStore.to_dict`/`from_dict` round-trip is now lossless (#466). Raw bytes are serialised (base64) alongside the metadata index, so a restored store resolves `get()`/`drilldown()` instead of returning refs whose handles dereference to nothing; this is what lets a `StoreBundle` carry firewalled artifacts across a restart.
- The gateway no longer assumes a concrete artifact-store backend (#472). `drilldown` is part of the `ArtifactStore` protocol, so `ProxyRuntime.view` (`tool_view`) dropped its `cast`/`type: ignore` to `InMemoryArtifactStore` and works against any conformant store (e.g. `JsonFileArtifactStore`).
- Opt-in deterministic secret-redaction pass (#428). A new pure `contextweaver.secrets` module (`scrub_secrets()`, `contains_secret()`, `SecretPattern`) detects well-known secret shapes (cloud access keys, provider tokens, private-key blocks, JWTs, credential-bearing URLs, `key=value` credential assignments). `ContextManager(redact_secrets=True)` scrubs firewall summaries and extracted facts before they reach the prompt; `ProxyRuntime(..., redact_secrets=True)` additionally scrubs `ChoiceCard` text. A `SecretRedactor` `RedactionHook` (registered as `"secret"`) is available for `ContextPolicy.redaction_hooks`. Off by default; only ever tightens a surface.
- Opt-in ingestion-time sensitivity classification (#542). New `SensitivityClassifier` protocol + built-in `HeuristicSensitivityClassifier` (and `detect_sensitivity()`) raise an item's sensitivity label before enforcement so content callers forgot to label (e.g. tool results carrying credentials/PII) no longer defaults silently to `public`. Wired via `ContextManager(sensitivity_classifier=...)`; runs at the start of the sensitivity stage and over fact/episode header content. A classifier may only raise a label, never lower it. Every raise records `metadata["sensitivity_raised_by"]` (the classifier's type name) so the decision is auditable.
- Gateway untrusted-input hardening (#464, #484, #485, #488). The proxy/gateway ingest, validation, and dispatch boundary now defends against malformed or hostile upstream input:
  - Defensive tool-def registration (#464). A malformed upstream tool definition (non-dict, or missing a non-empty string `name`) no longer aborts catalog refresh; the offending tool is skipped and recorded on the new `ProxyRuntime.last_refresh_report` (`CatalogRefreshReport`). A new `on_invalid="raise"` mode fails loudly for development catalogs.
  - Untrusted-schema validation + validator caching (#484). Upstream `inputSchema`/`outputSchema` are meta-validated (`check_schema`) and bounded for serialized size, nesting depth, and property count (configurable via `SchemaLimits`) at ingest, surfacing `SchemaFinding`s on the refresh report. Compiled `jsonschema` validators are cached per `tool_id`, removing per-call recompilation from the hot `tool_execute` path. A malformed schema surfaces as the new `SCHEMA_INVALID` error code.
  - Structured upstream-error taxonomy (#485). `GatewayError` gains a `retryable` hint and the codes `UPSTREAM_TIMEOUT`, `UPSTREAM_UNAVAILABLE`, `AUTH_FAILED`, `PERMISSION_DENIED`, and `RATE_LIMITED` (`classify_upstream_exception`), with `UPSTREAM_ERROR` kept as the fallback. Model-visible upstream detail is now control-character-stripped and length-capped (`redact_upstream_detail`); operators keep full detail via logging.
  - Opt-in tolerant argument normalization (#488). `ProxyRuntime(tolerant_args=True)` runs a deterministic, rule-based repair pass (`normalize_args`) before strict validation: stringified JSON objects and string→`int`/`number`/`boolean`/`null` coercions, only when the schema type demands it. Off by default (byte-identical behaviour); every repair is recorded under the result envelope's `provenance["arg_repairs"]`.
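
To illustrate the tolerant-normalization rule above (coerce a string only when the schema type demands it, and record every repair), here is a rough sketch. It is not the library's `normalize_args`; the function name, return shape, and repair-log format are assumptions made for illustration:

```python
import json
from typing import Any


def normalize_args_sketch(
    args: dict[str, Any], schema: dict[str, Any]
) -> tuple[dict[str, Any], list[str]]:
    """Return (repaired_args, repairs); only string values are ever coerced."""
    repaired = dict(args)
    repairs: list[str] = []
    for name, spec in schema.get("properties", {}).items():
        value = repaired.get(name)
        if not isinstance(value, str):
            continue  # non-strings are left for strict validation to judge
        wanted = spec.get("type")
        try:
            if wanted == "integer":
                new: Any = int(value)
            elif wanted == "number":
                new = float(value)
            elif wanted == "boolean" and value in ("true", "false"):
                new = value == "true"
            elif wanted == "null" and value == "null":
                new = None
            elif wanted == "object":
                new = json.loads(value)
                if not isinstance(new, dict):
                    continue
            else:
                continue  # schema wants a string (or something we do not repair)
        except ValueError:
            continue  # unrepairable: keep the original for strict validation
        repaired[name] = new
        repairs.append(f"{name}: string -> {wanted}")
    return repaired, repairs
```

A value whose schema type is already `string` is never touched, which is what keeps the pass safe to run before strict validation.
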
- Header memory is now enforced (#450). Facts (`add_fact`) and episode summaries (`add_episode`) injected into the prompt header are routed through the sensitivity floor/redaction action and the per-phase `memory_fact` kind policy, closing a side-channel where header content bypassed stage-3 enforcement. `Fact` and `Episode` gained an optional `sensitivity` field (defaults `public`, round-trips in `to_dict`/`from_dict`); `add_fact`/`add_episode` accept a keyword-only `sensitivity`. A phase that excludes `memory_fact` no longer receives fact/episode text via the header.
- Redaction is effective end-to-end (#451). A redacted item now drops its `artifact_ref` and is stamped `metadata["redacted"]=True`, so the rendered prompt no longer advertises an artifact handle that `drilldown` could dereference back to the original, pre-redaction bytes. `drilldown` is now also policy-aware: a drilldown whose source item meets the sensitivity floor (or was redacted) raises `PolicyViolationError` unless the new `ContextPolicy.allow_redacted_drilldown=True` opt-out (default `False`, closed) is set, and an injected drilldown slice inherits its source item's sensitivity instead of defaulting to `public`, so filtered content cannot be laundered back in via the drilldown path.
- `deterministic=True` now also gates LLM-backed extractors (#461). The firewall's fail-closed determinism guarantee previously covered only the summarizer; an LLM-backed `Extractor` (e.g. `LlmExtractor`) would still run. Both the large-output firewall path and the small-output ingest path now raise `DeterminismError` rather than passing data through a model.
- `contextweaver catalog lint` (#538). A new `catalog` CLI sub-app exposes `catalog lint FILE`, which runs the existing `CatalogNormalizer` plus cross-item reference validation over a catalog and reports findings (missing descriptions, duplicate/blank IDs, tag/whitespace hygiene, dangling `depends_on`/`requires`). Accepts the native JSON/YAML catalog, a raw MCP `tools/list` array, and the `{"tools": [...]}` snapshot shape. Supports `--json` and exits `0` (clean) / `1` (findings) / `3` (load error) for CI gating; never mutates the input file.
- Typed cross-item reference validation on catalog load (#519). New `routing.validate_references()` + `Catalog.validate_references()` return a `CatalogValidationReport` of dangling `depends_on` (item IDs) and unsatisfied `requires` (capabilities) references. The `load_catalog*` loaders gained an additive `on_invalid` kwarg (`"warn"` default → log per finding, `"raise"` → `CatalogValidationError` carrying the report, `"ignore"`). Per-item deserialization failures now name the offending item by `id` (or index).
- Structured DEBUG/INFO routing diagnostics (#524). `logging.DEBUG` on `contextweaver.routing` now traces the previously silent decision points: the tree-building strategy per subtree (INFO when a clustering/alphabetical fallback is taken), per-step beam pruning counts and pruned IDs, and the original-vs-augmented scoring query. Log messages are diagnostics, not API.
- Script-aware offline token heuristic: `HeuristicEstimator` (#525). The default estimator (and the `tiktoken` offline fallback) now counts dense scripts (CJK, Kana, Hangul, emoji) at ≈1 token/character instead of `len // 4`, fixing a ~4× budget under-count on non-Latin content. Latin/ASCII estimates are unchanged. Dependency-free (stdlib range checks); exposed via `tokens.heuristic_counter()` and `contextweaver.HeuristicEstimator`.
- Provider-calibrated token estimation (#493). Register accurate counters by name (`tokens.register_estimator(name, counter)`) and select them via `tokens.get_token_counter(provider)`; `tiktoken` stays the default. The estimator path that produced a build's numbers is recorded on the new additive `BuildStats.token_estimator` field (e.g. `"tiktoken/cl100k_base"`, `"heuristic/v2"`, or a registered provider name). New `benchmarks/token_calibration.py` (+ `make token-calibration`) renders the divergence table at `docs/token_calibration.md` across ≥4 corpus shapes; provider `count_tokens` legs are opt-in via `CW_TOKEN_CALIBRATION_PROVIDERS` and never run in CI.
- Non-ASCII regression suite (#525). `tests/test_unicode_regression.py` pins CJK/emoji/RTL behaviour across tokenization, budgeting, dedup, card rendering, serialization, and an in-process build.
- Dockerfile for the MCP gateway. A top-level `Dockerfile` (+ `.dockerignore`) boots `contextweaver mcp serve --gateway` over stdio against the packaged reference catalog, so an MCP client or automated scanner (e.g. Glama) can build, start, and introspect the gateway with no extra configuration. The image build validates the catalog with `--dry-run`.
- `unregister_redaction_hook(name)` (#463). Companion to `register_redaction_hook` for test hygiene and long-lived processes that need to replace a hook; raises `ItemNotFoundError` for an unknown name.
- `ValidationError` exception (#463). New `contextweaver.exceptions.ValidationError`, raised by the pure-data layer (`ChoiceCard` construction, `RoutingDecision.from_dict`). It derives from both `ContextWeaverError` and the builtin `ValueError`, so the custom hierarchy is catchable while existing `except ValueError` call sites keep working.
- `compact_tool_result(..., overwrite_sidecar=True)` (#467). Opt-in escape hatch to replace an existing reserved `_cw` sidecar when round-tripping prior contextweaver output back through the facade (default refuses; see below).
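
The script-aware heuristic from the `HeuristicEstimator` entry above can be approximated with stdlib range checks. This is a sketch only: the code-point ranges and the function names are assumptions, not the estimator's actual tables:

```python
# Illustrative (assumed) code-point ranges for "dense" scripts.
DENSE_RANGES = (
    (0x3040, 0x30FF),    # Hiragana and Katakana
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs
    (0xAC00, 0xD7A3),    # Hangul syllables
    (0x1F300, 0x1FAFF),  # common emoji blocks
)


def is_dense(ch: str) -> bool:
    """True when the character falls in one of the dense-script ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in DENSE_RANGES)


def estimate_tokens(text: str) -> int:
    """Dense characters cost about one token each; the rest use len // 4."""
    dense = sum(1 for ch in text if is_dense(ch))
    sparse = len(text) - dense
    return dense + sparse // 4


print(estimate_tokens("hello world!"))      # 12 ASCII chars -> 3
print(estimate_tokens("日本語のテキスト"))  # 8 dense chars -> 8 (len // 4 would give 2)
```

Pure ASCII input produces the same number as `len // 4`, which matches the entry's note that Latin estimates are unchanged.
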
- Custom view generators now fire on every ingestion/build path (#460). A generator registered on `ContextManager.view_registry` previously only ran on `ingest_tool_result`; it now also runs on the build-time firewall batch and `ingest_mcp_result`. Users with custom generators will start seeing them fire on the previously-unwired paths (the intended behavior); default-registry output is unchanged.
- Collision-proof fact IDs (#462). `ContextManager.add_fact` now mints IDs from a monotonic per-manager counter (`fact:{key}:{seq}`) instead of the store's current size. A delete followed by a new `add_fact` can no longer re-mint an existing fact's ID and silently overwrite it; IDs stay deterministic for a fixed call sequence, and the call no longer scans the full store. A pre-populated store that collides with the counter now raises `DuplicateItemError` loudly rather than overwriting.
- Construction-time validation in core data types (#463). `ContextPolicy.sensitivity_action` is now typed `Literal["drop", "redact"]` and validated in `__post_init__` (raises `ConfigError` immediately instead of at the first build). `ChoiceCard` bounds violations now raise `ValidationError` (still a `ValueError` subclass). `register_redaction_hook` raises `ConfigError` (was `PolicyViolationError`) on a duplicate name: a configuration mistake, not a policy violation.
- Actionable graph-validation diagnostics (#523). `GraphBuildError` now carries structured `cycle`/`edge`/`missing_root` attributes and names the specifics in its message: cycle failures report the full path (`a -> b -> c -> a`, deterministically), dangling edges name both ends, and a missing root lists known-node hints. The structured attributes are the stable contract; message text is not.
- `Mode.adaptive` no longer fails silently (#521). Constructing a `ProfileConfig(mode=Mode.adaptive)` now emits a `UserWarning` stating the mode is inert (no pipeline stage honours it; output equals `Mode.strict`). `strict`/`seeded` are unaffected and persisted `"adaptive"` profiles still round-trip (re-warning on load).
- `contextweaver mcp serve` advertises the installed package version. `--version` now defaults to the contextweaver package version (was `None`) when neither the flag nor the config file sets it, and the resolved version is shown in the serve lifecycle line.
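
The fact-ID collision described above is easy to reproduce in miniature. Both classes below are hypothetical stand-ins rather than `ContextManager` code, and the size-based ID format is assumed to mirror `fact:{key}:{seq}`:

```python
import itertools


class SizeBasedIds:
    """Old-style scheme (assumed): the next ID comes from the store's size."""

    def __init__(self) -> None:
        self.store: dict[str, str] = {}

    def add(self, key: str, value: str) -> str:
        fact_id = f"fact:{key}:{len(self.store)}"
        self.store[fact_id] = value  # silently overwrites on collision
        return fact_id


class CounterBasedIds:
    """New-style scheme: a monotonic counter that never reuses a number."""

    def __init__(self) -> None:
        self.store: dict[str, str] = {}
        self._seq = itertools.count()

    def add(self, key: str, value: str) -> str:
        fact_id = f"fact:{key}:{next(self._seq)}"
        if fact_id in self.store:
            # The entry above says the library raises DuplicateItemError here;
            # KeyError is used only to keep this sketch dependency-free.
            raise KeyError(fact_id)
        self.store[fact_id] = value
        return fact_id


old = SizeBasedIds()
old.add("k", "a")
old.add("k", "b")
del old.store["fact:k:0"]
print(old.add("k", "c"))  # fact:k:1 again, so "b" is overwritten

new = CounterBasedIds()
new.add("k", "a")
new.add("k", "b")
del new.store["fact:k:0"]
print(new.add("k", "c"))  # fact:k:2, so "b" survives
```
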
- `compact_tool_result` honours the reserved `_cw` namespace (#467). A payload that already carries the reserved `_cw` sidecar key now raises `ConfigError` instead of being silently clobbered (matching the `metadata['_contextweaver']` reserved-namespace rule). Pass `overwrite_sidecar=True` to opt into replacing it.
- `RoutingDecision.from_dict` no longer fabricates timestamps (#463). A missing or unparseable `timestamp` now raises `ValidationError` instead of substituting `datetime.now()`, keeping the pure-data layer deterministic and its round-trips lossless.
- Removed load-bearing `assert`s from library code (#467). Correctness checks in `firewall_api.py`, `build.py`, and `_manager_build.py` are now explicit raises (`ContextWeaverError`-family) so they are not silently stripped under `python -O`. Type-narrowing asserts are retained where annotated. Two new guard tests (`tests/test_source_invariants.py`) enforce this and the custom-exception rule going forward.
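
For context on the `assert` removal above: Python drops `assert` statements when run with `-O`, so a check written that way can vanish in production. A small sketch of the before/after shape; `BudgetError` and both function names are hypothetical, not library names:

```python
class BudgetError(ValueError):
    """Hypothetical stand-in for a ContextWeaverError-family exception."""


def check_budget_with_assert(budget: int) -> int:
    # Under `python -O` this line is removed, so a bad budget passes through.
    assert budget > 0, "budget must be positive"
    return budget


def check_budget_with_raise(budget: int) -> int:
    # An explicit raise survives every optimization level.
    if budget <= 0:
        raise BudgetError(f"budget must be positive, got {budget}")
    return budget
```
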
- MCP Registry listing + PyPI ownership marker (#348). Adds a registry-publishable `server.json` describing the gateway as a `uvx contextweaver mcp serve --config <gateway.yaml>` stdio server (linking to the gateway quickstart, not the raw API docs), an `mcp-name: io.github.dgenio/contextweaver` marker in the README for PyPI ownership verification, and a release-triggered GitHub Actions job that publishes to the official MCP Registry via GitHub OIDC (no interactive login required).
- Trustworthy diagnostics across context builds and the MCP gateway (#370, #378, #398, #414, #459). `BuildStats.dropped_items` attributes every excluded item to `sensitivity`, `dedup`, `kind_limit`, or `budget`; the production context pipeline now fires exclusion and budget lifecycle hooks. New versioned `DiagnosticEvent`/`DiagnosticSink` APIs include thread-safe in-memory and append-only JSONL sinks. `ProxyRuntime` emits sanitized catalog, browse, hydrate, execute, and artifact-view events with counts, token/schema savings, failures, and latency. Operators can use `contextweaver mcp inspect`, `contextweaver mcp stats`, and `contextweaver inspect` for JSON or Markdown reports without exposing raw queries, argument values, result text, prompt text, or artifact bytes.
- Single-call firewall facade: `compact_tool_result()`/`firewalled_tool_result()` (#399). Shrink one large tool result before it enters the prompt without standing up a `ContextManager`. Returns a `CompactResult` (`firewalled`, `payload`, `summary`, `facts`, `artifact_ref`, `stats`). Exported from the top level.
- Structured (lossless) firewall mode (#406). New `StructuredFirewall(keep=[...])` plus `summarize.structured.project`/`parse_path`: keep an allow-list of JSON paths inline, offload the rest to the artifact store (retrievable via `drilldown`), no LLM. Selectable through `compact_tool_result(strategy=...)` and `ContextManager.ingest_tool_result(..., firewall=StructuredFirewall(...))`. An explicit `strategy="structured"` now raises `ConfigError` on non-JSON input instead of silently downgrading to a text summary; `ingest_tool_result` applies `firewall=` only above `firewall_threshold`.
- First-class firewall diagnostics: `FirewallStats` (#402). Records `triggered`, `strategy`, original/summary chars+tokens (`chars_saved`/`tokens_saved`), `artifact_ref`, and `summarized_by_llm`. Surfaced on `ResultEnvelope.firewall_stats`, and aggregated on `BuildStats.firewall_events`/`BuildStats.firewall_summary()`.
- Determinism guarantee: `deterministic=True` (#404). `ContextManager(deterministic=True)` and `compact_tool_result(deterministic=...)` fail closed with the new `DeterminismError` rather than passing data through an LLM-backed summariser; `FirewallStats.strategy`/`summarized_by_llm` make the path auditable.
- Built-in token counter: `contextweaver.tokens` (#405). Public `count()`/`get_token_counter()`/`heuristic_counter()` (and `TokenCounter` alias) so callers never wire `tiktoken` directly; firewall/`FirewallStats` numbers use the same counter. New no-op `contextweaver[tokenizers]` extra documents the contract (`tiktoken` is already core, with offline fallback).
- Daily Driver guide for MCP gateway operators (#394). New `docs/daily_driver.md` explains when to use or bypass contextweaver, copy-paste operating instructions for common MCP clients, and a practical debug loop using route explanations, `BuildStats`, artifact views, and OTel.
- MCP gateway security and data-flow model (#396). New `docs/security_model.md` distinguishes prompt exposure from raw artifact storage, documents trust and egress boundaries, and records the current `tool_view` / artifact-lifecycle limits tracked by #375.
- Verified Claude Code MCP recipe (#429). Adds project/local registration commands, a committed `.mcp.json` example, operating instructions, and troubleshooting verified against Claude Code 2.1.165.
- Zero-install CLI smoke coverage (#437). Linux and macOS CI now build the wheel and run its `contextweaver` entry point through isolated `uvx` and `pipx` environments.
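
The structured firewall idea above (keep an allow-list of JSON paths inline, retrieve everything else later) can be sketched without the library. The dotted path syntax and the `project_sketch` name are assumptions; the real `parse_path` grammar is not described in this changelog:

```python
import copy
from typing import Any


def project_sketch(payload: dict[str, Any], keep: list[str]) -> dict[str, Any]:
    """Return only the allow-listed dotted paths from payload."""
    inline: dict[str, Any] = {}
    for path in keep:
        parts = path.split(".")
        node: Any = payload
        try:
            for part in parts:
                node = node[part]
        except (KeyError, TypeError):
            continue  # a path that does not exist is skipped
        # Rebuild the nesting in the projection and copy the kept value.
        target = inline
        for part in parts[:-1]:
            target = target.setdefault(part, {})
        target[parts[-1]] = copy.deepcopy(node)
    return inline


payload = {
    "status": "ok",
    "user": {"name": "ada", "bio": "x" * 1000},
    "rows": [1, 2, 3],
}
inline = project_sketch(payload, ["status", "user.name", "missing.path"])
print(inline)  # {'status': 'ok', 'user': {'name': 'ada'}}
```

In this sketch the untouched `payload` plays the role of the offloaded artifact, so nothing is lost even though the inline view is small.
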
- Token estimates flow through one source of truth (#530). The sensitivity-redaction placeholder, the firewall summary item, card budgeting (`routing/cards.count_tokens`, `routing/packer`), and memory-source costing no longer carry inline `len // 4` literals; they route through the configured estimator / `contextweaver.tokens`. The sensitivity stage receives the manager's estimator, so a custom counter is honoured on redaction paths. ASCII placeholder estimates are unchanged; offline non-Latin estimates become more accurate (and generally higher), which can shift selection outcomes in offline mode by design. The default `ContextManager` estimator is now `HeuristicEstimator` (was `CharDivFourEstimator`); `heuristic_counter()` returns it.
- `BuildStats` accounting now has one pipeline owner (#459). `total_candidates` is measured after dependency closure and before sensitivity filtering; `dropped_count` includes every later exclusion, so completed builds satisfy `included_count + dropped_count == total_candidates`. The report schema is version 2.
- CI now exercises every committed generated-artifact drift check (#389–#393). `llms.txt`/`llms-full.txt`, recorded demo casts, and the gateway scorecard are gating checks on the Python 3.12 matrix cell; the deterministic smoke evaluation also runs there as a non-gating signal.
- MCP client recipes now use the installed CLI (#371, #437). Claude Desktop, Claude Code, GitHub Copilot, and Cursor configs launch `uvx contextweaver mcp serve`; docs no longer describe the dedicated CLI as future work. `examples/recipes/serve_gateway.py` remains a labelled legacy/custom-runtime example, while config tests reject references to that launcher across relative, absolute, POSIX, and Windows path forms. Relative catalog paths now resolve from the config file, and text results expose their stored artifact handle so clients can call `tool_view`.
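
The `BuildStats` invariant above (`included_count + dropped_count == total_candidates`) can be checked on a toy pipeline. `StatsSketch` and `run_sketch` are hypothetical; only the counter names and drop reasons come from the changelog entries:

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class StatsSketch:
    """Hypothetical stand-in for the accounting fields named above."""

    total_candidates: int
    included_count: int = 0
    # (item_id, reason) pairs, one per excluded item.
    dropped_items: list = field(default_factory=list)

    @property
    def dropped_count(self) -> int:
        return len(self.dropped_items)

    def is_balanced(self) -> bool:
        return self.included_count + self.dropped_count == self.total_candidates


def run_sketch(candidates, sensitive, budget) -> StatsSketch:
    """Every candidate ends up either included or dropped with one reason."""
    stats = StatsSketch(total_candidates=len(candidates))
    seen = set()
    for item_id, text in candidates:
        if item_id in sensitive:
            stats.dropped_items.append((item_id, "sensitivity"))
        elif text in seen:
            stats.dropped_items.append((item_id, "dedup"))
        elif stats.included_count >= budget:
            stats.dropped_items.append((item_id, "budget"))
        else:
            seen.add(text)
            stats.included_count += 1
    return stats


items = [("a", "x"), ("b", "x"), ("c", "secret"), ("d", "y"), ("e", "z")]
stats = run_sketch(items, sensitive={"c"}, budget=2)
print(stats.included_count, stats.dropped_count, stats.is_balanced())  # 2 3 True
print(Counter(reason for _, reason in stats.dropped_items))
```
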
- Canonical Frame-shaped ingestion seam: `ContextManager.ingest_envelope()` (#352). The execution boundary (e.g. agent-kernel) firewalls and hands contextweaver an already-firewalled `ResultEnvelope` (the native preimage of a weaver-spec `Frame`); contextweaver appends a summary-only `ContextItem` carrying the artifact handle and does not re-derive firewalling from raw output. The raw-output APIs (`ingest_tool_result`, `ingest_mcp_result`) remain for standalone use but are now labelled non-canonical for spec compliance. New firewall boundary doc explains the contextweaver-firewall vs agent-kernel-firewall split and the seam; weaver-spec I-05 status updated accordingly.
- Zero-Python config-file launch for the MCP gateway (#346). `contextweaver mcp serve --config gateway.yaml` reads the catalog and serve options (`mode`, `top_k`, `beam_width`, `cache_stable`, `name`, `version`) from a single JSON/YAML file; explicit CLI flags still win. The catalog loader now also accepts the real-MCP-server snapshot shape (`{"tools": [...]}`) used by the recipes. New Cursor recipe (`docs/recipes/cursor.md`) plus `examples/recipes/gateway_config.yaml` and `examples/recipes/cursor_mcp.json`. (Bridging a live upstream MCP server over stdio remains follow-up on #346.)
- `rank_collected` is now part of the public routing API (#288). The score-sort / active-filter helper is re-exported from `contextweaver.routing` so custom `Navigator` implementations can reuse it.
- End-to-end quality + cost benchmark vs a competent baseline (#345). New `benchmarks/e2e_quality.py` runs realistic tool-using tasks three ways (naive concat, a hand-built competent baseline, and contextweaver), scoring tool-selection accuracy, hallucinated-tool rate, end-task answer accuracy, prompt tokens, and estimated cost per strategy. Ships with a deterministic stub model (default, exercised in CI) and an opt-in real-model path (`CW_E2E_LLM=1` + a user-supplied `call_fn`, no LLM SDK dependency). New `make e2e-quality` target (non-gating) and `benchmarks/e2e/tasks.json` fixtures. The published real-model headline is produced from a credentialed maintainer run.
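
The "explicit CLI flags still win" rule from the config-file launch entry amounts to a simple precedence merge. A minimal sketch, assuming an unset flag is represented as `None`; `resolve_options` and the sample values are illustrative, not the CLI's internals:

```python
from typing import Any, Optional


def resolve_options(
    file_opts: dict[str, Any], cli_flags: dict[str, Optional[Any]]
) -> dict[str, Any]:
    """Start from the config file, then let explicitly passed flags override."""
    resolved = dict(file_opts)
    for key, value in cli_flags.items():
        if value is not None:  # None means the flag was not passed
            resolved[key] = value
    return resolved


file_opts = {"top_k": 5, "name": "gateway"}
cli_flags = {"top_k": 10, "name": None}
print(resolve_options(file_opts, cli_flags))  # {'top_k': 10, 'name': 'gateway'}
```
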
- Decomposed `ContextManager` to meet the ≤300-line module guideline (#101). The pipeline logic already lived in `context/build.py` (`run_build_pipeline`), `context/route_build.py`, `context/call_prompt.py`, and `context/ingest.py`; what remained was the manager's own method surface (`manager.py` was 878 lines of thin delegating stubs + docstrings). Those stubs now live in flat, single-level partial-class mixins (`_IngestMixin` in `context/_manager_ingest.py`, `_BuildMixin` in `context/_manager_build.py`, `_RoutingMixin` in `context/_manager_routing.py`) sharing a `_ManagerState` base (`context/_manager_base.py`) that declares the private-attribute contract. `manager.py` is now 239 lines (only `__init__`, properties, `drilldown`, and mixin composition); every module is ≤300. The delegate pipeline functions are now typed against `_ManagerState` (interface segregation; `ContextManager` inherits it via the mixins, so every call site is unchanged). No public API change: all 21 methods stay on `ContextManager` and the full test suite passes unmodified.
- Unified routing metrics into `contextweaver.eval.metrics` (#354). `benchmarks/benchmark.py` and `contextweaver.eval.routing` previously defined `recall@k`/`reciprocal_rank` under the same names with different semantics (fractional recall vs boolean hit-rate). They now share one canonical source of truth: `recall_at_k` (classic fractional recall), `precision_at_k`, and `reciprocal_rank`, re-exported from `contextweaver.eval`. The benchmark scorecard numbers are unchanged; `evaluate_routing` now reports fractional recall for multi-expected cases (identical for the common single-expected case).
- Split `extras/memory/zep.py` into `zep.py` + `_zep_common.py` so each module stays within the repo's ≤300-lines-per-module rule (PR #360 review). The public import path (`contextweaver.extras.memory.zep`) and its exports (`ZepBackendError`, `ZepEpisodicStore`, `ZepFactStore`) are unchanged.
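
The difference between fractional recall and a boolean hit-rate, mentioned in the metrics entry above, shows up only when more than one item is expected. The definitions below are the textbook ones; the signatures are assumptions and may not match `contextweaver.eval.metrics`:

```python
def recall_at_k(ranked: list[str], expected: set[str], k: int) -> float:
    """Fraction of expected items that appear in the top k."""
    if not expected:
        return 0.0
    return len(set(ranked[:k]) & expected) / len(expected)


def precision_at_k(ranked: list[str], expected: set[str], k: int) -> float:
    """Fraction of the top k that is expected."""
    if k <= 0:
        return 0.0
    return len(set(ranked[:k]) & expected) / k


def reciprocal_rank(ranked: list[str], expected: set[str]) -> float:
    """1 / position of the first expected item, or 0.0 if none is found."""
    for position, item in enumerate(ranked, start=1):
        if item in expected:
            return 1.0 / position
    return 0.0


def hit_at_k(ranked: list[str], expected: set[str], k: int) -> bool:
    """Boolean hit-rate: did any expected item make the top k?"""
    return bool(set(ranked[:k]) & expected)


ranked = ["a", "b", "c", "d"]
print(recall_at_k(ranked, {"b", "d"}, 2))  # 0.5: one of two expected items found
print(hit_at_k(ranked, {"b", "d"}, 2))     # True: the boolean view hides the miss
print(recall_at_k(ranked, {"b"}, 2))       # 1.0: same as the hit for one expected item
```
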
- Routing history tool-id resolution narrows its exception handling. `route_build.resolve_tool_id_from_result` previously wrapped the parent event-log lookup in a bare `except Exception`, silently swallowing any error before falling back to `parent_id`. It now catches only `ItemNotFoundError` (the documented `EventLog.get` contract), so unexpected store errors surface instead of being hidden (PR #363 review).
- Provider message encoders no longer emit empty-content messages. `to_anthropic_messages` and `to_gemini_contents` now raise a clear `CatalogError` (with the offending `msg_index`) when a turn would serialise to empty or blank-text content, instead of letting the provider reject it later with an opaque `400 ... messages: ... must have non-empty content`. Messages that carry tool-use / tool-result / function-call blocks remain valid. OpenAI is intentionally left untouched: its Chat Completions API tolerates empty content and the empty-string assistant-content round-trip is an existing invariant (PR #230).
- Zep backend defensively coerces scanned `tags`/`metadata` when rebuilding `Episode`/`Fact` from persisted episodes: a non-list `tags` (e.g. a bare string, which previously iterated into characters) yields `[]`, and a non-dict `metadata` (which previously raised in `dict(...)`) yields `{}` (PR #360 review).
- `LlmSummarizer`/`LlmExtractor` fallback warnings now include the underlying exception text, so a degraded LLM path is diagnosable (timeout vs auth vs parsing) instead of opaque (PR #360 review).
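
The Zep coercion fix above guards against two ordinary Python behaviours: `list("abc")` splits a string into characters, and `dict(...)` rejects most non-mapping input. A sketch of the defensive shape, with hypothetical helper names:

```python
from typing import Any


def coerce_tags(raw: Any) -> list:
    """Accept only a real list; anything else becomes an empty list."""
    return list(raw) if isinstance(raw, list) else []


def coerce_metadata(raw: Any) -> dict:
    """Accept only a real dict; anything else becomes an empty dict."""
    return dict(raw) if isinstance(raw, dict) else {}


print(list("urgent"))          # ['u', 'r', 'g', 'e', 'n', 't'], the old failure mode
print(coerce_tags("urgent"))   # []
print(coerce_tags(["a", "b"])) # ['a', 'b']
print(coerce_metadata(None))   # {}
```
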