Add vMCP assembly API in pkg/vmcp/app#5672
Conversation
Closes #5581 Creates pkg/vmcp/app with two public functions that encapsulate vMCP assembly behind a config-driven API: - app.BuildCore derives core.Config from vmcpconfig.Config + Options and calls core.New, returning the assembled VMCP for decoration. Handles all three discovery modes: static backends, Kubernetes discovered (reusing backendregistry.NewKubernetesBackendRegistry from #5541), and dynamic (groups manager). - app.BuildServerConfig derives *server.ServerConfig from the same config independently. Requires WithBackendRegistry for the Kubernetes discovered mode to prevent duplicate informer caches. - Functional Options carry the non-serialized runtime injectables: version, host/port, session TTL, telemetry provider, backend registry + watcher, REST config, embedded auth server RunConfig, status reporter, and elicitation requester. - Example embedder in app/internal/exampleembedder demonstrates the full BuildCore + decorate + BuildServerConfig + server.Serve flow. - Exports LateBoundElicitationRequester from pkg/vmcp/server so callers outside the package can wire composite-tool elicitation. Migrates pkg/vmcp/cli/serve.go to call app.BuildCore + app.BuildServerConfig, replacing the ~650-line hand-wired assembly with ~130 lines. Moves readHeaderForwardFromEnv from cli to app (the only consumer is now BuildCore). Removes cli functions that moved: discoverBackends, runDiscovery, vmcpNamespace, getStatusReportingInterval, readHeaderForwardFromEnv. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Remove dead outgoing auth registry and backend client construction from buildStaticBackendRegistry and buildDynamicBackendRegistry; those functions only need a discoverer, not a backend client (the BackendClient for aggregation is constructed once at the top of BuildCore) - Remove the unused _ vmcp.BackendClient parameter from runDiscovery - Correct the package doc telemetry comment: neither BuildCore nor BuildServerConfig builds a provider; callers who want telemetry must inject one via WithTelemetryProvider - Replace "Kubernetes dynamic mode" with "Kubernetes discovered mode" in options.go and the example embedder to match the rest of the codebase convention Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #5672 +/- ##
========================================
Coverage 70.42% 70.42%
========================================
Files 651 655 +4
Lines 66285 66464 +179
========================================
+ Hits 46678 46810 +132
- Misses 16232 16272 +40
- Partials 3375 3382 +7 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
tgrunnagle
left a comment
There was a problem hiding this comment.
Multi-Agent Consensus Review
Agents consulted: api-design, architecture, go-idioms, test-coverage, general-quality
Consensus Summary
| # | Finding | File | Consensus | Severity | Action |
|---|---|---|---|---|---|
| F1 | Discovered-mode watcher silently discarded in BuildCore | build_core.go:199 | 9/10 | MEDIUM | Fix |
| F2 | exampleembedder not anchored in CI build graph | exampleembedder/embedder.go:1 | 9/10 | MEDIUM | Fix |
| F3 | Static/dynamic registries built twice (double-discovery risk) | build_core.go:146 | 7/10 | MEDIUM | Discuss |
| F4 | TestBuildCore_InlineDiscovery bypasses the path it claims to test | build_core_test.go:43 | 8/10 | MEDIUM | Fix |
| F5 | Three deleted CLI tests have no port in new package | serve_test.go (deletions) | 7/10 | MEDIUM | Fix |
| F6 | Shallow copy for OutgoingAuth missing — mutation escapes to caller | build_server_config.go:58 | 8/10 | MEDIUM | Fix |
| F7 | "discovered" compared as magic string literal |
build_core.go:149 | 7/10 | LOW | Fix |
| F8 | WithBackendRegistry accepts nil registry silently | options.go:144 | 7/10 | LOW | Fix |
| F9 | Asymmetric docstring language for discovered mode | build_server_config.go:29 | 7/10 | LOW | Fix |
| F10 | applyOptions panics on nil Option in slice | options.go:184 | 7/10 | LOW | Fix |
| F11 | pkg/vmcp/app missing from vmcp-library.md stability table | docs/arch/vmcp-library.md | 7/10 | LOW | Fix |
| F12 | LateBoundElicitationRequester lacks stability annotation | elicitation_latebound.go:27 | 7/10 | LOW | Fix |
| F13 | Close() error discarded without explanation | build_core.go:143 | 7/10 | LOW | Fix |
| F14 | VMCP_NAMESPACE error message starts uppercase | build_core.go:181 | 8/10 | LOW | Fix |
| F15 | Rate-limit cleanup captures context.Background() | build_server_config.go:148 | 7/10 | LOW | Fix |
| F16 | Log message starts uppercase, inconsistent | build_core.go:177 | 7/10 | LOW | Fix |
| F17 | Cleanup-idempotency tests are vacuously true | build_server_config_test.go:153 | 9/10 | LOW | Fix |
| F18 | deriveHealthMonitorConfig error path untested | build_core_test.go | 7/10 | LOW | Fix |
| F19 | BuildServerConfig static-backend mode untested | build_server_config_test.go | 7/10 | LOW | Fix |
| F20 | exampleembedder cleanup guard pattern is brittle | exampleembedder/embedder.go:59 | 7/10 | LOW | Polish |
Overall
This PR introduces pkg/vmcp/app as the new public assembly package for vMCP, implementing the design specified in issue #5581. The approach — two independent derivation functions (BuildCore, BuildServerConfig) sharing stateful collaborators via functional options — is architecturally sound and correctly reflects the "independent projections from the same config" pattern the issue describes. The CLI migration (serve.go reduced from ~700 to ~360 lines) validates the API in the real production caller.
The main risk is a behavioral asymmetry in the Kubernetes "discovered" mode: BuildServerConfig enforces WithBackendRegistry at runtime (returning an error if absent), while BuildCore silently starts its own K8s informer with no readiness-gating handle when the option is omitted. An embedder who provides WithBackendRegistry to BuildServerConfig but forgets BuildCore gets two informer caches and no clear error. The docstring says "SHOULD provide" rather than enforcing it — this understates the risk for a public API intended for external embedders.
The test gaps (F4, F5: misleadingly-named test, three deleted tests with no port) and the missing importgraph anchor for the exampleembedder (F2) are addressable without design changes and worth fixing before downstream adoption.
Documentation
docs/arch/vmcp-library.md: The stability table has no entry for the new pkg/vmcp/app package (intended for downstream embedders). Add a row with an Experimental designation covering BuildCore, BuildServerConfig, and Option. Also note the LateBoundElicitationRequester addition to the pkg/vmcp/server entry.
F5 — Deleted tests with no port (no diff line available): TestRunDiscovery_KubernetesGroupNotFound, TestRunDiscovery_ZeroBackends, and TestVMCPNamespace were deleted from pkg/vmcp/cli/serve_test.go. The behavior each tested still exists verbatim in build_core.go (lines 240–253) and build_server_config.go. Port all three to pkg/vmcp/app/build_core_test.go / pkg/vmcp/app/build_server_config_test.go.
Generated with Claude Code
| } | ||
| slog.Info("Kubernetes backend registry started", "namespace", namespace, "group", cfg.Group) | ||
| return reg, nil | ||
| } |
There was a problem hiding this comment.
[F1 — MEDIUM] Discovered-mode watcher silently discarded — asymmetric enforcement (9/10, api-design + go-idioms)
When OutgoingAuth.Source == "discovered" and no WithBackendRegistry is provided, BuildCore discards the watcher returned by backendregistry.NewKubernetesBackendRegistry. The watcher goroutine starts, but the readiness signal (informer cache-sync) is lost and there is no error. BuildServerConfig enforces the opposite for the same mode, returning an error if no registry is injected — callers who provide WithBackendRegistry to BuildServerConfig but forget BuildCore silently end up with two K8s informer caches and no readiness gating.
Either (a) make BuildCore also return an error when OutgoingAuth.Source == "discovered" and no WithBackendRegistry is provided (matching BuildServerConfig's stricter stance), or (b) at minimum emit slog.Warn("Kubernetes watcher started without readiness gating; provide WithBackendRegistry") and upgrade the docstring from "SHOULD" to "MUST".
| @@ -0,0 +1,97 @@ | |||
| // SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc. | |||
There was a problem hiding this comment.
[F2 — MEDIUM] exampleembedder not anchored in the CI build graph (9/10, architecture + test-coverage)
The sibling package pkg/vmcp/backendregistry/internal/exampleembedder is anchored in CI via pkg/vmcp/backendregistry/importgraph_test.go (var _ = exampleembedder.BuildServer), ensuring it compiles on every task test run. This package has no such anchor — nothing imports it, so a breaking change to BuildCore, BuildServerConfig, or vmcpserver.Serve signatures goes undetected.
Add pkg/vmcp/app/importgraph_test.go containing var _ = exampleembedder.BuildServer (with the appropriate import), mirroring the existing pattern in pkg/vmcp/backendregistry/importgraph_test.go.
| slog.Info("using pre-built backend registry from option") | ||
| return o.backendRegistry, nil | ||
| } | ||
| if len(cfg.Backends) > 0 { |
There was a problem hiding this comment.
[F3 — MEDIUM] Static/dynamic registries built independently per Build call (7/10, architecture)
For static-backends and dynamic (local groups manager) modes, each Build* call independently runs its full discovery sequence — two calls to EnsureDefaultGroupExists, groups.NewManager, and Discover. The two registry instances can reflect different snapshots of local groups state. The Kubernetes path is correctly protected by requiring WithBackendRegistry, but the same consistency risk exists for non-K8s modes.
At minimum, document in options.go that for non-K8s modes the two discoveries can snapshot state at different moments, and recommend WithBackendRegistry when registry consistency between BuildCore and BuildServerConfig is required.
|
|
||
| // TestBuildCore_InlineDiscovery covers the inline (no static backends, no K8s) | ||
| // discovery path. BuildCore must succeed and return a non-nil VMCP and cleanup. | ||
| func TestBuildCore_InlineDiscovery(t *testing.T) { |
There was a problem hiding this comment.
[F4 — MEDIUM] TestBuildCore_InlineDiscovery bypasses the path it claims to test (8/10, test-coverage)
The test is named "InlineDiscovery" and its comment says it covers "the inline (no static backends, no K8s) discovery path", but it passes WithBackendRegistry(reg, nil), which causes BuildCore to skip resolveBackendRegistryForCore entirely. The actual dynamic-discovery path (buildDynamicBackendRegistry → migration.EnsureDefaultGroupExists → groups.NewManager) has zero coverage in this PR.
Rename the test to TestBuildCore_WithInjectedRegistry_SkipsDiscovery to accurately describe what it asserts, and track coverage of the real dynamic path separately.
| // Work on a shallow copy to avoid mutating the caller's config. | ||
| // InjectSubjectProviderNames mutates IncomingAuth strategies in place, so we | ||
| // must own our copy of the struct before modifying it. | ||
| cfgCopy := *cfg |
There was a problem hiding this comment.
[F6 — MEDIUM] Shallow copy does not cover OutgoingAuth — mutation escapes to caller (8/10, go-idioms)
BuildServerConfig copies cfg and cfg.IncomingAuth before calling InjectSubjectProviderNames. However, InjectSubjectProviderNames also mutates BackendAuthStrategy values in cfg.OutgoingAuth.Backends (a map[string]*BackendAuthStrategy). The shallow cfg copy still shares the caller's original OutgoingAuthConfig pointer, so SubjectProviderName mutations on token-exchange strategies are visible in the caller's original cfg after BuildServerConfig returns.
Also copy OutgoingAuth when non-nil:
| cfgCopy := *cfg | |
| cfgCopy := *cfg | |
| cfg = &cfgCopy | |
| if cfg.IncomingAuth != nil { | |
| incomingCopy := *cfg.IncomingAuth | |
| cfg.IncomingAuth = &incomingCopy | |
| } | |
| if cfg.OutgoingAuth != nil { | |
| oaCopy := *cfg.OutgoingAuth | |
| cfg.OutgoingAuth = &oaCopy | |
| } |
Note: if BackendAuthStrategy map values also need deep-copying (to prevent strategy mutations reaching the caller), copy the map entries too.
| // backendregistry.NewKubernetesBackendRegistry. The registry starts empty and is | ||
| // populated by the watcher goroutine, which is bound to ctx. | ||
| func buildKubernetesBackendRegistry(ctx context.Context, cfg *vmcpconfig.Config, o *options) (vmcp.BackendRegistry, error) { | ||
| slog.Info("Kubernetes discovered mode: building backend registry") |
There was a problem hiding this comment.
[F16 — LOW] Log message starts uppercase, inconsistent with peers (7/10, go-idioms)
All sibling log messages in this file use lowercase ("static mode", "dynamic mode", "discovering backends"). This one starts with "Kubernetes" — a minor inconsistency worth fixing for uniformity.
| slog.Info("Kubernetes discovered mode: building backend registry") | |
| slog.Info("kubernetes discovered mode: building backend registry") |
There was a problem hiding this comment.
Resolved in b40482b by removal: this log line lived in buildKubernetesBackendRegistry, which was deleted as part of the F1 fix (BuildCore now requires WithBackendRegistry for the discovered source).
|
|
||
| // TestBuildServerConfig_CleanupIdempotent verifies that cleanup() does not panic | ||
| // and can be called multiple times. | ||
| func TestBuildServerConfig_CleanupIdempotent(t *testing.T) { |
There was a problem hiding this comment.
[F17 — LOW] Cleanup-idempotency test is vacuously true (9/10, api-design + test-coverage)
This test calls cleanup() twice on a config that produces no acquired resources (no authServerRunConfig, no rate-limiting), so cleanupFuncs is empty in both calls. The claimed idempotency is vacuously true — the real double-close behavior of the embedded auth server is not exercised. The same applies to TestBuildCore_CleanupReleasesOnlyOwnedResources in build_core_test.go.
Either rename to TestBuildServerConfig_CleanupDoesNotPanic (accurately describing the assertion), or add a variant that injects a mock authServerRunConfig with a Close method counted for single-invocation verification.
| _, cleanup, err := app.BuildCore(t.Context(), nil) | ||
| require.Error(t, err) | ||
| require.NotNil(t, cleanup) // noop, but must be non-nil | ||
| } |
There was a problem hiding this comment.
[F18 — LOW] deriveHealthMonitorConfig error path untested (7/10, test-coverage)
build_core.go lines 264–268 return an explicit error when HealthCheckInterval > 0 but UnhealthyThreshold < 1. No test in this file exercises that path.
Add a test at the end of this file:
func TestBuildCore_HealthMonitorConfig_InvalidThreshold(t *testing.T) {
t.Parallel()
cfg := minimalInlineConfig()
cfg.Operational = &vmcpconfig.OperationalConfig{
FailureHandling: &vmcpconfig.FailureHandlingConfig{
HealthCheckInterval: vmcpconfig.Duration(time.Second),
UnhealthyThreshold: 0, // invalid
},
}
_, cleanup, err := app.BuildCore(t.Context(), cfg)
require.Error(t, err)
assert.Contains(t, err.Error(), "unhealthy threshold")
assert.NotNil(t, cleanup)
}| _, cleanup, err := app.BuildServerConfig(t.Context(), nil) | ||
| require.Error(t, err) | ||
| require.NotNil(t, cleanup) | ||
| } |
There was a problem hiding this comment.
[F19 — LOW] BuildServerConfig static-backend mode has no test (7/10, test-coverage)
BuildCore has TestBuildCore_StaticBackends verifying that non-empty cfg.Backends causes BuildCore to build its own registry. BuildServerConfig has no equivalent — the symmetric behavior in resolveBackendRegistryForServerConfig is untested.
Add TestBuildServerConfig_StaticBackends mirroring the BuildCore test: set cfg.Backends with a non-empty list, call BuildServerConfig without WithBackendRegistry, and assert success with a non-nil BackendRegistry in the returned config.
| // Guard: if anything below fails, release the core. | ||
| cleanupCoreOnErr := true | ||
| defer func() { | ||
| if cleanupCoreOnErr { |
There was a problem hiding this comment.
[F20 — LOW] Cleanup guard pattern is brittle example code (7/10, go-idioms + general-quality)
The two boolean flags (cleanupCoreOnErr, cleanupSrvOnErr) are correct but require future maintainers to add a third flag for any new resource acquired between the two disarm lines — easy to miss. Since this is example code that embedders will learn from, a more idiomatic pattern would reduce copy-paste risk.
Consider consolidating into a single disarmed := false flag set after vmcpserver.Serve succeeds (success implies both BuildCore and BuildServerConfig succeeded), or add a comment naming the invariant: // If you add a new resource here, add a corresponding cleanup guard before Serve.
Addresses #5672 review comments: - MEDIUM build_core.go (3487247177, F1): make BuildCore return an error for the discovered outgoingAuth source without WithBackendRegistry, symmetric with BuildServerConfig; delete the internal Kubernetes registry build path and the now-dead WithRESTConfig option. Resolves F9/F14/F16 by removal. - MEDIUM exampleembedder/embedder.go (3487247182, F2): add importgraph_test.go anchoring the example in the CI build graph. - MEDIUM build_core.go (3487247185, F3): document that static/dynamic modes build registries independently and may snapshot state at different moments. - MEDIUM build_core_test.go (3487247187, F4): rename misleading InlineDiscovery test to WithInjectedRegistry_SkipsDiscovery; drop the duplicate. - MEDIUM serve_test.go deletions (F5): port TestRunDiscovery_ZeroBackends, TestRunDiscovery_KubernetesGroupNotFound, and TestVMCPNamespace to the app package. - MEDIUM build_server_config.go (3487247188, F6): deep-copy OutgoingAuth before InjectSubjectProviderNames so strategy mutations do not escape to the caller. - LOW build_core.go/build_server_config.go (3487247195, F7): add OutgoingAuthSource{Inline,Discovered} constants and use them. - LOW options.go (3487247197, F8): panic on nil registry in WithBackendRegistry. - LOW build_server_config.go (3487247201, F9): align discovered-mode docstring to MUST. - LOW options.go (3487247203, F10): skip nil Option in applyOptions. - LOW vmcp-library.md (F11): add pkg/vmcp/app row; note LateBoundElicitationRequester. - LOW elicitation_latebound.go (3487247205, F12): add Experimental annotation. - LOW build_core.go (3487247208, F13): log core Close error in cleanup. - LOW build_server_config.go (3487247210, F15): document context.Background() in cleanup. - LOW build_*_test.go (3487247212, F17): rename vacuous cleanup tests to DoesNotPanic. - LOW build_core_test.go (3487247214, F18): add health-monitor invalid-threshold test. - LOW build_server_config_test.go (3487247215, F19): add static-backend test. - LOW exampleembedder/embedder.go (3487247216, F20): consolidate cleanup guards into a single disarm flag. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Closes #5581
vMCP's assembly — turning a
vmcpconfig.Configinto a running server — previously livedentirely in
pkg/vmcp/cli/serve.go(~700 lines). Any embedder (e.g., the enterprisegateway) that wanted a vMCP server had to replicate that wiring by hand, and would break
whenever OSS changed its assembly sequence or internal rules. This PR encapsulates the
assembly behind a public API in the new
pkg/vmcp/apppackage so that callers hand overconfig and get back a
core.VMCPto decorate and a*server.ServerConfigto serve —without reimplementing vMCP internals.
app.BuildCorederivescore.Configfromvmcpconfig.Config(+ options) andreturns the assembled
core.VMCPfor decoration.app.BuildServerConfigindependently derives*server.ServerConfigfrom the sameconfig — the transport config is not a byproduct of core construction.
Optionvalues carry non-serializable runtime injectables: version, host/port,session TTL, telemetry provider, backend registry + watcher, REST config, embedded auth
server run config, status reporter, and elicitation requester.
injected via options so one instance is shared between both
Build*calls — no duplicateOTEL pipelines or K8s informer caches.
pkg/vmcp/cli/serve.gois migrated to callapp.BuildCore+app.BuildServerConfig,reducing it from ~700 lines to ~360 and removing all domain assembly logic from the CLI.
LateBoundElicitationRequesterand its constructor are exported so embedders can wirecomposite-tool elicitation without access to package-internal types.
Type of change
Test plan
task test)task lint-fix)Unit tests in
pkg/vmcp/app/build_core_test.goandpkg/vmcp/app/build_server_config_test.gocover both static-backends and dynamic-discovery modes, option application, and nil-config
rejection. The
pkg/vmcp/app/internal/exampleembedderpackage demonstrates the fullBuildCore → decorate → BuildServerConfig → server.Serveflow and validates that thepublic API types plug directly into
server.Serve.Changes
pkg/vmcp/app/options.goOptionfunctional type and allWith*constructorspkg/vmcp/app/build_core.goBuildCoreand backend-registry resolution helperspkg/vmcp/app/build_core_test.goBuildCorepkg/vmcp/app/build_server_config.goBuildServerConfigand transport derivation helperspkg/vmcp/app/build_server_config_test.goBuildServerConfigpkg/vmcp/app/header_forward_env.gopkg/vmcp/cli/(no logic change)pkg/vmcp/app/header_forward_env_test.gopkg/vmcp/cli/(no logic change)pkg/vmcp/app/internal/exampleembedder/embedder.gopkg/vmcp/cli/serve.goapp.BuildCore+app.BuildServerConfigpkg/vmcp/cli/serve_test.goserve.gopkg/vmcp/server/elicitation_latebound.goLateBoundElicitationRequester,NewLateBoundElicitationRequester, andBindDoes this introduce a user-facing change?
No. The CLI behavior of
thv vmcp serveis unchanged. The newpkg/vmcp/apppackage isadditive public API intended for embedders; no existing user-facing behavior is modified.
Special notes for reviewers
Build*functions are intentionally independent derivations from the samevmcpconfig.Config, mirroring the existingderiveCoreConfig/deriveServerConfigsplit.BuildCoredoes not return aServerConfig— the transport config is not a byproduct ofcore construction.
WithBackendRegistry;BuildServerConfigreturns an error if it is absent, to preventstarting two K8s informer caches.
LateBoundElicitationRequesterexport (pkg/vmcp/server) adds a small amount ofsurface to an existing unexported type. The package-internal
newLateBoundElicitationRequesterand
bindaliases are retained soserver.Newcall sites need no change.Generated with Claude Code