Skip to content

fix(runtime): retain persistent ACP client across turns#265

Merged
steipete merged 2 commits into
openclaw:mainfrom
Sway-Chan:fix/runtime-persistent-client-pool
Apr 25, 2026
Merged

fix(runtime): retain persistent ACP client across turns#265
steipete merged 2 commits into
openclaw:mainfrom
Sway-Chan:fix/runtime-persistent-client-pool

Conversation

@Sway-Chan

@Sway-Chan Sway-Chan commented Apr 21, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #264.

Two edits to src/runtime/engine/manager.ts:

  1. runTurn finally now returns the client to pendingPersistentClients when the record is persistent (recordId does not contain ":oneshot:") and the backend reports a reusable session, restoring lifecycle symmetry with ensureSession.
  2. close() now cleans the pool on the non-discardPersistentState path, so AcpxManager.evictIdleRuntimeHandles (reason="idle-evicted") does not leak pooled CLI processes.

Test plan

  • Manual functional verification via equivalent edits to the distributed register.runtime-*.js, then restarting the gateway: two turns against the same persistent gemini-acp label; pid and agent_started_at stable across turns; pooled process reaped on next unrelated spawn (see issue for details).
  • CI / upstream test suite (for maintainer review).

Related

  • Linked issue describes the lifecycle asymmetry, reproduction, and validation data.

@enki

enki commented Apr 21, 2026

Copy link
Copy Markdown
Contributor

Thanks for digging into this. We confirmed the lifecycle bug affects our proving use of acpx/runtime.

I ported the same fix into our active runtime branches, with two extra regression cases:

  • persistent turns retain and reuse the same live ACP client
  • close() racing an active persistent turn does not allow the turn finalizer to re-pool the client or overwrite the closed state

Updated branches/PRs:

Both branch-specific runtime-manager suites pass, and the full pnpm test passed on rerun for both branches. The local integration branch has the same port as well.

So yes, this was a real bug and the core fix direction in this PR is correct. The only practical difference is that our current branches needed the fix applied to their refactored turn lifecycle path rather than cherry-picking this patch directly.

Sway-Chan and others added 2 commits April 25, 2026 06:43
ensureSession places persistent clients into pendingPersistentClients
(keepClientOpen=true), but runTurn's finally unconditionally closes
the client, breaking lifecycle symmetry and causing per-send cold-start
for persistent sessions.

- runTurn finally: mirror the persistent branch symmetrically by
  returning the client to the pool when the record is persistent
  (recordId does not contain ":oneshot:"), the record is still open,
  and the backend reports a reusable session.
- close(): when invoked without discardPersistentState (e.g. via
  AcpxManager.evictIdleRuntimeHandles with reason="idle-evicted"),
  clean the pool so we do not leak pooled CLI processes.

Validated locally by applying equivalent edits to dist/register.runtime-*.js,
restarting the gateway, then running two turns against the same persistent
gemini-acp label: PID and agent_started_at remained stable across turns,
and the pooled process was reaped on the next unrelated spawn which
triggered evictIdleRuntimeHandles.
@steipete steipete force-pushed the fix/runtime-persistent-client-pool branch from 62717ea to 715141a Compare April 25, 2026 05:48
@steipete steipete merged commit 9a54080 into openclaw:main Apr 25, 2026
@steipete

Copy link
Copy Markdown
Contributor

Landed via maintainer fixup and rebase onto main.

  • Gate: pnpm run check passed locally (format, typecheck, lint, build, viewer type/build, coverage: 530 tests)
  • Targeted proof: persistent runtime manager regression covers two turns reusing one client and runtime close reaping the pooled client
  • Land commit: 9a54080
  • Source fixup commit before rebase: Sway-Chan@715141a

Thanks @Sway-Chan!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[bug] persistent ACP client not returned to pool after runTurn

3 participants