
Auto-reconnect crashed and unauthorized bridge processes #103

Merged
jancurn merged 25 commits into main from claude/auto-restart-bridge-processes-oidW9 on Apr 5, 2026


@jancurn commented on Mar 24, 2026

Member

Summary

  • Crashed and unauthorized bridge processes are automatically reconnected in the background whenever sessions are enumerated (mcpc or mcpc grep)
  • Unauthorized sessions benefit from OAuth tokens refreshed by other sessions sharing the same profile
  • New connecting and reconnecting transient session states give users visibility into bridge lifecycle
  • Bridge detects when the server did not resume the original MCP session (different or missing session ID) and marks the session as expired
  • First keepalive ping fires 5s after startup (instead of waiting the full 30s interval) to catch stale sessions early
  • ASCII state transition diagram added to README
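The early keepalive ping from the summary can be sketched as follows. This is a minimal illustration, not the code in src/bridge/index.ts: `nextPingDelayMs`, `startKeepalive`, and both constants are hypothetical names, and the `ping` callback stands in for whatever the bridge actually uses to ping the server. Only the 5s and 30s values come from this PR.

```typescript
// Hypothetical constants mirroring the values quoted in the summary.
const FIRST_PING_DELAY_MS = 5_000;
const KEEPALIVE_INTERVAL_MS = 30_000;

// Delay before the next ping, given how many pings have already been sent.
function nextPingDelayMs(pingsSent: number): number {
  return pingsSent === 0 ? FIRST_PING_DELAY_MS : KEEPALIVE_INTERVAL_MS;
}

// Starts a keepalive loop and returns a function that stops it.
// `ping` is a placeholder for the bridge's real ping call (an assumption).
function startKeepalive(ping: () => Promise<void>): () => void {
  let pingsSent = 0;
  let stopped = false;
  let timer: ReturnType<typeof setTimeout> | undefined;

  const schedule = (): void => {
    timer = setTimeout(async () => {
      try {
        await ping();
      } catch {
        // A real bridge would update the session status here.
      }
      pingsSent += 1;
      if (!stopped) schedule();
    }, nextPingDelayMs(pingsSent));
  };

  schedule();
  return () => {
    stopped = true;
    if (timer !== undefined) clearTimeout(timer);
  };
}
```

Scheduling each ping with its own timeout, rather than a fixed interval, keeps the short first delay and the regular interval in one place.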

Key changes

Session states

  • Added connecting state — shown during initial mcpc connect before bridge is ready
  • Added reconnecting state — shown when a crashed/unauthorized bridge is being auto-reconnected
  • Both display as yellow filled dots (●) in human output; stale transient states (>10s with no PID) fall back to crashed
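The stale-state fallback can be sketched roughly as below. The type lists only the states named in this PR description, so it is not the full SessionStatus type, and `resolveTransientStatus`, `SessionSketch`, and the choice to measure staleness from lastConnectionAttemptAt are assumptions made for illustration.

```typescript
// Only the states named in this PR description; the real type has more.
type StatusSketch =
  | "active"
  | "connecting"
  | "reconnecting"
  | "crashed"
  | "unauthorized"
  | "expired";

interface SessionSketch {
  status: StatusSketch;
  pid?: number; // bridge PID, absent until a bridge process exists
  lastConnectionAttemptAt?: string; // ISO timestamp (assumed format)
}

const TRANSIENT_STALE_MS = 10_000;

// Returns the status to display: a transient state with no PID that has
// been pending for more than 10s is reported as crashed.
function resolveTransientStatus(s: SessionSketch, nowMs: number): StatusSketch {
  const transient = s.status === "connecting" || s.status === "reconnecting";
  if (!transient || s.pid !== undefined) return s.status;

  const startedMs = s.lastConnectionAttemptAt
    ? Date.parse(s.lastConnectionAttemptAt)
    : NaN;
  // Missing or unparsable timestamp is treated as stale (an assumption).
  const stale = isNaN(startedMs) || nowMs - startedMs > TRANSIENT_STALE_MS;
  return stale ? "crashed" : s.status;
}
```

Passing the current time in as a parameter keeps the function pure, which makes the 10-second boundary easy to test.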

Auto-reconnect

  • consolidateSessions() identifies crashed and unauthorized sessions eligible for background reconnection (10-second cooldown via lastConnectionAttemptAt)
  • reconnectCrashedSessions() fires off restartBridge() calls without blocking the CLI command
  • The bridge process itself sets the final status (active on success, expired/unauthorized on failure)
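A rough sketch of the eligibility check with its cooldown is shown below. It is not the actual consolidateSessions() code: `isEligibleForReconnect`, `claimForReconnect`, and `ReconnectCandidate` are hypothetical, the file locking around sessions.json is omitted, and marking the session as reconnecting at claim time is an assumption.

```typescript
const RECONNECT_COOLDOWN_MS = 10_000;

interface ReconnectCandidate {
  name: string;
  status: string;
  lastConnectionAttemptAt?: string; // ISO timestamp (assumed format)
}

// Only crashed and unauthorized sessions qualify, and only if the last
// attempt is at least the cooldown period in the past.
function isEligibleForReconnect(s: ReconnectCandidate, nowMs: number): boolean {
  if (s.status !== "crashed" && s.status !== "unauthorized") return false;
  if (!s.lastConnectionAttemptAt) return true; // never attempted before
  return nowMs - Date.parse(s.lastConnectionAttemptAt) >= RECONNECT_COOLDOWN_MS;
}

// Marks eligible sessions as reconnecting and stamps the attempt time,
// so a second CLI run inside the cooldown does not pick them up again.
// Returns the names of the sessions whose bridges should be restarted.
function claimForReconnect(sessions: ReconnectCandidate[], nowMs: number): string[] {
  const claimed: string[] = [];
  for (const s of sessions) {
    if (!isEligibleForReconnect(s, nowMs)) continue;
    s.status = "reconnecting";
    s.lastConnectionAttemptAt = new Date(nowMs).toISOString();
    claimed.push(s.name);
  }
  return claimed;
}
```

The caller would then start a bridge restart for each returned name without awaiting it, so the CLI command itself is not delayed.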

MCP session ID validation

  • Bridge detects session ID mismatch after reconnection: if we sent an old session ID but the server returned a different one (or none at all), the session is marked expired
  • Covers the case where the server transparently creates a new session instead of returning 404
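The mismatch rule can be expressed as a small pure function. This is a sketch under assumptions, not the code in src/bridge/index.ts: `statusAfterConnect` and its parameter names are hypothetical, and a fresh connect (no previous ID sent) is assumed to always count as active.

```typescript
// sentId:     the MCP session ID the bridge sent when trying to resume,
//             or undefined on a fresh connection.
// returnedId: the session ID the server reported after connecting,
//             or undefined if it reported none.
function statusAfterConnect(
  sentId: string | undefined,
  returnedId: string | undefined,
): "active" | "expired" {
  // Fresh connection: there was nothing to resume.
  if (sentId === undefined) return "active";
  // Resume attempt: anything other than the same ID means the server
  // did not resume the original session.
  return returnedId === sentId ? "active" : "expired";
}
```

Comparing for strict equality covers both failure cases at once: a different ID and no ID at all.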

Documentation

  • README session lifecycle section updated with all 7 states and an ASCII state transition diagram
  • CLAUDE.md session states updated

Files changed

  • src/lib/types.ts: SessionStatus type expanded, lastConnectionAttemptAt field added
  • src/lib/sessions.ts: consolidateSessions() marks crashed/unauthorized sessions for reconnection
  • src/lib/bridge-manager.ts: reconnectCrashedSessions(), status management in ensureBridgeReady()
  • src/lib/session-client.ts: reconnecting/active status transitions during retry
  • src/bridge/index.ts: Session ID mismatch detection, status: 'active' on success, early keepalive ping
  • src/cli/commands/sessions.ts: connecting state on initial connect, display logic for new states
  • src/cli/commands/grep.ts: Calls reconnectCrashedSessions after consolidation
  • README.md: State transition diagram
  • CLAUDE.md: Updated session states
  • CHANGELOG.md: New entries
  • test/e2e/suites/sessions/expired.test.sh: Accepts reconnecting as valid state
  • test/e2e/suites/sessions/failover.test.sh: Accepts reconnecting as valid state

Test plan

  • Unit tests pass (430/430)
  • Create session, kill bridge PID, run mcpc — should show reconnecting then live
  • Kill bridge, run mcpc twice rapidly — second run should not trigger another reconnect (10s cooldown)
  • Create two sessions to same OAuth server, let one become unauthorized, verify auto-reconnect picks up refreshed tokens
  • Kill bridge for a session with saved MCP session ID, verify expired if server doesn't resume

https://claude.ai/code/session_01FJXX4xys8aMoF4iVZbkC86

jancurn and others added 10 commits March 23, 2026 22:50

Adds a new grep command that searches tools, resources, and prompts
by name and description across all active MCP sessions. Supports
regex matching (-E), case-sensitive mode (-s), and type filters
(--tools, --resources, --prompts). Available as both a top-level
command (all sessions) and session command (single session).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add recovery hints for crashed and expired sessions in session list

Show actionable hints under crashed and expired sessions in `mcpc` output,
similar to the existing hint for unauthorized sessions.

https://claude.ai/code/session_01D6ovkixxWohRHXaJP7Hh3g

* Remove extra text from crashed session hint

https://claude.ai/code/session_01D6ovkixxWohRHXaJP7Hh3g

---------

Co-authored-by: Claude <noreply@anthropic.com>

When enumerating sessions (e.g. `mcpc` or `mcpc grep`), crashed bridges
are now automatically restarted in the background without blocking the
command. A 10-second cooldown (connect timeout + 5s buffer) between
restart attempts prevents rapid retries. The lastRestartAttemptAt
timestamp is persisted in sessions.json and checked inside the file lock
to avoid concurrent restart races.

https://claude.ai/code/session_01FJXX4xys8aMoF4iVZbkC86

Avoid restarting a bridge that just crashed: wait for the cooldown
period after lastSeenAt to give the old process time to fully clean up
and prevent socket conflicts in shared-home environments.

https://claude.ai/code/session_01FJXX4xys8aMoF4iVZbkC86

Introduce two new transient session states that give users visibility
into bridge lifecycle transitions:

- 'connecting': shown during initial `mcpc connect` before bridge is ready
- 'reconnecting': shown when a crashed bridge is being auto-reconnected

Both display as yellow filled dots (●) in human output and as string
values in JSON output. Stale transient states (>10s with no PID)
automatically fall back to 'crashed'.

Also rename lastRestartAttemptAt → lastConnectionAttemptAt and
autoRestartCrashedSessions → reconnectCrashedSessions to better reflect
that this is a bridge reconnection, not a full session restart.

https://claude.ai/code/session_01FJXX4xys8aMoF4iVZbkC86

Resolve conflicts with PR #100 (grep command) and PR #105 (help fix):
- Take main's grep.ts (more features: --instructions, -m, capability-aware)
  and re-apply reconnectCrashedSessions call
- Take main's index.ts grep options
- Export DisplayStatus/getBridgeStatus (used by grep.ts) with new states
- Merge changelog entries

https://claude.ai/code/session_01FJXX4xys8aMoF4iVZbkC86

@jancurn changed the title from "Auto-restart crashed bridge processes in the background" to "Auto-reconnect crashed bridge processes in the background" on Mar 24, 2026

claude and others added 15 commits March 24, 2026 16:53

Sessions with dead bridges may show as 'reconnecting' (auto-reconnect
in progress) instead of 'crashed', depending on timing. Update the
expired and failover tests to accept both states.

https://claude.ai/code/session_01FJXX4xys8aMoF4iVZbkC86

Move post-connection status management into the bridge process itself
rather than setting 'active' prematurely in reconnectCrashedSessions():

- Bridge sets status='active' after successful MCP connection
- Bridge detects session ID mismatch (server issued new ID instead of
  resuming) and marks session as 'expired' — prevents silently
  continuing with a different session identity
- reconnectCrashedSessions() no longer sets 'active'; it only reverts
  to 'crashed' on startup failure if bridge didn't set a terminal status

https://claude.ai/code/session_01FJXX4xys8aMoF4iVZbkC86

Include unauthorized sessions (alongside crashed) as candidates for
background auto-reconnection. When multiple sessions share the same
OAuth profile, one session refreshing the tokens allows others to
reconnect — the bridge reads from the OS keychain on startup and
picks up the refreshed tokens automatically.

https://claude.ai/code/session_01FJXX4xys8aMoF4iVZbkC86

Add a visual diagram showing all possible transitions between session
states, including the new connecting and reconnecting states. Update
the prose to reflect auto-reconnect behavior for crashed and
unauthorized sessions.

https://claude.ai/code/session_01FJXX4xys8aMoF4iVZbkC86

Two fixes to detect expired MCP sessions faster:

1. Session ID mismatch detection now catches all non-resume cases,
   including when the server doesn't return any session ID at all
   (previously only detected when both old and new IDs were present).

2. Bridge sends first keepalive ping 5s after startup instead of
   waiting the full 30s interval, catching stale sessions early.

https://claude.ai/code/session_01FJXX4xys8aMoF4iVZbkC86

Show 'reconnecting' as a single box with arrows from expired,
unauthorized, and crashed converging into it.

https://claude.ai/code/session_01FJXX4xys8aMoF4iVZbkC86

@jancurn changed the title from "Auto-reconnect crashed bridge processes in the background" to "Auto-reconnect crashed and unauthorized bridge processes" on Apr 5, 2026
@jancurn merged commit 463d7aa into main on Apr 5, 2026
6 checks passed
@jancurn deleted the claude/auto-restart-bridge-processes-oidW9 branch on April 7, 2026 22:26
