Fix macOS+Bun E2E hang (cross-binary keychain read) + per-test timeout watchdog #265
Merged
The macOS/Bun E2E job hung until GitHub's 6h hard kill, while macOS/Node and Linux/Bun pass the same suite. Because run.sh only prints results after every test finishes, the log never revealed which test hung. The Unix parallel path (xargs, used on macOS and Linux) had no per-test timeout; only the Windows path did.

Add a watchdog to run_test that kills a test exceeding PER_TEST_TIMEOUT (default 180s, override via E2E_PER_TEST_TIMEOUT), records it as a failure, and appends a TIMEOUT notice naming the test once it's dead (after the process exits, to avoid a write race that would clobber the message). The Windows path now shares the same knob. Also add timeout-minutes: 45 to every E2E job as a hard backstop.

This converts the hang into a fast, self-diagnosing failure. It does not fix the underlying macOS+Bun hang itself, which appears specific to the native @napi-rs/keyring path exercised only on macOS (Linux/Bun falls back to file storage); the next macOS/Bun run will name the culprit test(s).

Refs #248
https://claude.ai/code/session_01417BuEkifr5jSSx6R2MYCB
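The watchdog behaviour described above can be sketched roughly as follows. The real watchdog lives in the shell script run.sh; this TypeScript version is only an illustration, and `runTestWithWatchdog`, `TestResult`, and the wording of the notice are hypothetical names, not the project's code. It assumes Node's `spawnSync` with its `timeout` and `killSignal` options.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical result shape; the real run.sh records results differently.
interface TestResult {
  name: string;
  passed: boolean;
  timedOut: boolean;
  log: string;
}

// Default of 180s with an E2E_PER_TEST_TIMEOUT override, as described above.
const PER_TEST_TIMEOUT_S = Number(process.env.E2E_PER_TEST_TIMEOUT) || 180;

function runTestWithWatchdog(
  name: string,
  command: string,
  args: string[],
  timeoutMs: number = PER_TEST_TIMEOUT_S * 1000,
): TestResult {
  // spawnSync sends killSignal once `timeout` elapses and only returns
  // after the child has exited.
  const child = spawnSync(command, args, {
    timeout: timeoutMs,
    killSignal: "SIGKILL",
    encoding: "utf8",
  });
  const error = child.error as (Error & { code?: string }) | undefined;
  const timedOut = error?.code === "ETIMEDOUT";

  let log = `${child.stdout ?? ""}${child.stderr ?? ""}`;
  if (timedOut) {
    // Appended only after the process is dead, so the test's own output
    // cannot overwrite the notice.
    log += `\nTIMEOUT: ${name} exceeded ${timeoutMs / 1000}s and was killed\n`;
  }
  return { name, passed: !timedOut && child.status === 0, timedOut, log };
}
```

A timed-out test is reported as a failure with the test name in the log, which is the property that turns a silent 6h hang into a named culprit.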
jancurn pushed a commit that referenced this pull request on Jun 8, 2026
…eout

When the watchdog kills a hung test, capture diagnostics first so the failure log is actionable instead of opaque: the hung process tree, a native stack (macOS `sample`, no privileges needed) of each stuck mcpc/bridge process, and the per-test bridge log. The diagnostics are written to a side file while the test is still alive, to avoid racing the test's own output, and are appended to the log after the kill.

This is what's needed to pin down the macOS+Bun hang in sessions/proxy and sessions/unauthorized-auto-detect, which only reproduces in CI.

Refs #265
https://claude.ai/code/session_01417BuEkifr5jSSx6R2MYCB
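As an illustration of the diagnostics step, the sketch below builds the commands a watchdog might run before killing a hung test. The helper name `diagnosticCommands`, the side-directory layout, and the 3-second sampling duration are assumptions made for this example; the actual script may gather the same data differently. The `sample <pid> <seconds> -file <path>` form is the macOS tool's standard usage.

```typescript
// Hypothetical helper: lists the commands to run while the hung test is
// still alive, each writing into a side directory rather than the test log.
interface DiagnosticCommand {
  description: string;
  argv: string[];
}

function diagnosticCommands(
  platform: string,
  stuckPids: number[],
  sideDir: string,
): DiagnosticCommand[] {
  const commands: DiagnosticCommand[] = [
    {
      description: "process list with parent PIDs (enough to rebuild the tree)",
      argv: ["ps", "-A", "-o", "pid,ppid,command"],
    },
  ];
  if (platform === "darwin") {
    // `sample` exists only on macOS and needs no elevated privileges.
    for (const pid of stuckPids) {
      commands.push({
        description: `native stack of pid ${pid}`,
        argv: ["sample", String(pid), "3", "-file", `${sideDir}/sample-${pid}.txt`],
      });
    }
  }
  return commands;
}
```

A caller could pass `process.platform` and the PIDs of the stuck mcpc/bridge processes, run each `argv`, and append the side files to the failure log once the test has been killed.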
jancurn pushed a commit that referenced this pull request on Jun 8, 2026
…hang

The two E2E tests that hung on macOS+Bun (sessions/proxy and sessions/unauthorized-auto-detect) were blocked inside a synchronous native macOS Keychain read (SecKeychainFindGenericPassword). macOS keychain ACLs are per-binary, so reading an item created by a *different* binary triggers a Security access prompt that blocks forever in headless CI.

The CLI runs under Bun but spawned the bridge under a hardcoded `node`, so the bridge read keychain items the CLI had written under a different binary, e.g. the proxy bearer token (sessions/proxy). Spawn the bridge with process.execPath so the CLI and bridge share one runtime, and thus one keychain identity. As a bonus, a Bun user no longer needs Node on PATH for the bridge to start.

Also seed the keychain in sessions/unauthorized-auto-detect via the test's own runtime instead of a hardcoded `node`, for the same reason (the Bun CLI reads back what the seed wrote).

Diagnosed from the timeout watchdog's `sample` backtrace of the hung process.

Refs #265
https://claude.ai/code/session_01417BuEkifr5jSSx6R2MYCB
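The runtime change can be pictured with a small sketch. `bridgeCommand` and the script path used below are hypothetical; the only detail taken from the commit message is that the bridge is launched with `process.execPath` rather than a literal `node`.

```typescript
// Hypothetical helper: decides which executable launches the bridge.
interface BridgeCommand {
  command: string;
  args: string[];
}

function bridgeCommand(
  execPath: string,
  bridgeScript: string,
  bridgeArgs: string[] = [],
): BridgeCommand {
  // Previously the command was the literal "node", whatever runtime the CLI
  // used. Reusing the CLI's own executable means keychain items are written
  // and read by the same binary.
  return { command: execPath, args: [bridgeScript, ...bridgeArgs] };
}
```

In the CLI this would be called with `process.execPath`, so a CLI started by Bun spawns a Bun bridge and a CLI started by Node spawns a Node bridge.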
jancurn pushed a commit that referenced this pull request on Jun 8, 2026
… hang)

The macOS+Bun E2E hang (sessions/proxy, sessions/unauthorized-auto-detect) was a synchronous native macOS Keychain read blocking on a Security access prompt: keychain ACLs are per-binary, so reading an item created by a *different* binary prompts, and blocks forever in headless CI.

The bridge runs under Node while the CLI runs under Bun, and the bridge read the proxy bearer token straight from the keychain: an item the Bun CLI had written, i.e. a cross-binary read. The CLI now reads it before spawn (same keychain identity) and hands it to the bridge over IPC, exactly as it already does for headers and OAuth credentials; the bridge no longer touches the keychain outside the sanctioned OAuth-refresh path.

The bridge stays on Node deliberately: its proxy/TLS support uses undici, which Bun's fetch ignores, so a Bun bridge would break --insecure and HTTPS_PROXY.

Also seed the keychain in sessions/unauthorized-auto-detect via the test's own runtime instead of a hardcoded `node`, for the same per-binary reason.

Refs #265
https://claude.ai/code/session_01417BuEkifr5jSSx6R2MYCB
jancurn pushed a commit that referenced this pull request on Jun 8, 2026
…hang

The macOS+Bun E2E hang (sessions/proxy, sessions/unauthorized-auto-detect) was a synchronous native macOS Keychain read blocking on a Security access prompt: keychain ACLs are per-binary, so reading an item created by a *different* binary prompts, and blocks forever in headless CI.

The CLI runs under Bun but spawned the bridge under a hardcoded `node`, so the bridge read keychain items the Bun CLI had written (e.g. the proxy bearer token). Spawn the bridge with process.execPath so the CLI and bridge share one runtime, and one keychain identity. A Bun user also no longer needs Node installed for the bridge to start.

Bun's fetch ignores undici's TLS-bypass dispatcher, so `--insecure` cannot skip certificate verification under a Bun bridge. Rather than be silently ineffective, startBridge now fails with a clear error when `--insecure` is used under Bun; covered by a runtime-aware e2e test.

Also seed the unauthorized-auto-detect keychain via the test's own runtime instead of a hardcoded `node` (same per-binary reason).

Refs #248, #265
https://claude.ai/code/session_01417BuEkifr5jSSx6R2MYCB
jancurn pushed a commit that referenced this pull request on Jun 8, 2026
With the bridge now running under the CLI's runtime, its keychain access is same-binary, but the bridge still read the proxy bearer token directly from the keychain in startProxyServer(), the lone keychain read outside the sanctioned OAuth-refresh path (CLAUDE.md / #55). A post-spawn keychain read can also race the bridge's IPC-credential timer if the keychain is locked.

Read the token in the CLI before spawn and hand it to the bridge over the existing set-auth-credentials IPC message, exactly as headers and OAuth credentials are already delivered. The keychain stays the at-rest store (so authenticated proxy sessions survive restarts without re-passing the flag); only the reader moves from the bridge to the CLI. The bridge now touches the keychain only on the OAuth-refresh path.

Refs #248, #265
https://claude.ai/code/session_01417BuEkifr5jSSx6R2MYCB
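The hand-off described above might look roughly like the sketch below. Only the message name `set-auth-credentials` and the idea of a `proxyBearerToken` come from the commit messages; the field layout, the `KeychainReader` type, the account naming, and `buildAuthCredentialsMessage` are assumptions for illustration, not the project's actual IPC schema.

```typescript
// Hypothetical message shape for the existing set-auth-credentials message.
interface SetAuthCredentialsMessage {
  type: "set-auth-credentials";
  headers?: Record<string, string>;
  proxyBearerToken?: string;
}

// Stand-in for the CLI's keychain access; returns undefined when no item exists.
type KeychainReader = (account: string) => string | undefined;

function buildAuthCredentialsMessage(
  sessionName: string,
  headers: Record<string, string> | undefined,
  readKeychain: KeychainReader,
): SetAuthCredentialsMessage {
  // The CLI reads the token before spawning the bridge, so the read happens
  // in the same binary that wrote the item. The account name is hypothetical.
  const proxyBearerToken = readKeychain(`proxy-bearer-token:${sessionName}`);
  const message: SetAuthCredentialsMessage = { type: "set-auth-credentials" };
  if (headers) message.headers = headers;
  if (proxyBearerToken) message.proxyBearerToken = proxyBearerToken;
  return message;
}
```

The bridge then receives the token the same way it receives headers, and has no reason to open the keychain for it.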
jancurn pushed a commit that referenced this pull request on Jun 10, 2026
Bun's fetch ignores undici's TLS-bypass dispatcher, so under a Bun bridge the `--insecure` flag could not skip certificate verification. Rather than fail loudly, set NODE_TLS_REJECT_UNAUTHORIZED=0 in the bridge process's environment when --insecure is given: Bun honors it (verified against a self-signed server, on the undici-fetch path the bridge actually uses), and it is a harmless no-op alongside the existing undici dispatcher on Node. --insecure now works under both runtimes, which also matches the pre-PR behaviour, where the bridge ran under Node.

The variable is set via the spawn env so it is in place before the runtime initializes TLS, and it is scoped to the one bridge process. The insecure e2e test returns to being runtime-agnostic (it asserts --insecure works under whatever runtime runs it), and the CHANGELOG no longer claims a Bun --insecure change, because net of this PR there is none.

Also log proxyBearerToken presence in the bridge's auth-credentials debug summary.

Refs #248, #265
https://claude.ai/code/session_01417BuEkifr5jSSx6R2MYCB
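A minimal sketch of scoping the variable to the bridge process, assuming the bridge is started with an explicit `env` option on spawn. `bridgeEnv` is a hypothetical helper name; `NODE_TLS_REJECT_UNAUTHORIZED` is the real variable named in the commit message.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: builds the environment for the bridge child process.
function bridgeEnv(baseEnv: Env, insecure: boolean): Env {
  // Copy so the CLI's own environment is left untouched; the override
  // applies only to the spawned bridge.
  const env: Env = { ...baseEnv };
  if (insecure) {
    // Present from process start, i.e. before the runtime initializes TLS.
    env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
  }
  return env;
}
```

The result would be passed as the `env` option when spawning the bridge, for example `spawn(process.execPath, args, { env: bridgeEnv(process.env, insecure) })`.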
jancurn added a commit that referenced this pull request on Jun 10, 2026
…266)

Completes #265 (which added only the per-test timeout watchdog): fixes the macOS+Bun E2E hang.

Root cause, pinned via the watchdog's `sample` backtrace: a Bun CLI spawned the bridge under a hardcoded `node`, so the Node bridge did a cross-binary macOS Keychain read that blocks on a Security prompt in headless CI.

- Run the bridge under the CLI's runtime (`process.execPath`): one keychain identity; a Bun user no longer needs Node installed.
- Deliver the proxy bearer token to the bridge over IPC, so the bridge's only keychain access is the OAuth-refresh path (#55).
- `--insecure` works under both runtimes (Bun via `NODE_TLS_REJECT_UNAUTHORIZED=0` on the bridge, since Bun ignores undici's TLS-bypass).
- The timeout watchdog now dumps the process tree, a native `sample` backtrace, and the bridge log.
- Seed `unauthorized-auto-detect`'s keychain via the test's own runtime.

Green on macOS + Linux E2E (Bun & Node), unit, build, lint.

Refs #248, #265
https://claude.ai/code/session_01417BuEkifr5jSSx6R2MYCB

---------

Co-authored-by: Claude <noreply@anthropic.com>
Fixes the macOS+Bun E2E job hanging until GitHub's 6h hard kill. Root cause (pinned via a `sample` backtrace of the hung process): the Node bridge did a cross-binary native macOS Keychain read of the proxy-bearer-token the Bun CLI had written; macOS gates that with a Security access prompt that blocks forever in headless CI.

- … `--insecure`/proxy don't work under Bun).
- … `timeout-minutes` on every E2E job as a backstop.
- … `sample` backtrace, and the bridge log, which is what pinned this down.
- … `unauthorized-auto-detect`'s keychain via the test's own runtime (same per-binary reason).

Verified: macOS Node + Bun E2E green; Linux Node + Bun green; unit tests, build, lint pass.
Refs #248
https://claude.ai/code/session_01417BuEkifr5jSSx6R2MYCB