docs: runner doctor update — A13, B5, B6 + portable agent A12 sync #5590
Pull request overview
Updates the self-hosted runner “doctor” documentation set to incorporate newly observed AWF failure modes from the latest weekly scan and to keep the portable agent copy aligned with the shared workflow guidance.
Changes:
- Added new failure modes A13 (ARC/DinD split-fs empty base-userland), B5 (cli-proxy + topology attach startup deadlock), and B6 (rootless runs leaving unreadable artifacts) to the shared failure-mode catalog and runner-doctor docs.
- Expanded error-string quick-lookup mappings and the “known unresolved” list to include A13.
- Synced the portable `.github/agents/self-hosted-runner-doctor.md` catalog content to include A12 and the newly added A13/B5/B6 entries.
Show a summary per file
| File | Description |
|---|---|
| .github/workflows/shared/self-hosted-failure-modes.md | Adds A13/B5/B6 failure-mode rows, new error-string lookups, and marks A13 as a known unresolved item. |
| .github/workflows/self-hosted-runner-doctor.md | Updates triage hints and the known-unresolved section to include A13/B5/B6. |
| .github/agents/self-hosted-runner-doctor.md | Syncs the portable runner-doctor catalog (adds A12 + A13/B5/B6 + missing lookup rows) to align with the shared sources. |
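The error-string quick lookup mentioned above could be sketched roughly as follows. This is a minimal illustration, not the catalog's actual format: the `ERROR_LOOKUP` table and the `classify` helper are hypothetical names, and the substrings are taken from the failure-mode descriptions in this PR.

```python
# Hypothetical quick-lookup table: log substrings mapped to catalog IDs.
# The substrings come from this PR's description; the real catalog is a
# markdown table, so this structure is only an assumption for illustration.
ERROR_LOOKUP = [
    ("chroot: failed to run command '/bin/sh'", "A13"),
    ("EAI_AGAIN", "B5"),
    ("EACCES", "B6"),
]


def classify(log_text: str) -> list[str]:
    """Return the catalog IDs whose signature substring appears in log_text."""
    return [mode for needle, mode in ERROR_LOOKUP if needle in log_text]


if __name__ == "__main__":
    sample = "chroot: failed to run command '/bin/sh': No such file or directory"
    print(classify(sample))
```

A bare `EACCES` match is broad, so a real triage step would presumably also check that the failing step is `upload-artifact` before settling on B6.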
Review details
- Files reviewed: 3/3 changed files
- Comments generated: 3
- Review effort level: Low
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

✅ Copilot review passed with no inline comments. @copilot Add the

✅ Smoke Copilot BYOK completed. Copilot BYOK mode operational. 🔓

✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟

✅ Smoke Copilot BYOK AOAI (api-key) completed. Copilot AOAI BYOK (api-key) mode operational. 🔓

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

📡 Smoke OTel Tracing completed. All tracing scenarios validated. ✅

🚀 Security Guard has started processing this pull request

✅ Smoke Gemini completed. All facets verified. 💎 Smoke test completed. Overall status: FAIL due to connectivity issues. Label 'smoke-gemini' was not added.

❌ Contribution Check failed. Please review the logs for details.

Chroot tests passed! Smoke Chroot - All security and functionality tests succeeded.

✅ Build Test Suite completed successfully!

✅ Smoke Copilot BYOK AOAI (Entra) completed. Copilot AOAI BYOK (Entra) mode operational. 🔓

✅ Smoke Claude passed

🔌 Smoke Services: All services reachable! ✅

🔑 Smoke Copilot PAT: PAT auth validated. All systems operational. ✅
🔬 Smoke Test Results
PR: docs: runner doctor update — A13, B5, B6 + portable agent A12 sync
Overall: PARTIAL PASS: MCP ✅, pre-step outputs unavailable (template vars not expanded in workflow)

Smoke Test: Copilot BYOK (Direct) Mode ✅ PASS
Tests:
Mode: Direct BYOK (COPILOT_PROVIDER_API_KEY via api-proxy sidecar)
Assignees:

Smoke Test: Claude Engine Validation
Overall result: PASS

🔬 Smoke Test: PAT Auth, PR #5590
Overall: PASS (2/2 verifiable tests passed; file test skipped because template vars were unresolved)
Auth mode: PAT (COPILOT_GITHUB_TOKEN) | PR by

Merged PRs:
Results:
Overall: PASS

Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "registry.npmjs.org"
See Network Configuration for more information.

🔭 Smoke Test: API Proxy OpenTelemetry Tracing
Details:
All scenarios pass. ✅

🔍 Chroot Version Comparison Results
Overall: ❌ Not all tests passed. Python and Node.js versions differ between host and chroot environments.

Smoke test results:
Running in direct BYOK mode (AWF_AUTH_TYPE=github-oidc + AWF_AUTH_AZURE_* + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) authenticated via Microsoft Entra
Overall: PASS

🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed, ✅ PASS

Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw)
Overall: PASS

Smoke Test: Services Connectivity
Overall: FAIL

Smoke Test: Gemini Engine Validation
PR Titles:
Test Results:
Overall Status: FAIL

Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "localhost"
See Network Configuration for more information.

Weekly runner doctor scan (2026-06-19 → 2026-06-26) identified 3 new failure modes and a sync gap in the portable agent file.
New failure modes
- A13: `chroot: failed to run command '/bin/sh'` on a glibc/Debian DinD daemon. The staging dirs (`/tmp/gh-aw/{usr,bin,...}`) are empty because `stageBaseSystem()` is not yet implemented. The "musl/Alpine" entrypoint warning is a red herring. Unresolved; the workaround is baking binaries into the DinD daemon image.
- B5: `EAI_AGAIN <awmg-cli-proxy>` deadlock in `--network-isolation` + `--topology-attach`. `connectTopologyContainers()` runs after `startContainers()`, but the cli-proxy health gate blocks on the topology peer that hasn't been attached yet. Deterministic, not flaky. Fixed in AWF.
- B6: `EACCES` on `upload-artifact` after rootless (`sudo: false`) AWF runs. The squid/cli-proxy/agent sidecars write files as non-runner UIDs; `chmod -R a+rX` silently fails at `debug` level. Fixed in AWF.

Files changed
- `shared/self-hosted-failure-modes.md`: A13 row (Category A), B5/B6 rows (Category B), 3 new error-string lookup entries, A13 added to known-unresolved list
- `self-hosted-runner-doctor.md`: §3 hint lines for A13/B5/B6; §4 expanded with A13 unresolved entry
- `.github/agents/self-hosted-runner-doctor.md` (portable, self-contained): synced missing A12 (`mkdirat` read-only fs, fixed in v0.27.10), added A13/B5/B6, added 4 missing error-string rows, updated §3 hints and §4 unresolved to match
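
To illustrate the B6 failure mode, here is a minimal sketch of a recursive `a+rX` permission pass that collects failures instead of hiding them at debug level. It is not AWF's actual fix; the `make_world_readable` helper is a hypothetical name, and the sketch assumes a POSIX filesystem.

```python
import os
import stat

# Bit masks corresponding to chmod's a+r and a+x.
READ_ALL = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH
EXEC_ALL = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def make_world_readable(root: str) -> list[tuple[str, str]]:
    """Roughly emulate `chmod -R a+rX root`; return (path, reason) for each failure."""
    failures: list[tuple[str, str]] = []

    def visit(path: str) -> None:
        try:
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                return  # leave symlinks untouched
            extra = READ_ALL
            # Capital X: add execute only for directories or already-executable files.
            if stat.S_ISDIR(mode) or mode & EXEC_ALL:
                extra |= EXEC_ALL
            os.chmod(path, stat.S_IMODE(mode) | extra)
        except OSError as exc:
            # Files owned by another UID typically land here with EPERM.
            failures.append((path, exc.strerror or str(exc)))

    def on_walk_error(exc: OSError) -> None:
        failures.append((exc.filename or root, exc.strerror or str(exc)))

    visit(root)
    # Top-down walk: each directory is chmod-ed before os.walk descends into it.
    for dirpath, dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in dirnames + filenames:
            visit(os.path.join(dirpath, name))
    return failures
```

A caller could log the returned list at warning level before the `upload-artifact` step, so that files written by sidecars under non-runner UIDs show up in the job log rather than surfacing later as `EACCES`.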