perf: optimize duplicate-code-detector to reduce AIC by ~50% #5517
Pull request overview
This PR aims to reduce AI credit (AIC) spend and context bloat in the duplicate-code-detector agentic workflow by shrinking the precomputed inputs passed to the agent and tightening the run constraints.
Changes:
- Remove raw duplicate-code `fragment` content from the precomputed `jscpd-top.json` and trim `statistics` to `{total, percentage}`.
- Reduce allowed analysis turns (≤7 → ≤4), remove an unused `jscpd containers` scan, and trim grep output limits.
- Regenerate the compiled lock workflow to reflect the updated `.md` workflow source (but the lock file also includes broader version/pinning changes).
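
As a rough illustration of the first change, the sketch below drops `fragment` from each duplicate entry and keeps only `total` and `percentage` from `statistics`. The `Report` shape, the `trimReport` name, and the sample data are assumptions made for this example; the real `jscpd-top.json` layout and the workflow's actual trimming step may differ.

```typescript
// Hypothetical, simplified report shape. The real jscpd-top.json layout may differ.
interface FileRef {
  name: string;
  start: number;
  end: number;
}

interface Duplicate {
  format: string;
  lines: number;
  fragment?: string; // raw duplicated source text, the expensive part of the payload
  firstFile: FileRef;
  secondFile: FileRef;
}

interface Report {
  statistics: { total: number; percentage: number; [extra: string]: unknown };
  duplicates: Duplicate[];
}

interface TrimmedReport {
  statistics: { total: number; percentage: number };
  duplicates: Omit<Duplicate, "fragment">[];
}

// Drop `fragment` from every duplicate and keep only two summary statistics.
function trimReport(report: Report): TrimmedReport {
  const { total, percentage } = report.statistics;
  return {
    statistics: { total, percentage },
    duplicates: report.duplicates.map(({ fragment: _dropped, ...rest }) => rest),
  };
}

// Made-up sample data, for illustration only.
const sample: Report = {
  statistics: { total: 12, percentage: 3.4, formats: {} },
  duplicates: [
    {
      format: "typescript",
      lines: 9,
      fragment: "const x = 1;\nconst y = 2;",
      firstFile: { name: "src/a.ts", start: 10, end: 18 },
      secondFile: { name: "src/b.ts", start: 40, end: 48 },
    },
  ],
};

const trimmed = trimReport(sample);
console.log(JSON.stringify(trimmed, null, 2));
```

File names and line ranges survive the trim, so the agent can still read the duplicated lines on demand instead of carrying them in every context window.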
| File | Description |
|---|---|
| scripts/ci/duplicate-code-detector-workflow.test.ts | Updates assertion to the new ≤4 turns constraint. |
| .github/workflows/secret-digger-codex.lock.yml | Removes legacy env_key = "OPENAI_API_KEY" from the openai-proxy provider block. |
| .github/workflows/duplicate-code-detector.md | Shrinks jscpd JSON payload (drops fragment), removes unused containers scan, trims grep output, and updates turn limit/instructions. |
| .github/workflows/duplicate-code-detector.lock.yml | Regenerates compiled workflow to match the .md changes, but also introduces action pinning/env interpolation and version shifts. |
| .github/aw/actions-lock.json | Removes gh-aw setup action pin entries (correlates with unpinned uses: in the regenerated lock file). |
Copilot's findings
- Files reviewed: 5/5 changed files
- Comments generated: 4
      - name: Setup Scripts
        id: setup
    -   uses: github/gh-aw-actions/setup@b5cde6c5013569c8b0229dd2d7ffd63eaf2c9ad2 # v0.81.2
    +   uses: github/gh-aw-actions/setup@v0.80.9
        with:
          destination: ${{ runner.temp }}/gh-aw/actions
          "version": "v7.0.1",
          "sha": "043fb46d1a93c77aae656e7c1c64a875d1fc6a0a"
        },
        "astral-sh/setup-uv@v8.2.0": {
          "repo": "astral-sh/setup-uv",
          "version": "v8.2.0",
          "sha": "fac544c07dec837d0ccb6301d7b5580bf5edae39"
        },
        "github/gh-aw-actions/setup-cli@v0.81.2": {
          "repo": "github/gh-aw-actions/setup-cli",
          "version": "v0.81.2",
          "sha": "b5cde6c5013569c8b0229dd2d7ffd63eaf2c9ad2"
        },
        "github/gh-aw-actions/setup@v0.81.2": {
          "repo": "github/gh-aw-actions/setup",
          "version": "v0.81.2",
          "sha": "b5cde6c5013569c8b0229dd2d7ffd63eaf2c9ad2"
        }
      }
    }
| # gh-aw-metadata: {"schema_version":"v4","frontmatter_hash":"7e8cc17c33fd19893924864f0a52a2bb949c66d7dc1f70dc45a231c301cfdb85","body_hash":"015029486ec6a50ed709af8371a3efcfa3d5b4ad5a30b2e269c2261b88ace348","compiler_version":"v0.80.9","strict":true,"agent_id":"copilot","agent_model":"gpt-5.4-mini","engine_versions":{"copilot":"1.0.63"}} | ||
| # gh-aw-manifest: {"version":1,"secrets":["COPILOT_GITHUB_TOKEN","GH_AW_GITHUB_MCP_SERVER_TOKEN","GH_AW_GITHUB_TOKEN","GITHUB_TOKEN"],"actions":[{"repo":"actions/cache/restore","sha":"27d5ce7f107fe9357f9df03efb73ab90386fccae","version":"v5.0.5"},{"repo":"actions/cache/restore","sha":"2c8a9bd7457de244a408f35966fab2fb45fda9c8","version":"v6.0.0"},{"repo":"actions/cache/save","sha":"2c8a9bd7457de244a408f35966fab2fb45fda9c8","version":"v6.0.0"},{"repo":"actions/checkout","sha":"9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0","version":"v7.0.0"},{"repo":"actions/download-artifact","sha":"3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c","version":"v8.0.1"},{"repo":"actions/github-script","sha":"3a2844b7e9c422d3c10d287c895573f7108da1b3","version":"v9.0.0"},{"repo":"actions/upload-artifact","sha":"043fb46d1a93c77aae656e7c1c64a875d1fc6a0a","version":"v7.0.1"},{"repo":"github/gh-aw-actions/setup","sha":"v0.80.9","version":"v0.80.9"}],"containers":[{"image":"ghcr.io/github/gh-aw-firewall/agent:0.27.7","digest":"sha256:aae231e4635c8999d039c132f1602d3df850fe9b84a00aa2b5ac981179b5661c","pinned_image":"ghcr.io/github/gh-aw-firewall/agent:0.27.7@sha256:aae231e4635c8999d039c132f1602d3df850fe9b84a00aa2b5ac981179b5661c"},{"image":"ghcr.io/github/gh-aw-firewall/api-proxy:0.27.7","digest":"sha256:009caf2e3d88fa77b64e9a03a95a228fc58db0f1701c6d324b29ba5a3c7c79b6","pinned_image":"ghcr.io/github/gh-aw-firewall/api-proxy:0.27.7@sha256:009caf2e3d88fa77b64e9a03a95a228fc58db0f1701c6d324b29ba5a3c7c79b6"},{"image":"ghcr.io/github/gh-aw-firewall/squid:0.27.7","digest":"sha256:deb1d4e19de62d51cee0508057a596a19315c3423ada4d675cad136dc8037c96","pinned_image":"ghcr.io/github/gh-aw-firewall/squid:0.27.7@sha256:deb1d4e19de62d51cee0508057a596a19315c3423ada4d675cad136dc8037c96"},{"image":"ghcr.io/github/gh-aw-mcpg:v0.3.27","digest":"sha256:fe984bddde4ec05d756d9043edb0a32912e6b7b72f6a121b1082f29221421cc7","pinned_image":"ghcr.io/github/gh-aw-mcpg:v0.3.27@sha256:fe984bddde4ec05d756d9043edb0a32912e6b7b72f6a121b1082f292
21421cc7"},{"image":"ghcr.io/github/gh-aw-node","digest":"sha256:529d02eb970b1161aa25c593a9c3df57fdfad5a8add328cb3b6eccef66f3183b","pinned_image":"ghcr.io/github/gh-aw-node@sha256:529d02eb970b1161aa25c593a9c3df57fdfad5a8add328cb3b6eccef66f3183b"},{"image":"ghcr.io/github/github-mcp-server:v1.4.0","digest":"sha256:2afb26356481d1a350e14544a6e160f7f7ec1561a1ea309b823665abf0309036","pinned_image":"ghcr.io/github/github-mcp-server:v1.4.0@sha256:2afb26356481d1a350e14544a6e160f7f7ec1561a1ea309b823665abf0309036"}]} | ||
| # This file was automatically generated by gh-aw (v0.80.9). DO NOT EDIT. To debug this workflow, load the skill at https://github.com/github/gh-aw/blob/main/debug.md |
|
@copilot address review feedback |
Fixed in the latest commit. The lock file was accidentally compiled with an older compiler (v0.80.9 instead of v0.81.2), which caused the SHA pin regression, version downgrades (AWF 0.27.9→0.27.7, mcpg v0.3.29→v0.3.27), literal MCP env var placeholders, and the missing The fix restores both files to their correct pre-PR baseline and then manually re-applies only the intended content changes (remove |
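
The SHA pin regression mentioned above is the kind of change a small check can catch before review. The sketch below treats a `uses:` reference as pinned only when the ref after `@` is a full 40-character commit SHA. `isPinned` and `findUnpinned` are hypothetical helpers, not part of gh-aw, and the sample text reuses the two `uses:` lines from the lock-file hunk quoted earlier in this thread.

```typescript
// A ref is considered pinned only if it is a full 40-hex-character commit SHA.
const FULL_SHA = /^[0-9a-f]{40}$/;

function isPinned(uses: string): boolean {
  // Local actions ("./path") have no external ref to pin (assumption: treat them as fine).
  if (uses.startsWith("./")) return true;
  const at = uses.lastIndexOf("@");
  if (at === -1) return false;
  // Ignore anything after "#", such as a trailing version comment.
  const [ref = ""] = uses.slice(at + 1).split("#");
  return FULL_SHA.test(ref.trim());
}

// Scan workflow text line by line and collect `uses:` values that are not SHA-pinned.
function findUnpinned(workflowText: string): string[] {
  const unpinned: string[] = [];
  for (const line of workflowText.split("\n")) {
    const value = line.match(/^\s*(?:-\s*)?uses:\s*(\S+)/)?.[1];
    if (value !== undefined && !isPinned(value)) unpinned.push(value);
  }
  return unpinned;
}

const lockSnippet = [
  "- name: Setup Scripts",
  "  id: setup",
  "  uses: github/gh-aw-actions/setup@b5cde6c5013569c8b0229dd2d7ffd63eaf2c9ad2 # v0.81.2",
  "  uses: github/gh-aw-actions/setup@v0.80.9",
].join("\n");

console.log(findUnpinned(lockSnippet));
```

A check like this only looks at the shape of the ref. It does not verify that the SHA actually corresponds to the version named in the trailing comment.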
|
✅ Copilot review passed with no inline comments. @copilot Add the |
|
✅ Smoke Copilot BYOK AOAI (api-key) completed. Copilot AOAI BYOK (api-key) mode operational. 🔓 |
|
📡 Smoke OTel Tracing completed. All tracing scenarios validated. ✅ |
|
✅ Contribution Check completed successfully! |
|
Chroot tests passed! Smoke Chroot - All security and functionality tests succeeded. |
|
✅ Build Test Suite completed successfully! |
|
📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤 |
|
🔌 Smoke Services — All services reachable! ✅ |
|
✅ Smoke Copilot BYOK AOAI (Entra) completed. Copilot AOAI BYOK (Entra) mode operational. 🔓 |
|
🚀 Security Guard has started processing this pull request |
|
✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟 |
|
✅ Smoke Claude passed |
|
✅ Smoke Gemini completed. All facets verified. 💎 |
|
✅ Smoke Copilot BYOK completed. Copilot BYOK mode operational. 🔓 |
|
🔑 Smoke Copilot PAT: PAT auth validated. All systems operational. ✅
🔥 Smoke Test Results
PR: perf: optimize duplicate-code-detector to reduce AIC by ~50%
Overall: PASS ✅
|
Smoke Test: Claude Engine Validation
Overall result: PASS
|
Thanks for the clear write-up and for including the workflow prompt test update. One contribution-guideline item needs attention before this is ready: under CONTRIBUTING.md → Pull request requirements, PRs should include a clear description of what the PR does. The description explains the
|
Chroot Runtime Version Comparison
|
Reviewed merged PRs:
Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
    network:
      allowed:
        - defaults
        - "registry.npmjs.org"
See Network Configuration for more information.
|
✅ Smoke Test: Copilot BYOK (Direct) Mode — PASS
Direct BYOK mode operational. 🔓
|
🔥 Smoke Test Results — PAT Auth
Overall: PASS | Auth mode: PAT (COPILOT_GITHUB_TOKEN)
PR: perf: optimize duplicate-code-detector to reduce AIC by ~50%
|
Smoke Test Results
Overall Status: FAIL
Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
    network:
      allowed:
        - defaults
        - "localhost"
See Network Configuration for more information.
|
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
|
Smoke Test: Services Connectivity — ❌ FAIL
|
🔍 Smoke Test: API Proxy OpenTelemetry Tracing
All 5 scenarios passed.
|
refactor(api-proxy): extract sliding-window data structure into rate-limiter-window.js — ✅
|
`duplicate-code-detector` was burning AIC at ~3–7× peer workflows (283.7 AIC/run at 50 AIC/min) because raw duplicate code fragments were being injected into every LLM context window via the `fragment` field in `jscpd-top.json`.

Changes

Remove `fragment` from jscpd JSON (biggest win: ~35–50% AIC)
Drops raw duplicate code text from the pre-computed JSON fed to the LLM. The agent can use `bash` to read file sections when it needs evidence. Also compresses `statistics` from the full object to `{total, percentage}` only.

Reduce max turns from 7 → 4
Pre-steps already do all discovery. Four turns are sufficient: read files → score findings → create up to 3 issues. The extra turns were allowing unnecessary code exploration.

Remove unused `jscpd containers` run
The containers scan wrote to `jscpd-src.txt` but was never referenced in the prompt, so it was pure wasted compute.

Trim grep output limits
`head -40` → `head -20` (env-var grep) and `head -30` → `head -20` (docker exec grep). The agent scores on file/line patterns, not exhaustive line lists.

Add bash evidence instruction to prompt
Explicit guidance: `Use bash to view specific file sections (e.g., sed -n 'X,Yp' src/file.ts) when writing code evidence for issues.`

Expected impact