
⚡ Copilot Token Optimization 2026-06-26 — Contribution Check #5558

Description

@github-actions

Target Workflow: contribution-check

Source report: #5556
Estimated cost per run: N/A (token telemetry unavailable — see Recommendation #1)
Total tokens per run: N/A (all runs report null token_usage)
AIC (proxy metric): ~51.6 avg / 154.7 total across 3 runs
Action minutes: 5 min/run
Run duration: ~4.4 min avg
GitHub API calls: ~5.3 avg/run (despite prompt forbidding them — see below)
LLM turns: N/A (token telemetry unavailable)

⚠️ Note: All 50 Copilot workflow runs in the reporting period have null token_usage. The rankings use AIC (Actions Intelligence Cost) as a proxy. Contribution Check ranks #1 by total AIC.

Current Configuration

| Setting | Value |
|---|---|
| Model | gpt-5.4-mini |
| Max turns | 5 |
| GitHub toolsets loaded | pull_requests |
| Network groups | github only ✅ |
| Pre-agent steps | Yes (3 steps: CONTRIBUTING.md, PR diff, PR metadata) ✅ |
| Prompt forbids GitHub tool use | Yes ✅ |
| Strict mode | false |
| Prompt size | ~4,874 chars |

What's already well-optimized:

  • Uses cheapest model (gpt-5.4-mini)
  • Pre-fetches all data in the steps: block, so the agent can read files directly
  • Restricts GitHub toolset to pull_requests (not full ~22-tool default)
  • Single network group
  • Prompt explicitly instructs agent not to call gh, git, or GitHub API

What the data reveals: Runs average ~5.3 GitHub API calls despite the prompt saying not to make them. This indicates the agent is still invoking pull_requests toolset tools (e.g., get_pull_request, list_pull_request_files) even though all data is pre-fetched. The strict: false setting allows this drift.

Recommendations

1. Fix Token Telemetry Export (Prerequisite)

Impact: Required to measure all other optimizations

All 50 Copilot workflow runs in the current reporting period report null token_usage. Without telemetry, it's impossible to measure the impact of any change.

Investigation steps:

  • Check the api-proxy container logs for telemetry export errors
  • Verify the api-proxy is writing token usage to the expected location
  • Check if AWF_REFLECT_ENABLED: 1 is correctly hooked up to token data capture
  • Compare the copilot-logs.json schema against what the api-proxy actually emits

Until telemetry is fixed, all savings estimates below are structural approximations based on tool schema sizes (~500–700 tokens/tool).
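A quick way to confirm the scope of the telemetry gap is to count the runs whose token_usage is null. The sketch below assumes run records are available as dictionaries with a token_usage key; that shape and the sample records are hypothetical and do not reflect the actual copilot-logs.json schema.

```python
from typing import Any, Iterable, Mapping


def summarize_token_telemetry(runs: Iterable[Mapping[str, Any]]) -> dict:
    """Count how many run records are missing token usage data."""
    records = list(runs)
    missing = sum(1 for r in records if r.get("token_usage") is None)
    return {
        "total": len(records),
        "missing": missing,
        # True only when there is at least one record and none has usage data.
        "all_missing": bool(records) and missing == len(records),
    }


# Hypothetical records: the field names are assumptions for illustration.
sample_runs = [
    {"workflow": "contribution-check", "token_usage": None},
    {"workflow": "contribution-check", "token_usage": None},
    {"workflow": "contribution-check", "token_usage": None},
]

summary = summarize_token_telemetry(sample_runs)
print(summary)
```

Re-running the same count after a fix gives a simple pass/fail signal: missing should drop to 0 for new runs.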


2. Remove tools.github Block Entirely

Estimated savings: ~3,000–5,000 tokens/run (~5–8 tool schemas × 600 tokens each)

The pull_requests toolset loads approximately 5–8 GitHub MCP tools (e.g., get_pull_request, list_pull_request_files, create_pull_request_review, merge_pull_request, etc.). These schemas are injected into every LLM turn, yet the agent is explicitly told never to use them.
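The savings estimate can be reproduced as simple arithmetic. The sketch below uses the report's 5–8 tool count and its ~500–700 tokens/tool approximation. The per-run ceiling additionally assumes the schemas are re-sent in full on every turn with no prompt caching, which is an assumption rather than measured behavior.

```python
def schema_overhead(tool_count: int, tokens_per_tool: int, turns: int = 1) -> int:
    """Tokens spent on tool schemas for the given number of turns."""
    return tool_count * tokens_per_tool * turns


# Headline range: one injection at ~600 tokens per tool.
headline_low = schema_overhead(5, 600)
headline_high = schema_overhead(8, 600)

# Wider range using the 500-700 tokens/tool approximation.
wide_low = schema_overhead(5, 500)
wide_high = schema_overhead(8, 700)

# Assumed ceiling: 8 schemas re-sent on each of 5 turns, no caching.
ceiling_five_turns = schema_overhead(8, 600, turns=5)

print(f"headline: {headline_low}-{headline_high} tokens")
print(f"wide: {wide_low}-{wide_high} tokens")
print(f"five-turn ceiling: {ceiling_five_turns} tokens")
```

The headline figure corresponds to a single injection of the schemas. If they are billed on every turn, the actual per-run overhead scales with the number of turns used.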

The add_comment safe-output is a separate safeoutputs CLI tool — it does not depend on the GitHub MCP server being loaded.

Current .github/workflows/contribution-check.md:

tools:
  github:
    mode: gh-proxy
    toolsets: [pull_requests]

Proposed change: Remove the entire tools: block:

# (remove tools: block entirely)

Side effects to verify:

  • Run gh aw compile .github/workflows/contribution-check.md and confirm no compile errors
  • Confirm github-mcp-server container no longer appears in the lock file's containers: list
  • Run a test PR through the workflow and verify add_comment still works (it should — it's safeoutputs, not MCP)

This change should also eliminate the ~5.3 average GitHub API calls per run that the agent currently makes despite the prohibition, assuming those calls originate from the pull_requests toolset.
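The lock-file check in the list above can be scripted. This is a minimal sketch that does a plain substring search; the lock file path is an assumption about where gh aw compile writes its output, and a text match is only a rough stand-in for inspecting the containers: list.

```python
from pathlib import Path

# Assumed lock file location; adjust to wherever gh aw compile writes it.
LOCK_PATH = Path(".github/workflows/contribution-check.lock.yml")


def references_mcp_server(lock_text: str, needle: str = "github-mcp-server") -> bool:
    """Return True if the compiled workflow text still mentions the MCP server."""
    return needle in lock_text


def check_lock_file(path: Path = LOCK_PATH) -> str:
    """Report whether the lock file at `path` still references the MCP server."""
    if not path.exists():
        return f"lock file not found: {path}"
    text = path.read_text(encoding="utf-8")
    if references_mcp_server(text):
        return "github-mcp-server is still referenced"
    return "github-mcp-server is no longer referenced"


print(check_lock_file())
```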


3. Reduce max-turns from 5 to 3

Estimated savings: Up to 40% reduction in maximum token consumption per run (the turn cap drops from 5 to 3, assuming roughly equal tokens per turn)

The task is linear: read 3 pre-fetched files → compare against guidelines → call add_comment or noop. This does not require iteration or multi-step research. A five-turn budget lets the agent over-explore, which the ~5.3 GitHub API call average suggests is happening.

Change in .github/workflows/contribution-check.md:

# Before
max-turns: 5

# After
max-turns: 3

Rationale: Turn 1 = read files, Turn 2 = analysis + safe-output call, Turn 3 = fallback if retry needed. Three turns is a comfortable upper bound.
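To see how the turn cap bounds worst-case consumption, here is a toy model, not a measurement. It assumes each turn re-sends a fixed base context plus conversation history that grows by a constant amount per turn; base_tokens and growth_per_turn are hypothetical illustrative values.

```python
def worst_case_input_tokens(max_turns: int, base_tokens: int, growth_per_turn: int = 0) -> int:
    """Sum input tokens over all turns, with history growing linearly per turn."""
    return sum(base_tokens + turn * growth_per_turn for turn in range(max_turns))


def cap_reduction(before: int, after: int) -> float:
    """Fractional reduction in the worst-case ceiling."""
    return (before - after) / before


# Flat per-turn cost: the ceiling scales directly with the turn cap.
flat = cap_reduction(
    worst_case_input_tokens(5, 1000),
    worst_case_input_tokens(3, 1000),
)

# Growing history: later turns cost more, so cutting them saves more.
growing = cap_reduction(
    worst_case_input_tokens(5, 1000, 500),
    worst_case_input_tokens(3, 1000, 500),
)

print(f"flat: {flat:.0%}, growing: {growing:.0%}")
```

Under the flat assumption the ceiling drops by 40%. When history accumulates across turns, the reduction in the ceiling is larger (55% with these illustrative numbers). Either way, this bounds the worst case only; a run that already finishes in three turns or fewer sees no change.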


4. Enable strict: true

Estimated savings: Prevents multi-turn exploration; reduces wasted turns

With strict: false, the agent can make open-ended tool calls and explore freely. With strict: true, the agent must commit to a plan and execute it without deviation. For a workflow that has all data pre-fetched and only needs to read files + call one safe-output, strict mode is appropriate.

Change in .github/workflows/contribution-check.md:

# Before
strict: false

# After
strict: true

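Note: Test this change on a PR before merging, because strict mode may occasionally cause the agent to reject ambiguous inputs. Also confirm in the gh aw documentation that strict governs agent behavior as described here rather than only compile-time validation. The prompt is already well-structured, so this should be low-risk.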


5. Tighten the Prompt's File-Reading Instruction

Estimated savings: Minor (reduces hedging turns)

The current prompt says:

"Read the following pre-fetched context files before proceeding"

Consider making this more imperative to reduce turns where the agent re-reads or re-checks:

## Context (Pre-Fetched — Read These First)

The following files have been pre-fetched for you. Read all three before making any assessment:
- `/tmp/gh-aw/contribution-check-context/pr-meta.md`
- `/tmp/gh-aw/contribution-check-context/pr-files.md`
- `/tmp/gh-aw/contribution-check-context/contributing.md`

**Do not call any tools other than `add_comment` or `noop`.**

The explicit "Do not call any tools other than..." instruction reinforces the behavioral constraint at the prompt level.

Expected Impact

| Metric | Current | Projected | Savings |
|---|---|---|---|
| GitHub API calls/run | ~5.3 | ~0 | ~100% |
| Max turns | 5 | 3 | -40% |
| Tool schemas in context | ~5–8 tools | 0 | -3K–5K tokens |
| Run duration | ~4.4 min | ~2.5–3 min (est.) | ~30–40% |
| AIC/run | ~51.6 | ~30–35 (est.) | ~35–40% |

Savings are estimates pending telemetry fix (Recommendation #1).
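The percentage columns can be sanity-checked from the current and projected values. The sketch below just recomputes the reductions from the figures in this report; the projected values themselves remain estimates.

```python
def pct_reduction(current: float, projected: float) -> float:
    """Percentage reduction from current to projected."""
    return (current - projected) / current * 100


turns = pct_reduction(5, 3)

# Run duration: ~4.4 min now, ~2.5-3 min projected.
duration_low = pct_reduction(4.4, 3.0)
duration_high = pct_reduction(4.4, 2.5)

# AIC per run: ~51.6 now, ~30-35 projected.
aic_low = pct_reduction(51.6, 35)
aic_high = pct_reduction(51.6, 30)

print(f"max turns: {turns:.1f}%")
print(f"run duration: {duration_low:.1f}%-{duration_high:.1f}%")
print(f"AIC/run: {aic_low:.1f}%-{aic_high:.1f}%")
```

The computed ranges (roughly 32–43% for duration and 32–42% for AIC) are slightly wider than the rounded figures in the report, which is expected given that the inputs are approximate.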

Implementation Checklist

  • Fix token telemetry — investigate why token_usage is null for all Copilot runs
  • Remove tools: block from contribution-check.md
  • Change max-turns: 5 → max-turns: 3
  • Change strict: false → strict: true
  • Update prompt with explicit "no tools" instruction
  • Recompile: gh aw compile .github/workflows/contribution-check.md
  • Post-process if needed: npx tsx scripts/ci/postprocess-smoke-workflows.ts (check if applicable)
  • Open a test PR and verify the workflow produces correct output
  • Compare AIC and GitHub API call count on new run vs baseline
  • Once telemetry is restored, compare token/cost metrics

Generated by Daily Copilot Token Optimization Advisor · 63.4 AIC · ⊞ 6.8K
