Imagine the LLM's context window as a backpack with limited capacity 🎒. Every conversation turn, every tool call result adds something to the backpack. As the conversation goes on, the backpack gets fuller and fuller...
Context management is a set of mechanisms that help you "manage your backpack", ensuring the AI can work continuously and efficiently.
The context management mechanism is inspired by OpenClaw and independently implemented via LightContextManager in QwenPaw.
QwenPaw context management uses two parallel offload paths to handle the limited context window:
| Mechanism | Triggered When | Offload Target | What Stays in Context |
|---|---|---|---|
| Tool result offload | Tool output exceeds byte threshold | `tool_result/{uuid}.txt` | Snippet + file path reference |
| Conversation compact + archive | Context token count exceeds threshold | `dialog/YYYY-MM-DD.jsonl` | `compact_summary` (summary + path guide) |
Before each inference turn, MemoryCompactionHook runs in order:
flowchart LR
A[Before each turn] --> B[1 Tool result offload]
B --> C[2 Token threshold check]
C -->|Under limit| D[Proceed normally]
C -->|Over limit| E[3 Compact old messages<br>generate compact_summary]
E --> F[4 Archive raw messages<br>write to dialog/]
F --> D
- No data loss: Compacted raw conversations are saved in `dialog/` and tool outputs in `tool_result/`; the Agent can always retrieve them via `read_file`
- Context continuity: `compact_summary` retains a structured summary + dialog path guide so the Agent never loses context
- Automatic: Triggers without manual intervention; `/compact` can also trigger it manually
QwenPaw's context consists of two components:
flowchart TD
A[Context] --> B[compact_summary optional]
B --> C[Dialog path guide<br>dialog/YYYY-MM-DD.jsonl N lines]
B --> D[Structured history summary<br>Goal / Constraints / Progress<br>KeyDecisions / NextSteps]
A --> E[messages full dialogue history]
| Component | Description |
|---|---|
| compact_summary | Generated after compaction; contains two parts (see below) |
| ↳ Dialog path guide | Points to raw conversation data in dialog/YYYY-MM-DD.jsonl for reference |
| ↳ Structured history summary | Goal / Constraints / Progress / KeyDecisions / NextSteps |
| messages | Current conversation context (full message list) |
Data evicted from the context is offloaded to the file system, keeping it traceable:
| Path | Contents |
|---|---|
| `dialog/YYYY-MM-DD.jsonl` | Compacted raw conversation messages, appended chronologically |
| `tool_result/{uuid}.txt` | Full text of long tool call results; auto-cleaned after `offload_retention_days` days (default 5) |
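A minimal sketch of how this layout could be maintained, assuming one JSON object per message and file modification time as the age signal. `archive_messages` and `clean_tool_results` are hypothetical helper names for illustration, not QwenPaw's actual API.

```python
import json
import time
from datetime import date
from pathlib import Path
from typing import List, Optional


def archive_messages(workdir: Path, messages: List[dict], day: Optional[date] = None) -> Path:
    """Append messages to dialog/YYYY-MM-DD.jsonl, one JSON object per line."""
    day = day or date.today()
    path = workdir / "dialog" / f"{day.isoformat()}.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier archives from the same day intact.
    with path.open("a", encoding="utf-8") as f:
        for msg in messages:
            f.write(json.dumps(msg, ensure_ascii=False) + "\n")
    return path


def clean_tool_results(workdir: Path, retention_days: int, now: Optional[float] = None) -> List[Path]:
    """Delete tool_result/*.txt files whose mtime is older than retention_days."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400
    removed = []
    for path in sorted((workdir / "tool_result").glob("*.txt")):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Appending rather than rewriting keeps the file in chronological order, which is what lets a reader start from the end to find the newest messages.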
graph LR
A[System Prompt] -->|Always retained| B[Compactable Zone<br>Compactable Messages]
B -->|Compress when exceeded| C[Reserved Zone<br>Recent Messages]
| Zone | Description | Handling |
|---|---|---|
| System Prompt | The AI's "role definition" and base instructions | Always retained, never compacted |
| Compactable Zone | Historical conversation messages | Token counted; compacted into summary when threshold exceeded |
| Reserved Zone | Most recent messages within the `memory_compact_reserve` token budget | Kept as-is, ensuring context continuity |
┌─────────────────────────────────────────┐
│ System Prompt (Fixed) │ ← Always retained
│ "You are an AI assistant..." │
├─────────────────────────────────────────┤
│ compact_summary (Optional) │ ← Generated after compaction
│ - [Dialog guide] dialog/2025-01-15.jsonl│
│ - Goal: Build user login system │
│ - Progress: Login API completed... │
├─────────────────────────────────────────┤
│ Compactable Zone │ ← Compacted when exceeded
│ [Message 1] User: Help me build login │
│ [Message 2] Assistant: Sure, I'll... │
│ [Message 3] Tool call result... │
│ ... │
├─────────────────────────────────────────┤
│ Reserved Zone │ ← Always retained
│ [Message N-2] User: Add registration │
│ [Message N-1] Assistant: Sure... │
│ [Message N] User: Done! │
└─────────────────────────────────────────┘
graph LR
Agent[Agent] -->|Before each inference| Hook[MemoryCompactionHook]
Hook --> TC[compact_tool_result<br>Compress tool output]
TC --> CC[check_context<br>Token counting]
CC -->|Exceeds limit| CM[compact_memory<br>Generate summary]
- LightContextManager
- AsMsgHandler — Context checking and message formatting
- compactor_prompts — Compaction prompts
flowchart LR
M[messages] --> TC[ToolCallResultCompact<br>Offload long tool outputs]
TC --> CC[ContextChecker<br>Token counting]
CC --> D{Token > Threshold?}
D -->|No| K[Proceed normally]
D -->|Yes| E[Keep recent X% tokens]
E --> CM[Compactor<br>Compact old messages into summary]
CM --> SD[SaveDialog<br>Offload compacted messages to<br>dialog/YYYY-MM-DD.jsonl]
SD --> R[Update compact_summary + clear old messages]
Execution Order:
1. `ToolCallResultCompact`: Offload long tool outputs to `tool_result/` (if enabled)
2. `ContextChecker`: Determine if token count exceeds threshold
3. `Compactor`: Compress old messages into a structured summary (`compact_memory`)
4. `SaveDialog`: Persist the compacted raw messages to `dialog/YYYY-MM-DD.jsonl`
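The ordering above can be sketched as a small driver function. This is an illustration only: the four stages are passed in as plain callables with hypothetical signatures, and the function name `run_compaction_pipeline` is an assumption, not part of QwenPaw.

```python
from typing import Callable, List, Tuple

Message = dict


def run_compaction_pipeline(
    messages: List[Message],
    offload_tool_results: Callable[[List[Message]], List[Message]],
    split_if_over_limit: Callable[[List[Message]], Tuple[List[Message], List[Message]]],
    summarize: Callable[[List[Message], str], str],
    save_dialog: Callable[[List[Message]], None],
    previous_summary: str = "",
) -> Tuple[str, List[Message]]:
    """Run the four stages in order and return (compact_summary, messages)."""
    # 1. Offload long tool outputs before counting tokens.
    messages = offload_tool_results(messages)
    # 2. Split into (to_compact, to_keep); an empty first list means "under limit".
    to_compact, to_keep = split_if_over_limit(messages)
    if not to_compact:
        return previous_summary, messages
    # 3. Compress old messages into a summary, merging with any previous one.
    summary = summarize(to_compact, previous_summary)
    # 4. Persist the raw messages, then drop them from the active context.
    save_dialog(to_compact)
    return summary, to_keep
```

Running the tool result offload first means the token check sees the already-shrunk messages, so compaction is only triggered when offloading alone was not enough.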
When the context approaches its limit, QwenPaw automatically triggers compaction, condensing old conversations into a structured summary.
When tool_result_pruning_config.enabled is on (default true), different byte thresholds are applied based on how recent a message is:
flowchart LR
A[Tool Call Result] --> B{Within pruning_recent_n?}
B -->|Yes| C[Low truncation<br>pruning_recent_msg_max_bytes<br>Save full text to tool_result/uuid.txt<br>Keep snippet + file ref in message]
B -->|No| D[High truncation<br>pruning_old_msg_max_bytes<br>Reference existing file<br>More aggressive truncation]
C --> E[Context]
D --> E
| Message type | Threshold | Default | Behavior |
|---|---|---|---|
| Most recent `pruning_recent_n` messages | `pruning_recent_msg_max_bytes` | 50000 | Preserve more content; save full text to file |
| Older messages | `pruning_old_msg_max_bytes` | 3000 | Aggressive truncation; reuse existing file path |
Tool-specific behavior:
- Browser-use type tools: On first call, full content is saved to `tool_result/{uuid}.txt`; the message keeps a snippet + file reference with a "read from line N" hint; secondary truncation applies once the message falls outside `pruning_recent_n`
- `read_file` tool: No truncation or file save within `pruning_recent_n` (content is already an external file); beyond `pruning_recent_n`, truncated and saved to `tool_result/`
- Files older than `offload_retention_days` are automatically cleaned up
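The two-tier threshold can be illustrated with a short sketch. It assumes each tool message is a dict with a string `content`, counts recency over the list it is given, and records the saved file under a hypothetical `offload_path` key; the tool-specific rules above are left out, and none of these names come from QwenPaw itself.

```python
import uuid
from pathlib import Path
from typing import List


def prune_tool_results(
    tool_msgs: List[dict],
    workdir: Path,
    pruning_recent_n: int = 2,
    recent_max_bytes: int = 50000,
    old_max_bytes: int = 3000,
) -> List[dict]:
    """Truncate oversized tool outputs in place, offloading full text to tool_result/."""
    out_dir = workdir / "tool_result"
    first_recent = len(tool_msgs) - pruning_recent_n
    for i, msg in enumerate(tool_msgs):
        # Recent messages get the generous limit, older ones the strict limit.
        limit = recent_max_bytes if i >= first_recent else old_max_bytes
        raw = msg["content"].encode("utf-8")
        if len(raw) <= limit:
            continue
        # Save the full text only once; later passes reuse the recorded path.
        if "offload_path" not in msg:
            out_dir.mkdir(parents=True, exist_ok=True)
            path = out_dir / f"{uuid.uuid4()}.txt"
            path.write_text(msg["content"], encoding="utf-8")
            msg["offload_path"] = str(path)
        # errors="ignore" drops a multi-byte character cut in half at the limit.
        snippet = raw[:limit].decode("utf-8", errors="ignore")
        msg["content"] = f"{snippet}\n[truncated; full output: {msg['offload_path']}]"
    return tool_msgs
```

Calling the function again after new messages arrive applies the stricter limit to messages that have aged out of the recent window, without writing a second copy of the full output.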
`ContextChecker` determines whether the context exceeds its limit by counting tokens and, when it does, splits messages into "to compact" and "to keep" groups.
graph LR
M[messages] --> H[Token counting]
H --> C{total > threshold?}
C -->|No| K[Return all messages]
C -->|Yes| S[Keep from tail backwards<br>reserve tokens]
S --> CP[messages_to_compact<br>Early messages]
S --> KP[messages_to_keep<br>Recent messages]
S --> V{is_valid<br>Tool call alignment?}
- Core Logic: Reserve `memory_compact_reserve` tokens from the tail backwards, marking excess as to-be-compacted
- Integrity Guarantee: Does not split user-assistant conversation pairs or tool_use/tool_result pairs
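A simplified sketch of the tail-backwards split, under stated assumptions: tokens are estimated as UTF-8 bytes divided by a divisor (mirroring `token_count_estimate_divisor`), messages are dicts with `role` and `content`, and a `"tool"` role marks a tool result. The function names are hypothetical, and only the tool-result half of the integrity rule is modeled.

```python
from typing import List, Tuple


def estimate_tokens(text: str, divisor: float = 4.0) -> int:
    """Byte-based estimate: UTF-8 byte length divided by the divisor."""
    return int(len(text.encode("utf-8")) / divisor)


def split_for_compaction(
    messages: List[dict], threshold: int, reserve: int, divisor: float = 4.0
) -> Tuple[List[dict], List[dict]]:
    """Return (messages_to_compact, messages_to_keep)."""
    sizes = [estimate_tokens(m["content"], divisor) for m in messages]
    if sum(sizes) <= threshold:
        return [], list(messages)
    # Walk from the tail backwards until the reserve budget is used up.
    kept_tokens, cut = 0, len(messages)
    while cut > 0 and kept_tokens + sizes[cut - 1] <= reserve:
        cut -= 1
        kept_tokens += sizes[cut]
    # Never start the kept group on a tool result whose call was cut away:
    # move such results into the compacted group alongside their call.
    while cut < len(messages) and messages[cut].get("role") == "tool":
        cut += 1
    return messages[:cut], messages[cut:]
```

Moving an orphaned tool result into the compacted group keeps slightly fewer tokens than the reserve allows, which is the safer direction when the goal is to stay under the limit.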
`Compactor` uses a ReActAgent to compress historical conversations into a structured context summary:
graph LR
M[messages] --> H[format_msgs_to_str]
H --> A[ReActAgent<br>reme_compactor]
P[previous_summary] -->|Incremental update| A
A --> S[Structured summary]
Proactively trigger compaction:
/compact
You can also add an optional instruction for this manual run:
/compact keep requirements and decisions only
After execution, you'll see:
**Compact Complete!**
- Messages compacted: 12
**Compressed Summary:**
<compacted summary content>
Response breakdown:
- 📊 Messages compacted - How many messages were compacted
- 📝 Compressed Summary - The generated summary content
compact_summary consists of two parts: a dialog path guide and a structured history summary.
The dialog path guide points to the compacted raw conversation data in `dialog/YYYY-MM-DD.jsonl`. The file is written chronologically, so reading from the end backwards reaches the most recent messages first. The Agent can use the `read_file` tool to review historical details without keeping raw messages in the active context.
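Reading an archive newest-first might look like the sketch below. It assumes each line of the file holds one JSON object; `iter_dialog_backwards` is a hypothetical helper, not a QwenPaw function, and it loads the whole file for simplicity.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_dialog_backwards(path: Path, limit: int = 20) -> Iterator[dict]:
    """Yield up to `limit` messages from a dialog JSONL file, newest first."""
    # Split on "\n" only, so unusual Unicode line separators inside a
    # message body cannot break a record apart.
    lines = [ln for ln in path.read_text(encoding="utf-8").split("\n") if ln.strip()]
    for line in lines[::-1][:limit]:
        yield json.loads(line)
```

For very large archives, seeking from the end of the file would avoid reading everything into memory; the simple version is enough to show the access pattern.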
graph TB
A[Structured History Summary] --> B[Goal]
A --> C[Constraints]
A --> D[Progress]
A --> E[Key Decisions]
A --> F[Next Steps]
A --> G[Critical Context]
| Field | Content | Example |
|---|---|---|
| Goal | What the user wants to accomplish | "Build a user login system" |
| Constraints | Requirements and preferences | "Use TypeScript, no frameworks" |
| Progress | Completed / in-progress / blocked tasks | "Login API done, registration API in progress" |
| Key Decisions | Decisions made and their rationale | "Chose JWT over Sessions for statelessness" |
| Next Steps | What to do next | "Implement password reset feature" |
| Critical Context | Data needed to continue work | "Main file is at src/auth.ts" |
- Incremental Update: When `previous_summary` is provided, new conversations are automatically merged with the old summary
- Information Preservation: Compaction preserves exact file paths, function names, and error messages, ensuring seamless context transitions
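To make the six fields concrete, here is one way they could be modeled as a data structure. This is purely illustrative: the real summary is text generated by the compactor, and the `HistorySummary` class and its rendered layout are assumptions, not QwenPaw's format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HistorySummary:
    """Hypothetical container mirroring the fields in the table above."""

    goal: str = ""
    constraints: List[str] = field(default_factory=list)
    progress: List[str] = field(default_factory=list)
    key_decisions: List[str] = field(default_factory=list)
    next_steps: List[str] = field(default_factory=list)
    critical_context: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Render non-empty fields as plain text, one section per field."""
        sections = [
            ("Goal", [self.goal] if self.goal else []),
            ("Constraints", self.constraints),
            ("Progress", self.progress),
            ("Key Decisions", self.key_decisions),
            ("Next Steps", self.next_steps),
            ("Critical Context", self.critical_context),
        ]
        lines = []
        for title, items in sections:
            if items:
                lines.append(f"{title}:")
                lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)
```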
Configuration is located in ~/.qwenpaw/workspaces/{agent_id}/agent.json under agents.running:
running top-level fields:
| Parameter | Default | Description |
|---|---|---|
| `max_input_length` | `131072` | Model context window size (tokens) |
| `context_manager_backend` | `"light"` | Context manager backend type |
| `memory_manager_backend` | `"remelight"` | Memory manager backend type |
running.light_context_config fields:
| Parameter | Default | Description |
|---|---|---|
| `dialog_path` | `"dialog"` | Dialog persistence directory (relative to working dir) |
| `token_count_estimate_divisor` | `4.0` | Divisor for byte-based token estimation |
running.light_context_config.context_compact_config fields:
| Parameter | Default | Description |
|---|---|---|
| `enabled` | `true` | Whether to enable automatic context compaction |
| `compact_threshold_ratio` | `0.8` | Threshold ratio for triggering compaction; triggers when `max_input_length × ratio` is reached |
| `reserve_threshold_ratio` | `0.1` | Ratio of recent messages to keep during compaction; keeps `max_input_length × ratio` tokens |
running.light_context_config.tool_result_pruning_config fields:
| Parameter | Default | Description |
|---|---|---|
| `enabled` | `true` | Whether to prune long tool outputs |
| `pruning_recent_n` | `2` | Number of most recent messages that get the higher byte threshold |
| `pruning_old_msg_max_bytes` | `3000` | Byte threshold for older tool result messages |
| `pruning_recent_msg_max_bytes` | `50000` | Byte threshold for the most recent `pruning_recent_n` tool result messages |
| `offload_retention_days` | `5` | Days to retain cached tool output files (auto-cleaned after expiry) |
Calculation Relationships:
- `memory_compact_threshold` = `max_input_length × compact_threshold_ratio` (threshold for triggering compaction)
- `memory_compact_reserve` = `max_input_length × reserve_threshold_ratio` (tokens of recent messages to keep)
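As a worked example of these two formulas with the default values, the sketch below computes both budgets. Truncating to an integer is an assumption made here for illustration; the actual rounding behavior is not stated in this document, and `compaction_budget` is a hypothetical name.

```python
from typing import Tuple


def compaction_budget(
    max_input_length: int,
    compact_threshold_ratio: float,
    reserve_threshold_ratio: float,
) -> Tuple[int, int]:
    """Return (memory_compact_threshold, memory_compact_reserve) in tokens."""
    threshold = int(max_input_length * compact_threshold_ratio)
    reserve = int(max_input_length * reserve_threshold_ratio)
    return threshold, reserve


# Defaults: 131072 x 0.8 = 104857.6 and 131072 x 0.1 = 13107.2
print(compaction_budget(131072, 0.8, 0.1))  # (104857, 13107)
```

With the defaults, compaction triggers at roughly 105k tokens and keeps about 13k tokens of recent messages.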
Example Configuration:
{
  "agents": {
    "running": {
      "max_input_length": 128000,
      "context_manager_backend": "light",
      "light_context_config": {
        "dialog_path": "dialog",
        "context_compact_config": {
          "enabled": true,
          "compact_threshold_ratio": 0.8,
          "reserve_threshold_ratio": 0.1
        },
        "tool_result_pruning_config": {
          "enabled": true,
          "pruning_recent_n": 2,
          "pruning_old_msg_max_bytes": 3000,
          "pruning_recent_msg_max_bytes": 50000
        }
      }
    }
  }
}
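A sketch of reading this file and filling in omitted fields with the documented defaults. The merge behavior is an assumption for illustration (how QwenPaw itself resolves missing keys is not described here), and `merge_defaults` and `load_running_config` are hypothetical helpers.

```python
import json
from pathlib import Path

# Default values copied from the parameter tables above.
DEFAULTS = {
    "max_input_length": 131072,
    "context_manager_backend": "light",
    "memory_manager_backend": "remelight",
    "light_context_config": {
        "dialog_path": "dialog",
        "token_count_estimate_divisor": 4.0,
        "context_compact_config": {
            "enabled": True,
            "compact_threshold_ratio": 0.8,
            "reserve_threshold_ratio": 0.1,
        },
        "tool_result_pruning_config": {
            "enabled": True,
            "pruning_recent_n": 2,
            "pruning_old_msg_max_bytes": 3000,
            "pruning_recent_msg_max_bytes": 50000,
            "offload_retention_days": 5,
        },
    },
}


def merge_defaults(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay overrides on defaults without mutating either."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_defaults(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_running_config(agent_json: Path) -> dict:
    """Read agents.running from agent.json and fill in missing defaults."""
    data = json.loads(agent_json.read_text(encoding="utf-8"))
    running = data.get("agents", {}).get("running", {})
    return merge_defaults(DEFAULTS, running)
```

Applied to the example configuration, the result would carry `max_input_length` 128000 from the file while `offload_retention_days` falls back to 5, since the example omits it.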