Skip to content

Commit 6fdfef6

Browse files
authored
Merge pull request #188 from codelion/feat-add-deep-think-plugin
init
2 parents 975e66e + ceb9672 commit 6fdfef6

File tree

8 files changed

+1371
-2
lines changed

8 files changed

+1371
-2
lines changed

optillm/__init__.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
import os
33

44
# Version information
5-
__version__ = "0.1.12"
5+
__version__ = "0.1.13"
66

77
# Get the path to the root optillm.py
88
spec = util.spec_from_file_location(

optillm/plugins/deepthink/README.md

Lines changed: 136 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,136 @@
1+
# Deep Think Plugin
2+
3+
## Overview
4+
5+
The Deep Think plugin combines two powerful approaches for enhanced reasoning in large language models:
6+
7+
1. **SELF-DISCOVER Framework**: A method where LLMs self-discover task-intrinsic reasoning structures by selecting, adapting, and implementing atomic reasoning modules into a coherent reasoning plan.
8+
9+
2. **Uncertainty-Routed Chain-of-Thought**: An approach that generates multiple chain-of-thought samples, evaluates confidence through consistency, and routes to either majority voting (high confidence) or greedy decoding (low confidence).
10+
11+
## Key Features
12+
13+
- **Adaptive Reasoning Structure**: Automatically discovers the best reasoning approach for each specific task
14+
- **Confidence-Based Routing**: Uses uncertainty estimation to decide between multiple samples or single greedy output
15+
- **Reasoning Model Support**: Designed for models that produce structured thinking in `<think></think>` tags
16+
- **Multiple Sampling**: Generates multiple reasoning paths and selects the most reliable one
17+
18+
## How It Works
19+
20+
### Stage 1: SELF-DISCOVER Reasoning Structure
21+
22+
1. **SELECT**: From 39 atomic reasoning modules, select those most relevant for the task
23+
2. **ADAPT**: Rephrase selected modules to be task-specific
24+
3. **IMPLEMENT**: Create a structured JSON reasoning plan
25+
26+
### Stage 2: Uncertainty-Routed Generation
27+
28+
1. **Multiple Sampling**: Generate n samples (default: 3) using the discovered structure
29+
2. **Confidence Evaluation**: Assess consistency across samples
30+
3. **Route Decision**:
31+
- High confidence → Use majority vote
32+
- Low confidence → Use greedy sample (temperature=0)
33+
34+
## Usage
35+
36+
```python
37+
# Via optillm model prefix
38+
model = "deepthink-your-model-name"
39+
40+
# Via optillm_approach in request
41+
{
42+
"model": "your-model-name",
43+
"optillm_approach": "deepthink",
44+
"messages": [...],
45+
"deepthink_samples": 3, # Number of samples for uncertainty routing
46+
"confidence_threshold": 0.7, # Threshold for majority vs greedy routing
47+
"max_tokens": 16382, # Extended context for reasoning
48+
"temperature": 0.7, # Default temperature for sampling
49+
"top_p": 0.95 # Default top_p for sampling
50+
}
51+
```
52+
53+
## Configuration Parameters
54+
55+
- `deepthink_samples` (int, default=3): Number of reasoning samples to generate
56+
- `confidence_threshold` (float, default=0.7): Confidence threshold for routing decision
57+
- `max_tokens` (int, default=16382): Maximum tokens for generation
58+
- `temperature` (float, default=0.7): Sampling temperature
59+
- `top_p` (float, default=0.95): Top-p sampling parameter
60+
- `enable_self_discover` (bool, default=True): Whether to use SELF-DISCOVER structure
61+
- `reasoning_modules_limit` (int, default=5): Max reasoning modules to select
62+
63+
## Atomic Reasoning Modules
64+
65+
The plugin includes 39 reasoning modules covering:
66+
- Critical thinking and analysis
67+
- Creative and innovative approaches
68+
- Systems thinking and holistic analysis
69+
- Risk assessment and evaluation
70+
- Step-by-step decomposition
71+
- Collaborative and perspective-taking approaches
72+
- Reflective and meta-cognitive strategies
73+
74+
## Examples
75+
76+
### Mathematical Problem Solving
77+
Input: "Solve: If a train travels 120 miles in 2 hours, how long will it take to travel 300 miles?"
78+
79+
The plugin will:
80+
1. Discover a reasoning structure focused on rate calculations
81+
2. Generate multiple solution paths
82+
3. Evaluate consistency and select the most reliable answer
83+
84+
### Complex Reasoning Task
85+
Input: "Analyze the potential long-term economic impacts of remote work adoption"
86+
87+
The plugin will:
88+
1. Select reasoning modules like systems thinking, risk analysis, and critical thinking
89+
2. Create a structured analysis plan
90+
3. Generate multiple perspectives and synthesize the most coherent analysis
91+
92+
## Implementation Details
93+
94+
- **Reasoning Extraction**: Automatically extracts content from `<think></think>` tags
95+
- **Consistency Scoring**: Uses multiple metrics including answer similarity and reasoning coherence
96+
- **Adaptive Thresholds**: Can be fine-tuned based on model performance
97+
- **Token Efficiency**: Optimized to minimize redundant computation while maximizing reasoning quality
98+
99+
## Performance
100+
101+
The Deep Think approach has shown significant improvements on complex reasoning tasks, with particularly strong results on mathematical competition problems.
102+
103+
### AIME 2025 Results
104+
105+
| Model | Approach | Accuracy | Improvement |
106+
|-------|----------|----------|-------------|
107+
| qwen-3-32b | Baseline | 43.33% | - |
108+
| qwen-3-32b | Deep Think | **63.33%** | **+20.00pp** |
109+
110+
*Experimental settings: max_completion_tokens=16382, temperature=0.7, top_p=0.95*
111+
112+
**Key Findings:**
113+
- **46% relative improvement** over baseline on mathematical reasoning
114+
- **Cerebras inference** was crucial for enabling high inference-time compute without latency penalty
115+
- The combination of SELF-DISCOVER structure discovery and uncertainty-routed sampling proved particularly effective for competition mathematics
116+
- Enhanced accuracy on multi-step problems requiring systematic reasoning
117+
118+
### Other Improvements
119+
120+
The Deep Think approach has also demonstrated:
121+
- Enhanced accuracy on multi-step problems
122+
- Better handling of ambiguous or open-ended questions
123+
- Improved consistency across different problem types
124+
- Reduced hallucination through confidence-based routing
125+
126+
## Limitations
127+
128+
- Increased computational cost due to multiple sampling
129+
- Longer response times for complex reasoning tasks
130+
- Requires models capable of structured thinking output
131+
- May over-engineer solutions for simple problems
132+
133+
## References
134+
135+
- Zhou, P. et al. "SELF-DISCOVER: Large Language Models Self-Compose Reasoning Structures" (2024)
136+
- Uncertainty-routed chain-of-thought approaches in advanced reasoning systems

optillm/plugins/deepthink/__init__.py

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
"""
2+
Deep Think Plugin for OptILM
3+
4+
A plugin that combines SELF-DISCOVER framework with uncertainty-routed
5+
chain-of-thought for enhanced reasoning capabilities.
6+
"""
Lines changed: 234 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,234 @@
1+
"""
2+
Atomic Reasoning Modules for SELF-DISCOVER Framework
3+
4+
This module contains the 39 reasoning modules as described in the SELF-DISCOVER paper.
5+
These modules represent high-level cognitive heuristics for problem-solving.
6+
"""
7+
8+
# 39 Atomic Reasoning Modules from SELF-DISCOVER paper
9+
REASONING_MODULES = [
10+
{
11+
"id": 1,
12+
"name": "experimental_design",
13+
"description": "How could I devise an experiment to help solve that problem?"
14+
},
15+
{
16+
"id": 2,
17+
"name": "iterative_problem_solving",
18+
"description": "Make a list of ideas for solving this problem, and apply them one by one to the problem to see if any progress can be made."
19+
},
20+
{
21+
"id": 3,
22+
"name": "progress_measurement",
23+
"description": "How could I measure progress on this problem?"
24+
},
25+
{
26+
"id": 4,
27+
"name": "problem_simplification",
28+
"description": "How can I simplify the problem so that it is easier to solve?"
29+
},
30+
{
31+
"id": 5,
32+
"name": "assumption_analysis",
33+
"description": "What are the key assumptions underlying this problem?"
34+
},
35+
{
36+
"id": 6,
37+
"name": "risk_assessment",
38+
"description": "What are the potential risks and drawbacks of each solution?"
39+
},
40+
{
41+
"id": 7,
42+
"name": "perspective_analysis",
43+
"description": "What are the alternative perspectives or viewpoints on this problem?"
44+
},
45+
{
46+
"id": 8,
47+
"name": "long_term_implications",
48+
"description": "What are the long-term implications of this problem and its solutions?"
49+
},
50+
{
51+
"id": 9,
52+
"name": "problem_decomposition",
53+
"description": "How can I break down this problem into smaller, more manageable parts?"
54+
},
55+
{
56+
"id": 10,
57+
"name": "critical_thinking",
58+
"description": "Critical Thinking: This style involves analyzing the problem from different perspectives, questioning assumptions, and evaluating the evidence or information available. It focuses on logical reasoning, evidence-based decision-making, and identifying potential biases or flaws in thinking."
59+
},
60+
{
61+
"id": 11,
62+
"name": "creative_thinking",
63+
"description": "Try creative thinking, generate innovative and out-of-the-box ideas to solve the problem. Explore unconventional solutions, thinking beyond traditional boundaries, and encouraging imagination and originality."
64+
},
65+
{
66+
"id": 12,
67+
"name": "collaborative_thinking",
68+
"description": "Seek input and collaboration from others to solve the problem. Emphasize teamwork, open communication, and leveraging the diverse perspectives and expertise of a group to come up with effective solutions."
69+
},
70+
{
71+
"id": 13,
72+
"name": "systems_thinking",
73+
"description": "Use systems thinking: Consider the problem as part of a larger system and understanding the interconnectedness of various elements. Focus on identifying the underlying causes, feedback loops, and interdependencies that influence the problem, and developing holistic solutions that address the system as a whole."
74+
},
75+
{
76+
"id": 14,
77+
"name": "risk_analysis",
78+
"description": "Use Risk Analysis: Evaluate potential risks, uncertainties, and tradeoffs associated with different solutions or approaches to a problem. Emphasize assessing the potential consequences and likelihood of success or failure, and making informed decisions based on a balanced analysis of risks and benefits."
79+
},
80+
{
81+
"id": 15,
82+
"name": "reflective_thinking",
83+
"description": "Use Reflective Thinking: Step back from the problem, take the time for introspection and self-reflection. Examine personal biases, assumptions, and mental models that may influence problem-solving, and being open to learning from past experiences to improve future approaches."
84+
},
85+
{
86+
"id": 16,
87+
"name": "core_issue_identification",
88+
"description": "What is the core issue or problem that needs to be addressed?"
89+
},
90+
{
91+
"id": 17,
92+
"name": "causal_analysis",
93+
"description": "What are the underlying causes or factors contributing to the problem?"
94+
},
95+
{
96+
"id": 18,
97+
"name": "historical_analysis",
98+
"description": "Are there any potential solutions or strategies that have been tried before? If yes, what were the outcomes and lessons learned?"
99+
},
100+
{
101+
"id": 19,
102+
"name": "obstacle_identification",
103+
"description": "What are the potential obstacles or challenges that might arise in solving this problem?"
104+
},
105+
{
106+
"id": 20,
107+
"name": "data_analysis",
108+
"description": "Are there any relevant data or information that can provide insights into the problem? If yes, what data sources are available, and how can they be analyzed?"
109+
},
110+
{
111+
"id": 21,
112+
"name": "stakeholder_analysis",
113+
"description": "Are there any stakeholders or individuals who are directly affected by the problem? What are their perspectives and needs?"
114+
},
115+
{
116+
"id": 22,
117+
"name": "resource_analysis",
118+
"description": "What resources (financial, human, technological, etc.) are needed to tackle the problem effectively?"
119+
},
120+
{
121+
"id": 23,
122+
"name": "success_metrics",
123+
"description": "How can progress or success in solving the problem be measured or evaluated?"
124+
},
125+
{
126+
"id": 24,
127+
"name": "metric_identification",
128+
"description": "What indicators or metrics can be used?"
129+
},
130+
{
131+
"id": 25,
132+
"name": "problem_type_technical",
133+
"description": "Is the problem a technical or practical one that requires a specific expertise or skill set? Or is it more of a conceptual or theoretical problem?"
134+
},
135+
{
136+
"id": 26,
137+
"name": "physical_constraints",
138+
"description": "Does the problem involve a physical constraint, such as limited resources, infrastructure, or space?"
139+
},
140+
{
141+
"id": 27,
142+
"name": "behavioral_aspects",
143+
"description": "Is the problem related to human behavior, such as a social, cultural, or psychological issue?"
144+
},
145+
{
146+
"id": 28,
147+
"name": "decision_making",
148+
"description": "Does the problem involve decision-making or planning, where choices need to be made under uncertainty or with competing objectives?"
149+
},
150+
{
151+
"id": 29,
152+
"name": "analytical_problem",
153+
"description": "Is the problem an analytical one that requires data analysis, modeling, or optimization techniques?"
154+
},
155+
{
156+
"id": 30,
157+
"name": "design_challenge",
158+
"description": "Is the problem a design challenge that requires creative solutions and innovation?"
159+
},
160+
{
161+
"id": 31,
162+
"name": "systemic_issues",
163+
"description": "Does the problem require addressing systemic or structural issues rather than just individual instances?"
164+
},
165+
{
166+
"id": 32,
167+
"name": "time_sensitivity",
168+
"description": "Is the problem time-sensitive or urgent, requiring immediate attention and action?"
169+
},
170+
{
171+
"id": 33,
172+
"name": "typical_solutions",
173+
"description": "What kinds of solution typically are produced for this kind of problem specification?"
174+
},
175+
{
176+
"id": 34,
177+
"name": "alternative_solutions",
178+
"description": "Given the problem specification and the current best solution, have a guess about other possible solutions."
179+
},
180+
{
181+
"id": 35,
182+
"name": "radical_rethinking",
183+
"description": "Let's imagine the current best solution is totally wrong, what other ways are there to think about the problem specification?"
184+
},
185+
{
186+
"id": 36,
187+
"name": "solution_modification",
188+
"description": "What is the best way to modify this current best solution, given what you know about these kinds of problem specification?"
189+
},
190+
{
191+
"id": 37,
192+
"name": "novel_solution",
193+
"description": "Ignoring the current best solution, create an entirely new solution to the problem."
194+
},
195+
{
196+
"id": 38,
197+
"name": "step_by_step",
198+
"description": "Let's think step by step."
199+
},
200+
{
201+
"id": 39,
202+
"name": "step_by_step_plan",
203+
"description": "Let's make a step by step plan and implement it with good notion and explanation."
204+
}
205+
]
206+
207+
def get_all_modules():
208+
"""Return all 39 reasoning modules."""
209+
return REASONING_MODULES
210+
211+
def get_modules_by_category():
212+
"""Categorize modules by their primary focus."""
213+
categories = {
214+
"analytical": [1, 3, 5, 10, 14, 17, 20, 23, 24, 25, 29],
215+
"creative": [2, 4, 11, 30, 34, 35, 37],
216+
"systematic": [9, 13, 16, 18, 22, 31, 33, 36, 38, 39],
217+
"collaborative": [7, 12, 15, 21],
218+
"risk_oriented": [6, 8, 14, 19],
219+
"behavioral": [27, 28],
220+
"constraint_focused": [26, 32]
221+
}
222+
223+
return {
224+
category: [REASONING_MODULES[i-1] for i in indices]
225+
for category, indices in categories.items()
226+
}
227+
228+
def get_modules_by_ids(module_ids):
229+
"""Get specific modules by their IDs."""
230+
return [module for module in REASONING_MODULES if module["id"] in module_ids]
231+
232+
def get_module_descriptions():
233+
"""Get just the descriptions for prompting."""
234+
return [f"{module['name']}: {module['description']}" for module in REASONING_MODULES]

0 commit comments

Comments
 (0)