
Commit 0297900

Merge pull request #216 from codelion/feat-deep-research
Feat deep research
2 parents aabd5f1 + 4f3170f commit 0297900

60 files changed: 13,446 additions and 9 deletions


.gitignore

Lines changed: 1 addition & 0 deletions
@@ -171,3 +171,4 @@ cython_debug/
 scripts/results/
 results/
 test_results.json
+deep_research_reports/

README.md

Lines changed: 3 additions & 0 deletions
@@ -378,6 +378,8 @@ Check this log file for connection issues, tool execution errors, and other diag
 | Execute Code | `executecode` | Enables use of code interpreter to execute python code in requests and LLM generated responses |
 | JSON | `json` | Enables structured outputs using the outlines library, supports pydantic types and JSON schema |
 | GenSelect | `genselect` | Generative Solution Selection - generates multiple candidates and selects the best based on quality criteria |
+| Web Search | `web_search` | Performs Google searches using Chrome automation (Selenium) to gather search results and URLs |
+| [Deep Research](optillm/plugins/deep_research) | `deep_research` | Implements Test-Time Diffusion Deep Researcher (TTD-DR) for comprehensive research reports using iterative refinement |
 
 ## Available parameters
 
@@ -629,6 +631,7 @@ See `tests/README.md` for more details on the test structure and how to write ne
 - [Patched MOA: optimizing inference for diverse software development tasks](https://arxiv.org/abs/2407.18521) - [Implementation](optillm/moa.py)
 - [Patched RTC: evaluating LLMs for diverse software development tasks](https://arxiv.org/abs/2407.16557) - [Implementation](optillm/rto.py)
 - [AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset](https://arxiv.org/abs/2504.16891) - [Implementation](optillm/plugins/genselect_plugin.py)
+- [Test-Time Diffusion Deep Researcher (TTD-DR): Think More, Research More, Answer Better!](https://arxiv.org/abs/2507.16075v1) - [Implementation](optillm/plugins/deep_research)
 
 ## Citation
 
optillm.py

Lines changed: 43 additions & 6 deletions
@@ -773,7 +773,11 @@ def parse_args():
         if extra and extra[0]:  # Check if there are choices for this argument
             parser.add_argument(arg, type=type_, default=default, help=help_text, choices=extra[0])
         else:
-            parser.add_argument(arg, type=type_, default=default, help=help_text)
+            if type_ == bool:
+                # For boolean flags, use store_true action
+                parser.add_argument(arg, action='store_true', default=default, help=help_text)
+            else:
+                parser.add_argument(arg, type=type_, default=default, help=help_text)
 
     # Special handling for best_of_n to support both formats
     best_of_n_default = int(os.environ.get("OPTILLM_BEST_OF_N", 3))
@@ -855,12 +859,45 @@ def main():
         base_url = f"http://localhost:{port}/v1"
         logger.info(f"Launching Gradio interface connected to {base_url}")
 
-        # Launch Gradio interface
-        demo = gr.load_chat(
-            base_url,
-            model=server_config['model'],
-            token=None
+        # Create custom chat function with extended timeout
+        def chat_with_optillm(message, history):
+            import httpx
+            from openai import OpenAI
+
+            # Create client with extended timeout and no retries
+            custom_client = OpenAI(
+                api_key="optillm",
+                base_url=base_url,
+                timeout=httpx.Timeout(1800.0, connect=5.0),  # 30 min timeout
+                max_retries=0  # No retries - prevents duplicate requests
+            )
+
+            # Convert history to messages format
+            messages = []
+            for h in history:
+                if h[0]:  # User message
+                    messages.append({"role": "user", "content": h[0]})
+                if h[1]:  # Assistant message
+                    messages.append({"role": "assistant", "content": h[1]})
+            messages.append({"role": "user", "content": message})
+
+            # Make request
+            try:
+                response = custom_client.chat.completions.create(
+                    model=server_config['model'],
+                    messages=messages
+                )
+                return response.choices[0].message.content
+            except Exception as e:
+                return f"Error: {str(e)}"
+
+        # Create Gradio interface with queue for long operations
+        demo = gr.ChatInterface(
+            chat_with_optillm,
+            title="OptILLM Chat Interface",
+            description=f"Connected to OptILLM proxy at {base_url}"
         )
+        demo.queue()  # Enable queue to handle long operations properly
         demo.launch(server_name="0.0.0.0", share=False)
     except ImportError:
         logger.error("Gradio is required for GUI. Install it with: pip install gradio")
Lines changed: 263 additions & 0 deletions
@@ -0,0 +1,263 @@
# Deep Research Plugin

## Overview

The Deep Research plugin implements the **Test-Time Diffusion Deep Researcher (TTD-DR)** algorithm, a state-of-the-art approach for comprehensive research report generation. This implementation is based on the paper ["Deep Researcher with Test-Time Diffusion"](https://arxiv.org/abs/2507.16075v1) and provides iterative, in-depth research capabilities for complex queries.

## Algorithm Overview

The TTD-DR algorithm treats research as a **diffusion process** with iterative refinement through denoising and retrieval. Unlike traditional search approaches that return raw results, TTD-DR performs:

1. **Query Decomposition** - Breaks complex queries into focused sub-questions
2. **Iterative Search** - Performs multiple rounds of web search based on identified gaps
3. **Content Synthesis** - Uses advanced memory processing for unbounded context
4. **Completeness Evaluation** - Automatically assesses research quality and identifies missing aspects
5. **Report Generation** - Produces structured, academic-quality reports with proper citations

## Architecture

```
deep_research/
├── __init__.py            # Package initialization
├── research_engine.py     # Core TTD-DR implementation
└── README.md              # This documentation

../deep_research_plugin.py # OptILLM plugin interface
```

### Key Components

#### 1. `DeepResearcher` Class (`research_engine.py`)

The core implementation of the TTD-DR algorithm exposes the following key methods (a sketch of how they compose follows this list):

- **`decompose_query()`** - Implements the query planning phase
- **`perform_web_search()`** - Orchestrates web search using individual queries to avoid truncation
- **`extract_and_fetch_urls()`** - Extracts sources and fetches content
- **`synthesize_with_memory()`** - Processes unbounded context with citations
- **`evaluate_completeness()`** - Assesses research gaps
- **`generate_structured_report()`** - Creates academic-quality reports
- **`research()`** - Main research loop implementing TTD-DR
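
The following is a hypothetical sketch of how these methods might compose into the research loop. Only the method names come from the list above; the signatures, return shapes, and stopping logic are illustrative assumptions, not the actual implementation.

```python
# Hypothetical driver for the TTD-DR loop (illustrative only).
def run_ttd_dr(researcher, query: str, max_iterations: int = 5) -> str:
    sub_queries = researcher.decompose_query(query)               # 1. query planning
    synthesis = ""

    for _ in range(max_iterations):
        results = researcher.perform_web_search(sub_queries)      # 2. one search per sub-query
        sources = researcher.extract_and_fetch_urls(results)      #    fetch source content
        synthesis = researcher.synthesize_with_memory(query, synthesis, sources)    # 3. cited synthesis
        complete, sub_queries = researcher.evaluate_completeness(query, synthesis)  # 4. gap check
        if complete:
            break                                                  # no gaps left to research

    return researcher.generate_structured_report(query, synthesis)  # 5. final report
```

In the plugin itself this orchestration lives in `research()`, and the `run()` entry point returns both the final report and a token count.
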
#### 2. Plugin Interface (`deep_research_plugin.py`)

Minimal interface that integrates with OptILLM's plugin system:

```python
def run(system_prompt: str, initial_query: str, client, model: str, request_config: Optional[Dict] = None) -> Tuple[str, int]
```

## Implementation Details

### Research Process Flow

```mermaid
graph TD
    A[Initial Query] --> B[Query Decomposition]
    B --> C[Web Search]
    C --> D[Content Extraction]
    D --> E[Memory Synthesis]
    E --> F[Completeness Evaluation]
    F --> G{Complete?}
    G -->|No| H[Generate Focused Queries]
    H --> C
    G -->|Yes| I[Generate Structured Report]
    I --> J[Final Report with Citations]
```

### Citation System

The plugin implements a sophisticated citation tracking system (a minimal sketch follows this list):

- **Inline Citations**: `[1]`, `[2]`, `[3]` format throughout the text
- **Source Tracking**: Maps citation numbers to source metadata
- **Deduplication**: Avoids duplicate citations for the same URL
- **Academic Format**: Proper reference formatting with URLs and access dates
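
A minimal sketch of such a citation registry; the class and field names here are hypothetical and do not reflect the plugin's internal data structures.

```python
# Hypothetical citation registry: stable numbering, per-URL deduplication,
# and an academic-style reference list.
from datetime import date

class CitationTracker:
    def __init__(self):
        self._by_url = {}   # url -> citation number
        self._sources = []  # ordered source metadata

    def cite(self, url: str, title: str) -> str:
        """Return an inline marker like "[1]", reusing the number for known URLs."""
        if url not in self._by_url:
            self._by_url[url] = len(self._sources) + 1
            self._sources.append({"title": title, "url": url, "accessed": date.today().isoformat()})
        return f"[{self._by_url[url]}]"

    def references(self) -> str:
        """Render a numbered reference list with URLs and access dates."""
        return "\n".join(
            f"[{i}] {s['title']}. {s['url']} (accessed {s['accessed']})"
            for i, s in enumerate(self._sources, start=1)
        )
```
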
### Report Structure

Generated reports follow academic standards:

1. **Executive Summary** - Key findings overview
2. **Introduction** - Research question and significance
3. **Background** - Context and foundational information
4. **Key Findings** - Main discoveries with citations
5. **Analysis and Discussion** - Interpretation and implications
6. **Conclusion** - Summary and final thoughts
7. **Recommendations** - Actionable suggestions (when applicable)
8. **Limitations and Future Research** - Research constraints and future directions
9. **References** - Complete source list with metadata

## Configuration

The plugin accepts the following configuration parameters:

```python
request_config = {
    "max_iterations": 5,  # Maximum research iterations (default: 5)
    "max_sources": 10     # Maximum sources per search (default: 10)
}
```

## Dependencies

The Deep Research plugin requires these OptILLM plugins:

- **`web_search`** - Chrome-based Google search automation
- **`readurls`** - Content extraction from URLs
- **`memory`** - Unbounded context processing and synthesis

## Usage Examples

### Basic Usage

```python
from optillm.plugins.deep_research_plugin import run

result, tokens = run(
    system_prompt="You are a research assistant",
    initial_query="What are the latest advances in quantum error correction?",
    client=openai_client,
    model="gpt-4o-mini"
)
```

### Advanced Configuration

```python
result, tokens = run(
    system_prompt="You are a research assistant",
    initial_query="Analyze the impact of AI on healthcare diagnostics",
    client=openai_client,
    model="gpt-4o-mini",
    request_config={
        "max_iterations": 3,
        "max_sources": 8
    }
)
```

### With OptILLM Server

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="optillm")

response = client.chat.completions.create(
    model="deep_research-gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Research the latest developments in renewable energy storage"}
    ],
    extra_body={
        "request_config": {
            "max_iterations": 3,
            "max_sources": 10
        }
    }
)
```

## Performance Characteristics

- **Time Complexity**: O(iterations × sources × content_size)
- **Typical Duration**: 2-5 minutes per research query
- **Token Usage**: 1,000-5,000 tokens per iteration
- **Memory Requirements**: Scales with content volume and context size

## Reasoning Model Compatibility

The plugin is fully compatible with reasoning models that include internal thinking processes:

- **Automatic Cleanup**: Removes `<think>`, `<thinking>`, `<reasoning>`, `<reflection>` tags from all responses
- **Professional Output**: Ensures final reports contain only clean, formatted content
- **Seamless Integration**: Works transparently with any model type
- **Supported Tags**: `<think>`, `<thinking>`, `<reasoning>`, `<thought>`, `<reflect>`, `<reflection>`

Example cleanup:

```
Input: "<think>Let me analyze this</think>\n\n# Research Report\nContent here..."
Output: "# Research Report\nContent here..."
```
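
One way to implement this kind of cleanup is a single regex pass over the listed tags. The helper below is an illustrative sketch, not the plugin's actual code:

```python
# Illustrative tag-stripping helper: removes the reasoning tags listed above,
# including their contents, then trims leading whitespace.
import re

REASONING_TAGS = ("think", "thinking", "reasoning", "thought", "reflect", "reflection")
_TAG_RE = re.compile(
    r"<(" + "|".join(REASONING_TAGS) + r")>.*?</\1>",
    flags=re.DOTALL | re.IGNORECASE,
)

def strip_reasoning(text: str) -> str:
    """Drop <think>...</think>-style blocks and tidy leftover blank lines."""
    return _TAG_RE.sub("", text).lstrip()

print(strip_reasoning("<think>Let me analyze this</think>\n\n# Research Report\nContent here..."))
# -> "# Research Report\nContent here..."
```
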
## Error Handling

The plugin includes comprehensive error handling (a minimal fallback pattern is sketched below):

1. **Graceful Degradation** - Falls back to a basic LLM response on critical failures
2. **Timeout Management** - Handles web search and content fetching timeouts
3. **Rate Limiting** - Includes delays to avoid search engine restrictions
4. **Validation** - Input validation and configuration checks
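
A hedged sketch of the graceful-degradation idea: if the research pipeline raises, answer the query directly with the underlying model instead of failing the request. Function and variable names are illustrative, not the plugin's actual code.

```python
# Illustrative graceful-degradation wrapper.
def run_with_fallback(client, model, system_prompt, initial_query, researcher):
    try:
        # Full TTD-DR pipeline (assumed entry point for illustration).
        return researcher.research(initial_query)
    except Exception as exc:
        # Degrade to a single direct completion with the same prompts.
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": initial_query},
            ],
        )
        return f"(research pipeline unavailable: {exc})\n\n" + response.choices[0].message.content
```
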
## Quality Assurance

The implementation follows the TTD-DR paper's quality criteria:

- **Comprehensive Coverage** - Addresses all aspects of the research query
- **Source Diversity** - Incorporates multiple credible sources
- **Citation Accuracy** - Proper attribution for all claims and findings
- **Academic Rigor** - Maintains objectivity and scholarly tone
- **Iterative Refinement** - Continuously improves research quality
- **Clean Output** - Automatically removes reasoning tags (`<think>`, `<thinking>`, etc.) for professional reports

## Comparison to Simple Search

| Feature | Simple Search | Deep Research (TTD-DR) |
|---------|---------------|------------------------|
| Query Processing | Single query | Multi-query decomposition |
| Iteration | Single pass | Multiple refinement cycles |
| Content Synthesis | Raw results | Comprehensive analysis |
| Gap Detection | None | Automatic completeness evaluation |
| Citations | Manual | Automatic with tracking |
| Report Format | Unstructured | Academic report structure |
| Context Handling | Limited | Unbounded via memory plugin |

## Future Enhancements

Potential improvements aligned with research directions:

1. **Parallel Processing** - Concurrent search execution
2. **Domain Specialization** - Field-specific research strategies
3. **Multimedia Integration** - Image and video content analysis
4. **Real-time Updates** - Live research monitoring and updates
5. **Collaborative Research** - Multi-agent research coordination

## Troubleshooting

### Common Issues

1. **Chrome Browser Issues**
   - Ensure Chrome is installed and accessible
   - Check for CAPTCHA requirements (the plugin supports manual solving)

2. **Rate Limiting**
   - The plugin includes automatic delays
   - Consider increasing delay settings if rate limiting is aggressive

3. **Memory Constraints**
   - Large research queries may consume significant memory
   - Monitor token usage and consider iteration limits

4. **Citation Extraction**
   - URL parsing depends on the search result format
   - The plugin includes fallback parsing methods

5. **Search Query Processing**
   - The plugin issues an individual search for each sub-query to prevent truncation
   - If search results seem incomplete, check that the decomposed queries are reasonable
   - Each sub-query is processed separately to ensure complete coverage

### Debug Mode

To debug, follow the console logs during research execution; the plugin logs each research phase in detail.

## Contributing

When contributing to the Deep Research plugin:

1. Maintain compatibility with the TTD-DR algorithm
2. Preserve citation tracking functionality
3. Ensure academic report structure compliance
4. Test with various query types and complexity levels
5. Document any new configuration options
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
"""
Deep Research Plugin Package

Implementation of Test-Time Diffusion Deep Researcher (TTD-DR) algorithm
for comprehensive research report generation.
"""

from .research_engine import DeepResearcher

__version__ = "1.0.0"
__author__ = "OptILLM Contributors"
__description__ = "TTD-DR Implementation for Deep Research"
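
Because the package re-exports `DeepResearcher`, downstream code can import it from the package root; a small illustration (assuming the plugin is installed as part of `optillm`):

```python
# Enabled by the re-export in this __init__.py:
from optillm.plugins.deep_research import DeepResearcher

# Equivalent to importing from the module directly:
# from optillm.plugins.deep_research.research_engine import DeepResearcher
```
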

0 commit comments
