Feat deep research #216
Merged
Introduces `deep_research_plugin.py` and `web_search_plugin.py` to provide advanced research and web search capabilities. Updates `requirements.txt` to include selenium and webdriver-manager dependencies. Enhances plugin tests to cover the new plugins and updates expected plugin lists.
Introduces the Deep Research plugin based on the Test-Time Diffusion Deep Researcher (TTD-DR) algorithm, including core implementation, documentation, and OptILLM plugin interface. Adds new package files for query decomposition, iterative web search, synthesis, completeness evaluation, and structured report generation with citations. Updates .gitignore to exclude deep_research_reports/.
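The stages listed above (query decomposition, iterative web search, synthesis, completeness evaluation) can be pictured as a simple loop. The following is a minimal sketch, not the plugin's implementation: `run_research_loop`, its parameters, and the `threshold` default are all hypothetical, and each stage is passed in as a callable so the sketch makes no claims about how the PR implements them.

```python
from typing import Callable, List, Tuple


def run_research_loop(
    query: str,
    decompose: Callable[[str, str], List[str]],
    search: Callable[[str], List[str]],
    synthesize: Callable[[str, List[str]], str],
    completeness: Callable[[str], float],
    max_iterations: int = 3,
    threshold: float = 0.8,
) -> Tuple[str, int]:
    """Iterate decompose -> search -> synthesize until the draft looks complete.

    All names here are illustrative assumptions, not the plugin's API.
    Returns the final draft and the number of iterations actually used.
    """
    notes: List[str] = []
    draft = ""
    iterations_used = 0
    for _ in range(max_iterations):
        iterations_used += 1
        # Break the query into sub-queries, informed by the current draft.
        for sub_query in decompose(query, draft):
            notes.extend(search(sub_query))
        # Fold everything gathered so far into a new draft.
        draft = synthesize(query, notes)
        # Stop early once the draft is judged complete enough.
        if completeness(draft) >= threshold:
            break
    return draft, iterations_used
```

Injecting the stages as callables keeps the loop testable with stubs in place of an LLM and a browser.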
Changed all references from 'A Statistical Framework for Deep Researcher' to 'Deep Researcher with Test-Time Diffusion' and updated the associated arXiv URL in the README, `research_engine.py`, and `deep_research_plugin.py` for accuracy and consistency.
Updated the research engine to perform web searches for each sub-query separately, preventing result truncation and improving coverage. The README was updated to document this change and provide guidance on search query processing.
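As a rough illustration of searching each sub-query separately, the sketch below issues one search call per sub-query instead of sending them all as a single combined query. The function name, the `search_fn` callable, and `max_results_per_query` are assumptions made for this example; the PR does not show the engine's actual signature here.

```python
from typing import Callable, Dict, List


def search_sub_queries(
    sub_queries: List[str],
    search_fn: Callable[[str], List[str]],
    max_results_per_query: int = 5,
) -> Dict[str, List[str]]:
    """Run one search per sub-query and keep the results grouped by query.

    `search_fn` is a hypothetical stand-in for whatever performs the web search.
    """
    results: Dict[str, List[str]] = {}
    for raw_query in sub_queries:
        query = raw_query.strip()
        # Skip empty strings and exact duplicates.
        if not query or query in results:
            continue
        # Each sub-query gets its own result budget, so one long combined
        # query cannot crowd out or truncate results for the others.
        results[query] = search_fn(query)[:max_results_per_query]
    return results
```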
Introduces a `clean_reasoning_tags` function to remove reasoning tags (e.g., <think>, <reflection>) from model responses for professional output. Updates DeepResearcher to apply this cleanup at key response stages and documents compatibility and cleanup behavior in the README.
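For readers unfamiliar with this kind of cleanup, here is a minimal sketch of how reasoning tags could be stripped with regular expressions. It is not the PR's `clean_reasoning_tags`: the function name, the tag list, and the whitespace handling below are assumptions for illustration only.

```python
import re

# Assumed tag names; the PR mentions <think> and <reflection> as examples.
REASONING_TAGS = ("think", "thinking", "reasoning", "reflection")


def clean_reasoning_tags_sketch(text: str) -> str:
    """Remove complete <tag>...</tag> reasoning blocks from a model response."""
    if not text:
        return text
    for tag in REASONING_TAGS:
        # DOTALL lets the block span several lines; the lazy quantifier
        # stops at the first matching closing tag.
        pattern = rf"<{tag}>.*?</{tag}>"
        text = re.sub(pattern, "", text, flags=re.DOTALL | re.IGNORECASE)
    # Collapse the blank lines left behind by removed blocks.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

This sketch only handles well-formed, closed tags; an unclosed `<think>` is left untouched.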
Added the Test-Time Diffusion Deep Researcher (TTD-DR) algorithm to deep_research/research_engine.py, including draft generation, gap analysis, denoising, self-evolution, and finalization steps. Enhanced extract_search_queries in web_search_plugin.py to allow periods in queries, improving extraction for cases like 'Python 3.12'.
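To see why allowing periods matters for queries such as 'Python 3.12', the sketch below contrasts a pattern that stops at the first period with one that permits periods and trims only a trailing one. Both patterns and function names are hypothetical; the actual regular expression in `extract_search_queries` is not shown in this PR description.

```python
import re
from typing import List

# Assumed "before" behaviour: the capture stops at the first period.
_STOPS_AT_PERIOD = re.compile(r"search for[:\s]+([^.,;?!\n]+)", re.IGNORECASE)
# Assumed "after" behaviour: periods are allowed inside the query.
_ALLOWS_PERIODS = re.compile(r"search for[:\s]+([^,;?!\n]+)", re.IGNORECASE)


def extract_queries_truncating(text: str) -> List[str]:
    """Illustrates the failure mode: version numbers get cut at the dot."""
    return [m.strip() for m in _STOPS_AT_PERIOD.findall(text)]


def extract_queries_with_periods(text: str) -> List[str]:
    """Keeps inner periods and drops only a sentence-ending one."""
    return [m.rstrip(". ").strip() for m in _ALLOWS_PERIODS.findall(text)]
```

With the input `"Search for Python 3.12 release notes."`, the first function yields `Python 3` while the second yields `Python 3.12 release notes`.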
Added a robust cleanup function to remove all research placeholder tags from final reports. Improved gap analysis to prioritize placeholder tags and updated search logic to address high-priority gaps first. Increased default max_iterations and max_sources for more thorough research. Updated final report synthesis to ensure no placeholder tags remain.
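A minimal sketch of placeholder cleanup follows. The bracketed tag format (`[NEEDS RESEARCH: ...]`, `[SOURCE NEEDED: ...]`, `[PLACEHOLDER ...]`) is an assumption made for this example, as are the function and constant names; the PR description does not spell out the tag syntax.

```python
import re

# Assumed placeholder syntax: a bracketed marker with optional detail text.
PLACEHOLDER_PATTERN = re.compile(
    r"\[(?:NEEDS RESEARCH|SOURCE NEEDED|PLACEHOLDER)[^\]]*\]",
    re.IGNORECASE,
)


def remove_placeholder_tags(report: str) -> str:
    """Strip placeholder markers and tidy the spacing they leave behind."""
    cleaned = PLACEHOLDER_PATTERN.sub("", report)
    # Collapse runs of spaces or tabs created by the removal.
    cleaned = re.sub(r"[ \t]{2,}", " ", cleaned)
    # Remove a stray space before punctuation, e.g. "Claim ." -> "Claim."
    cleaned = re.sub(r"[ \t]+([.,;:])", r"\1", cleaned)
    return cleaned.strip()


def has_placeholder_tags(report: str) -> bool:
    """True if any placeholder marker is still present."""
    return PLACEHOLDER_PATTERN.search(report) is not None
```

A check like `has_placeholder_tags` could also feed gap analysis, since a remaining marker points directly at a section that still needs research.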
Introduces BrowserSessionManager to enable reuse of a single browser session across multiple web searches, improving efficiency and reliability. DeepResearcher now uses a shared browser session for all search operations within a research run, and web_search_plugin's run function supports session reuse via the new manager.
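The session-reuse idea can be sketched as a small context manager that creates the browser lazily on first use and hands the same object back afterwards. This is not the PR's `BrowserSessionManager`: the class name, the `factory` argument, and the assumption that the browser object exposes `quit()` (as Selenium WebDriver instances do) are illustrative choices.

```python
from typing import Any, Callable, Optional


class ReusableSessionSketch:
    """Lazily create one browser-like object and reuse it for every search."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        # `factory` is a hypothetical callable that builds the browser.
        self._factory = factory
        self._session: Optional[Any] = None

    def get(self) -> Any:
        # Create the session only on first use, then keep returning it.
        if self._session is None:
            self._session = self._factory()
        return self._session

    def close(self) -> None:
        # Assumes the session exposes quit(); safe to call more than once.
        if self._session is not None:
            self._session.quit()
            self._session = None

    def __enter__(self) -> "ReusableSessionSketch":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Always clean up; returning False lets exceptions propagate.
        self.close()
        return False
```

Used as `with ReusableSessionSketch(make_browser) as manager:`, every `manager.get()` call inside the block shares one browser, and the browser is closed once when the block exits, even on error.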
Introduces session_state.py to manage browser sessions for concurrent deep research requests, ensuring thread safety and proper cleanup. Updates DeepResearcher to use unique session IDs and centralized session management, and improves search query extraction logic in web_search_plugin.py for more robust handling of search commands.
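One common way to get thread safety for concurrent requests is a lock-protected registry keyed by session ID. The sketch below shows that pattern under stated assumptions: it is not the contents of `session_state.py`, and the class, method, and helper names are hypothetical.

```python
import threading
import uuid
from typing import Any, Callable, Dict


def new_session_id() -> str:
    """Generate a unique ID for one research request (illustrative helper)."""
    return uuid.uuid4().hex


class SessionRegistrySketch:
    """Map session IDs to browser-like objects behind a single lock."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._lock = threading.Lock()
        self._sessions: Dict[str, Any] = {}

    def get_or_create(self, session_id: str) -> Any:
        # The lock makes check-then-create atomic, so two threads using
        # the same ID can never create two sessions.
        with self._lock:
            if session_id not in self._sessions:
                self._sessions[session_id] = self._factory()
            return self._sessions[session_id]

    def release(self, session_id: str) -> bool:
        # Remove under the lock, but quit outside it so a slow shutdown
        # does not block other requests.
        with self._lock:
            session = self._sessions.pop(session_id, None)
        if session is not None:
            session.quit()  # assumes a Selenium-style quit() method
        return session is not None
```

Each request would call `new_session_id()` once, use `get_or_create` for its searches, and call `release` in a `finally` block so the browser is cleaned up even when the request fails.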
Extended timeout and retry logic for Gradio chat and deep research plugins to support long-running operations. Enhanced DeepResearcher prompts for more explicit gap analysis and research needs. Improved browser session recovery in web search plugin to handle invalidated sessions and prevent crashes. Updated default iteration and source limits for deep research to balance speed and coverage.
Introduces a set of sample research reports under optillm/plugins/deep_research/sample_reports. These reports cover topics such as TikTok bans, AI agent landscapes, unbanked market access, KKR's tech transactions, and more, providing detailed analyses and references for each subject.
Also fixes #97