Claude/implement paper cli app 016 whhs xv zua6u je ss2 pir xd#1
Open
Complete implementation of arXiv:2507.16075, "Deep Researcher with Test-Time Diffusion" by Han et al. (Google Cloud AI Research).

Features:
- Three-stage research pipeline (Plan, Search & Synthesis, Report)
- Denoising with Retrieval algorithm (Algorithm 1 from paper)
- Component-wise Self-Evolution for quality optimization
- RAG-based answer synthesis
- Configurable LLM providers (OpenAI, Anthropic)
- CLI interface with interactive mode
- Comprehensive output management (reports, drafts, search history)

Components:
- src/ttd_dr_agent.py: Core TTD-DR agent implementation
- src/llm_client.py: LLM client abstraction
- src/search_tool.py: Web search functionality
- src/prompts.py: All prompts for each stage
- src/utils.py: Utility functions
- main.py: CLI application
- config/config.yaml: Configuration file

Documentation:
- README.md: Complete documentation and usage guide
- QUICKSTART.md: Quick start guide
- IMPLEMENTATION.md: Technical implementation details
- example.py: Programmatic usage example
- test_setup.py: Installation verification script

The implementation faithfully follows the paper's methodology:
- Report-level denoising with retrieval (Section 2.3)
- Self-evolution with environmental feedback (Section 2.2)
- Hyperparameters matching Table 4 from the paper
- Three-stage backbone agent (Section 2.1)
Complete TypeScript port of the Test-Time Diffusion Deep Researcher framework from arXiv:2507.16075.

Features:
- Full TypeScript with strict typing and modern ES2020+ features
- Same TTD-DR algorithms as Python version
- Three-stage research pipeline (Plan, Search & Synthesis, Report)
- Denoising with Retrieval (Algorithm 1)
- Component-wise Self-Evolution (Section 2.2)
- RAG-based answer synthesis
- CLI with interactive mode
- Comprehensive output management

Components:
- src/types.ts: TypeScript type definitions
- src/llm-client.ts: OpenAI & Anthropic client abstraction
- src/search-tool.ts: Web search functionality
- src/prompts.ts: All prompts for each stage
- src/ttd-dr-agent.ts: Core TTD-DR agent implementation
- src/utils.ts: Utilities and output management
- src/main.ts: CLI application
- src/index.ts: Main exports
- src/example.ts: Programmatic usage example
- src/test-setup.ts: Installation verification

Configuration:
- package.json: npm dependencies and scripts
- tsconfig.json: TypeScript compiler configuration
- config/config.yaml: TTD-DR algorithm configuration

Documentation:
- README.md: Complete TypeScript-specific documentation

TypeScript advantages:
- Full static type checking
- Better IDE support and autocomplete
- Modern async/await patterns
- Compile-time error detection
- Strong typing for agent config and state
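The "Denoising with Retrieval" loop named above can be read as: start from a draft, repeatedly ask what to search for next, retrieve an answer, and revise the draft with it. The sketch below is one loose reading of that idea, not code from this PR or from the paper. All names (`DenoiseSteps`, `denoiseWithRetrieval`, and the four step functions) are hypothetical, and the steps are synchronous for brevity where a real agent would await LLM and search calls.

```typescript
// Hypothetical step functions; in a real agent each would wrap an LLM or search call.
interface DenoiseSteps {
  draftFromPlan: (query: string, plan: string) => string;
  // Returns null when the draft needs no further searching (assumed exit condition).
  nextSearchQuery: (query: string, plan: string, draft: string) => string | null;
  retrieveAnswer: (searchQuery: string) => string;
  reviseDraft: (draft: string, searchQuery: string, answer: string) => string;
}

interface DenoiseResult {
  draft: string;
  history: Array<{ searchQuery: string; answer: string }>;
}

function denoiseWithRetrieval(
  query: string,
  plan: string,
  steps: DenoiseSteps,
  maxSteps: number
): DenoiseResult {
  // The initial draft plays the role of the "noisy" report to be refined.
  let draft = steps.draftFromPlan(query, plan);
  const history: DenoiseResult["history"] = [];

  for (let step = 0; step < maxSteps; step++) {
    const searchQuery = steps.nextSearchQuery(query, plan, draft);
    if (searchQuery === null) {
      break; // nothing left to look up
    }
    const answer = steps.retrieveAnswer(searchQuery);
    history.push({ searchQuery, answer });
    // Each revision folds the retrieved answer back into the draft.
    draft = steps.reviseDraft(draft, searchQuery, answer);
  }

  return { draft, history };
}
```

Passing the steps in as functions keeps the loop testable with stubs; `maxSteps` bounds the number of search-and-revise rounds regardless of what the step functions return.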
Enhance TTD-DR with multiple professional search providers for better web research capabilities.

New Features:
- SerpAPI integration for production-quality Google/Bing search
- Playwright browser automation for full page content extraction
- HTML to Markdown conversion for clean content
- Flexible search provider architecture
- Enhanced search configuration options

Components Added:
- src/enhanced-search-tool.ts: Multi-provider search implementation
  - SerpAPIProvider: Google/Bing API search
  - PlaywrightProvider: Real browser with content extraction
  - DuckDuckGoProvider: Free fallback option
  - EnhancedSearchTool: Unified interface for all providers
- src/search-interface.ts: Common ISearchTool interface
- SEARCH_PROVIDERS.md: Comprehensive provider guide

Updated Files:
- package.json: Added playwright, serpapi, cheerio, turndown
- config/config.yaml: Search provider configuration options
- src/types.ts: SearchConfig with provider-specific options
- src/main.ts: Use EnhancedSearchTool with config-driven provider
- src/ttd-dr-agent.ts: Accept ISearchTool interface
- src/index.ts: Export new search providers
- .env.example: Added SERPAPI_API_KEY
- README.md: Document search provider options

Search Provider Capabilities:
1. SerpAPI:
   - Professional Google/Bing search results
   - Reliable and fast
   - Requires API key ($50/month for 5k searches)
2. Playwright:
   - Full page content extraction
   - JavaScript rendering
   - HTML to Markdown conversion
   - No API key needed
   - Great for academic papers and documentation
3. DuckDuckGo:
   - Free and simple
   - No API key required
   - Good for development/testing

Usage: Change search.provider in config.yaml to switch between "serpapi", "playwright", or "duckduckgo". No code changes are needed. See SEARCH_PROVIDERS.md for a detailed comparison and setup guide.
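A config-driven provider switch like the one described above is commonly built as a small factory keyed on the provider name. The following is a minimal sketch under stated assumptions, not the code in src/enhanced-search-tool.ts: the `ISearchTool` name and the three provider strings come from the commit message, while the `search` signature, `SearchSettings`, `stubProvider`, and `createSearchTool` are hypothetical, and the providers are stubs that return no results.

```typescript
type ProviderName = "serpapi" | "playwright" | "duckduckgo";

interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

// Assumed shape of the common interface; the real ISearchTool may differ.
interface ISearchTool {
  readonly name: ProviderName;
  search(query: string, maxResults: number): Promise<SearchResult[]>;
}

// Hypothetical subset of the search section of config.yaml.
interface SearchSettings {
  provider: string;
  serpapiApiKey?: string;
}

// Stub standing in for a real provider class; always resolves to no results.
function stubProvider(name: ProviderName): ISearchTool {
  return { name, search: () => Promise.resolve([]) };
}

const providerFactories: Record<ProviderName, (settings: SearchSettings) => ISearchTool> = {
  serpapi: (settings) => {
    // SerpAPI is the only provider listed as requiring an API key.
    if (!settings.serpapiApiKey) {
      throw new Error("serpapi provider requires an API key");
    }
    return stubProvider("serpapi");
  },
  playwright: () => stubProvider("playwright"),
  duckduckgo: () => stubProvider("duckduckgo"),
};

function isProviderName(value: string): value is ProviderName {
  return Object.prototype.hasOwnProperty.call(providerFactories, value);
}

function createSearchTool(settings: SearchSettings): ISearchTool {
  const name = settings.provider;
  if (!isProviderName(name)) {
    throw new Error(`Unknown search provider: ${name}`);
  }
  return providerFactories[name](settings);
}
```

With this shape, the agent only depends on `ISearchTool`, so adding a provider means adding one entry to the factory map rather than touching call sites.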
DuckDuckGo's free API is unreliable and often returns HTML instead of JSON, causing parse errors. This fix makes the provider robust and provides clear guidance to users.

Fixes:
- Enhanced error handling for DuckDuckGo provider
- Check content-type header before parsing
- Validate response is JSON before parsing
- Graceful fallback to mock results with helpful messages
- Added User-Agent header to improve API success rate
- Added query parameters (no_html, skip_disambig) for better results

Changes:
- src/enhanced-search-tool.ts:
  - DuckDuckGoProvider now handles HTML responses gracefully
  - Added getFallbackResults() with user guidance
  - Clear warning messages suggesting better alternatives
  - Won't crash the app when DuckDuckGo fails
- config/config.yaml:
  - Changed default provider from "serpapi" to "playwright"
  - Playwright is free, reliable, and works without API keys
  - Added comments warning about DuckDuckGo reliability
- TROUBLESHOOTING.md:
  - Comprehensive troubleshooting guide
  - DuckDuckGo HTML error documented with solutions
  - Quick fixes for all common issues
  - Provider recommendations by use case
  - Performance optimization tips

User Experience:
Instead of crashing with:
  "FetchError: invalid json response body"
Users now see:
  "⚠️ DuckDuckGo API is currently unavailable. For production research, please: 1. Use SerpAPI provider (high-quality results) 2. Use Playwright provider (free, full content extraction) Change provider in config/config.yaml"

Recommendation:
- Development/Testing: Use Playwright (free, reliable)
- Production: Use SerpAPI (paid, best quality)
- DuckDuckGo: Only for basic testing (unreliable)
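The defensive parsing described above (inspect the content type, validate the body as JSON, fall back instead of throwing) can be sketched as a pure function over an already-fetched response. This is an illustration under assumptions, not the PR's implementation: `getFallbackResults` is named in the commit message but its body here is invented, `parseSearchBody` and `ParsedSearch` are hypothetical, and the payload is assumed to carry a `RelatedTopics` array of `{ Text, FirstURL }` objects, which should be checked against the real API response.

```typescript
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

interface ParsedSearch {
  results: SearchResult[];
  usedFallback: boolean;
}

// Placeholder result that tells the user how to switch providers.
function getFallbackResults(query: string): SearchResult[] {
  return [
    {
      title: `No live results for "${query}"`,
      url: "",
      snippet:
        "DuckDuckGo API is currently unavailable. " +
        'Set search.provider in config/config.yaml to "serpapi" or "playwright".',
    },
  ];
}

function parseSearchBody(contentType: string | null, body: string, query: string): ParsedSearch {
  const fallback = (): ParsedSearch => ({
    results: getFallbackResults(query),
    usedFallback: true,
  });

  // Reject responses that are declared as, or look like, HTML.
  const type = (contentType || "").toLowerCase();
  if (type.includes("text/html") || body.trim().startsWith("<")) {
    return fallback();
  }

  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return fallback(); // body was not valid JSON
  }
  if (typeof parsed !== "object" || parsed === null) {
    return fallback();
  }

  // Assumed payload shape: { RelatedTopics: [{ Text, FirstURL }, ...] }.
  const topics = (parsed as { RelatedTopics?: unknown }).RelatedTopics;
  if (!Array.isArray(topics)) {
    return fallback();
  }

  const results: SearchResult[] = [];
  for (const topic of topics) {
    if (typeof topic !== "object" || topic === null) continue;
    const { Text, FirstURL } = topic as { Text?: unknown; FirstURL?: unknown };
    if (typeof Text === "string" && typeof FirstURL === "string") {
      results.push({ title: Text, url: FirstURL, snippet: Text });
    }
  }
  return { results, usedFallback: false };
}
```

Keeping the parser separate from the network call means the HTML and malformed-JSON cases can be exercised with plain strings, without hitting the live API.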
No description provided.