Merged
- Add robust HTTP client with timeouts (10s connect, 30s total) and redirect handling
- Implement comprehensive error classification (retryable vs permanent failures)
- Add content decoding pipeline with charset detection and UTF-8 normalization
- Support gzip/brotli/deflate compression and 5MB size limits
- Create FetchPageJobHandler for background URL fetching
- Add 13 comprehensive tests covering timeouts, redirects, compression, errors
- Integrate with job runner system and database storage
- Add dependencies: encoding_rs, chardetng, url, bytes, md5, once_cell

Components:
- src/fetcher/ - Core HTTP fetching module
- src/jobs/handlers/fetch_page.rs - Job handler for background processing
- tests/fetcher_client.rs - Comprehensive test suite
- Updated SQLx offline query cache

Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-2126140c-a3e6-48a0-a6cc-2e924d3c6344
🌐 Network Fetcher for Page Content Implementation
This PR implements a robust HTTP client system for fetching and processing web page content as part of the background job system.
🎯 Features Implemented
✅ Robust HTTP Client
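The two timeouts stated in this PR (10s connect, 30s total) interact: a connect attempt should never be given more time than is left in the total budget. The sketch below shows that arithmetic only. The constant and function names are hypothetical, and applying the budget across redirect hops is an assumption, since the client wiring itself is not shown here.

```rust
use std::time::Duration;

// Values taken from the PR description: 10s connect, 30s total.
const CONNECT_TIMEOUT: Duration = Duration::from_secs(10);
const TOTAL_TIMEOUT: Duration = Duration::from_secs(30);

/// Time left in the total budget after `elapsed`, or `None` once it is used up.
/// (Hypothetical helper, not part of the PR.)
fn remaining_budget(elapsed: Duration) -> Option<Duration> {
    TOTAL_TIMEOUT
        .checked_sub(elapsed)
        .filter(|left| !left.is_zero())
}

/// Timeout to use for the next connect attempt: the connect timeout,
/// capped by whatever remains of the total budget.
fn next_connect_timeout(elapsed: Duration) -> Option<Duration> {
    remaining_budget(elapsed).map(|left| left.min(CONNECT_TIMEOUT))
}

fn main() {
    for secs in [0u64, 25, 30] {
        let elapsed = Duration::from_secs(secs);
        println!("after {:?}: {:?}", elapsed, next_connect_timeout(elapsed));
    }
}
```

Returning `Option` rather than a zero duration lets the caller treat an exhausted budget as a timeout error instead of issuing a request that cannot succeed.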
✅ Smart Content Processing
- Charset detection from HTML meta tags (<meta charset>, <meta http-equiv>)
✅ Error Classification System
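A minimal sketch of how a retryable vs permanent split might look. The enum names, variants, and the exact set of retryable status codes (408, 429, 5xx) are assumptions for illustration; the PR's actual types and rules are not shown on this page.

```rust
/// Hypothetical failure categories; the PR's real error type is not shown.
#[derive(Debug, PartialEq)]
enum FetchError {
    Timeout,
    ConnectionFailed,
    HttpStatus(u16),
    BodyTooLarge,
    InvalidUrl,
}

/// Whether the job runner should try the fetch again later.
#[derive(Debug, PartialEq)]
enum Disposition {
    Retryable,
    Permanent,
}

/// Classify a failure. Transient network problems and overload-style
/// status codes are treated as retryable (an assumption); everything
/// else is treated as permanent.
fn classify(err: &FetchError) -> Disposition {
    match err {
        FetchError::Timeout | FetchError::ConnectionFailed => Disposition::Retryable,
        FetchError::HttpStatus(code) => match *code {
            408 | 429 | 500..=599 => Disposition::Retryable,
            _ => Disposition::Permanent,
        },
        FetchError::BodyTooLarge | FetchError::InvalidUrl => Disposition::Permanent,
    }
}

fn main() {
    let samples = [
        FetchError::Timeout,
        FetchError::HttpStatus(404),
        FetchError::HttpStatus(503),
    ];
    for err in samples.iter() {
        println!("{:?} -> {:?}", err, classify(err));
    }
}
```

Keeping the decision in one function means the job handler only needs to branch on `Disposition` when deciding whether to reschedule a job.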
✅ Job Runner Integration
- FetchPageJobHandler for background URL processing
- Storage in the contents table with metadata
🏗️ Architecture
🔧 Technical Details
Dependencies Added:
- encoding_rs - Character encoding conversion
- chardetng - Heuristic charset detection
- url, bytes, md5 - Content processing utilities
- once_cell - Singleton HTTP client
- wiremock - Testing infrastructure
Database Integration:
- Results stored in the contents table
- Status transition from pending to fetched
- Row locking with FOR UPDATE locks
🧪 Testing Coverage
9 Integration Tests (using wiremock):
4 Unit Tests:
🚀 Usage
Direct API:
Background Jobs:
Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-2126140c-a3e6-48a0-a6cc-2e924d3c6344