x-reader/CLAUDE.md at main · mutjan/x-reader

Project

科技新闻选题聚合系统

一个面向内部编辑的科技新闻选题聚合系统，自动从RSS源获取科技新闻，通过AI评估筛选高价值选题，按领域和热度分类展示，帮助编辑团队快速发现优质报道线索。

Core Value: 自动发现高价值科技选题，降低编辑信息搜集成本，提升选题效率和质量。

Technology Stack

Languages

Python 3.x (3.7+) - Used for all backend logic, processing pipelines, and web interface
HTML/JavaScript - Used for frontend index.html

Runtime

Python 3.x
pip
Lockfile: Missing (only requirements.txt present)

Frameworks

Flask >=3.0.0 - Web administration interface
Not detected
Not detected (uses standard Python tooling)

Key Dependencies

requests >=2.31.0 - Network requests for fetching RSS/API content
beautifulsoup4 >=4.12.0 - HTML content parsing and extraction
pydantic >=2.5.0 + pydantic-settings >=2.1.0 - Configuration validation and management
python-dateutil >=2.8.2 - Date and time processing
tenacity >=8.2.3 - Retry logic for network operations
Git CLI - Used for GitHub Pages publishing

Configuration

Configured via pydantic-settings with environment variable override support
Key environment variables required: GITHUB_TOKEN, INOREADER_CLIENT_ID, INOREADER_CLIENT_SECRET
Main config file: src/config/settings.py
No build process (Python interpreted, static HTML frontend)

Platform Requirements

Python 3.7+
pip package manager
Git CLI
Processing scripts: Any Python 3 supported environment
Frontend: GitHub Pages hosting

Conventions

Naming Patterns

Lowercase with underscores (snake_case) pattern
Examples: ai_processor.py, news.py, github_pages.py
Lowercase with underscores (snake_case) pattern
Examples: get_unique_id(), build_prompt(), to_frontend_dict()
Lowercase with underscores (snake_case) pattern
Examples: normalized_url, content_hash, entity_mappings
PascalCase (upper camel case) pattern
Examples: RawNewsItem, ProcessedNewsItem, BaseAIProcessor

Code Style

Not detected (no prettier/black config found)
Observed 4-space indentation, consistent line breaks
Not detected (no flake8/pylint/ruff config found)

Import Organization

Example from src/processors/ai_processor.py:
No aliases used, full relative paths from project root

Error Handling

Limited explicit error handling observed
Uses return values and default fallbacks where appropriate

Logging

Logger initialized at module level with module name as identifier
Example: logger = setup_logger("ai_processor")

Comments

Module-level docstring at top of every file describing purpose
Class docstrings describing class responsibility
Method docstrings describing purpose, parameters, return values
Inline comments for non-obvious business logic
Not applicable (Python project, uses reST-style docstrings in triple quotes)

Function Design

Module Design

Architecture

Pattern Overview

Linear processing pipeline with distinct stages
Factory pattern for fetchers and publishers to support multiple data sources and output targets
Two-step AI processing workflow (prompt generation + manual result import) to avoid LLM API costs and provide full control
Abstract base classes for all core components to enable easy extension

Layers

Purpose: Retrieve raw news content from external sources
Location: src/fetchers/
Contains: Base fetcher interface, implementation for each data source, factory to instantiate fetchers
Depends on: External APIs (Twitter, Inoreader), src/models/news.py data models
Used by: Main pipeline orchestration
Purpose: Transform and enhance raw news data
Location: src/processors/
Contains: Duplicate remover, content filter, AI processor, zeitgeist analyzer
Depends on: src/models/news.py, utility functions
Used by: Main pipeline orchestration
Purpose: Define standardized data structures for raw and processed news
Location: src/models/
Contains: RawNewsItem, ProcessedNewsItem data classes and serialization logic
Depends on: No internal dependencies
Used by: All layers
Purpose: Export processed news to output targets
Location: src/publishers/
Contains: Base publisher interface, implementation for each output target, factory to instantiate publishers
Depends on: src/models/news.py, external services (GitHub Pages)
Used by: Main pipeline orchestration
Purpose: Provide user interface for interacting with the system
Location: src/web/
Contains: Flask web application, templates
Depends on: Processors, Models layers
Used by: End users via browser

Data Flow

Stateless pipeline processing with all state persisted to disk
Processed item IDs are stored to avoid reprocessing across runs
Temporary files are used for AI prompt/result exchange between system and user

Key Abstractions

Purpose: Standard interface for all data source fetchers
Examples: src/fetchers/base.py, src/fetchers/twitter_fetcher.py, src/fetchers/inoreader_fetcher.py
Pattern: Abstract base class with test_connection() and fetch() methods
Purpose: Standard interface for all output publishers
Examples: src/publishers/base.py, src/publishers/github_pages.py
Pattern: Abstract base class with publish() method
Purpose: Standard interface for AI processing implementations
Examples: src/processors/ai_processor.py
Pattern: Abstract base class with process_batch() method, supporting both manual and automated LLM workflows

Entry Points

Location: main.py
Triggers: Invoked from command line with parameters
Responsibilities: Orchestrate the entire processing pipeline, handle command line arguments, coordinate all components
Location: src/web/app.py
Triggers: Invoked via web server, accessed by users through browser
Responsibilities: Provide user interface for viewing news, managing processing workflow

Error Handling

Each stage validates input and returns early with clear logging if no items remain
Comprehensive logging at each step for debugging
JSON validation before publishing to ensure output consistency

Cross-Cutting Concerns

GSD Workflow Enforcement

Before using Edit, Write, or other file-changing tools, start work through a GSD command so planning artifacts and execution context stay in sync.

Use these entry points:

/gsd:quick for small fixes, doc updates, and ad-hoc tasks
/gsd:debug for investigation and bug fixing
/gsd:execute-phase for planned phase work

Do not make direct repo edits outside a GSD workflow unless the user explicitly asks to bypass it.

Developer Profile

Profile not yet configured. Run /gsd:profile-user to generate your developer profile. This section is managed by generate-claude-profile -- do not edit manually.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Project

Technology Stack

Languages

Runtime

Frameworks

Key Dependencies

Configuration

Platform Requirements

Conventions

Naming Patterns

Code Style

Import Organization

Error Handling

Logging

Comments

Function Design

Module Design

Architecture

Pattern Overview

Layers

Data Flow

Key Abstractions

Entry Points

Error Handling

Cross-Cutting Concerns

GSD Workflow Enforcement

Developer Profile

FilesExpand file tree

CLAUDE.md

Latest commit

History

CLAUDE.md

File metadata and controls

Project

Technology Stack

Languages

Runtime

Frameworks

Key Dependencies

Configuration

Platform Requirements

Conventions

Naming Patterns

Code Style

Import Organization

Error Handling

Logging

Comments

Function Design

Module Design

Architecture

Pattern Overview

Layers

Data Flow

Key Abstractions

Entry Points

Error Handling

Cross-Cutting Concerns

GSD Workflow Enforcement

Developer Profile