A photo organization tool that sorts images by date into structured directories and handles HEIC-to-JPEG conversion. Designed for human readability, maintainability, and professional code quality.
- Standard Library First: Use Python's built-in modules (
logging,pathlib,dataclasses) instead of custom implementations - Type Safety: Full type hints on all functions for IDE support and catching errors early
- Single Responsibility: Each function does one thing well
- No Magic Values: All constants centralized in
config.py - Proper Logging: Use logging levels, not print statements
- Explicit Over Implicit: Clear function names, no hidden side effects
PythonPhotobook/
├── photobook.py # CLI entry point
├── modules/
│ ├── config.py # Centralized configuration
│ ├── logger.py # Logging setup
│ ├── directories.py # Path validation
│ ├── convert.py # HEIC → JPEG conversion
│ ├── sorting.py # File organization logic
│ └── integrity.py # Post-processing validation
├── requirements.txt # Python dependencies
└── README.md # User documentation
- Purpose: CLI argument parsing and workflow orchestration
- Key Functions:
main() - Dependencies: All modules
- Exit Codes:
0: Success1: Error (file not found, permissions, etc.)130: User cancelled (Ctrl+C)
- Purpose: Single source of truth for all constants
- Pattern: Frozen dataclass (immutable)
- Key Constants:
IMAGE_EXTENSIONS: Supported file typesIMAGES_DIR,UNSORTED_DIR: Output folder names- Date format strings
- JPEG quality setting
Why frozen dataclass?
- Prevents accidental modification
- IDE autocomplete support
- Clear structure vs scattered constants
- Purpose: Standard Python logging configuration
- Key Function:
setup_logger(name, log_file=None, level=INFO) - Features:
- Console output with timestamps
- Optional file logging
- Consistent format across all modules
Pattern: Each module creates its own logger:
logger = logging.getLogger(__name__)- Purpose: Validate source/destination before processing
- Key Function:
validate_paths(source_path, destination_path) - Checks:
- Source exists and is a directory
- Source is readable
- Destination exists (creates if missing)
- Destination is writable
Why separate module?
- Early validation prevents partial operations
- Reusable across different entry points
- Clear error messages with exception chaining
- Purpose: Convert Apple HEIC images to JPEG
- Key Function:
convert_heic_to_jpeg(src_path, delete_original=True) - Features:
- Preserves original modification time
- Configurable quality (from config)
- Optional original file deletion
- Skips if JPEG already exists
Important: delete_original parameter makes side effect explicit
- Purpose: Main file organization workflow
- Architecture: Small, focused functions instead of one large function
Date Extraction:
get_earliest_date(file_path): Get file's oldest timestampextract_date_from_filename(file_name): ParseYYYYMMDD_*patternupdate_file_date(file_path, date): Update file modification timeget_file_date(file_path, allow_update): Orchestrates date logic
Path Building:
build_destination_path(...): Construct output path with date foldersshould_convert_file(...): Check if HEIC conversion neededget_destination_for_file(...): Route images vs unsorted files
Processing:
collect_files_to_process(...): Count files for progress barprocess_file(...): Convert (if needed) and copy single fileorganize_files(...): Main orchestration function
Legacy Compatibility:
image_sorting(...): Wrapper maintaining old function signature
Before (AI-generated pattern):
def image_sorting(...): # 156 lines
# Walk directory tree to count files
for root, _, files in os.walk(source):
# ... 50 lines ...
# Walk directory tree AGAIN to process files
for root, _, files in os.walk(source):
# ... 100 lines of nested logic ...After (human-written pattern):
def organize_files(...):
total = collect_files_to_process(...) # One walk
for each file:
dest = get_destination_for_file(...)
process_file(src, dest, ...)Benefits:
- Single directory traversal
- Each function testable in isolation
- Clear data flow
- Easy to debug specific steps
- Purpose: Verify organization completed successfully
- Key Functions:
count_files_with_extensions(...): Count matching filescheck_file_counts(...): Verify source == destination countcheck_file_formats(...): Find unexpected file typesintegrity_check(...): Run all checks
Usage: Run after organization with -i flag
1. photobook.py: Parse arguments
2. logger.py: Setup logging
3. directories.py: Validate paths
4. sorting.py: organize_files()
├─ collect_files_to_process() → count
├─ For each file:
│ ├─ get_destination_for_file()
│ │ └─ build_destination_path()
│ │ └─ get_file_date()
│ └─ process_file()
│ └─ convert_heic_to_jpeg() [if needed]
└─ Copy to destination
5. Log success
1. photobook.py: Parse arguments with -i
2. logger.py: Setup logging
3. directories.py: Validate paths
4. integrity.py: integrity_check()
├─ check_file_counts()
│ └─ count_files_with_extensions()
├─ If mismatch: EXIT
└─ check_file_formats()
destination/
├── Images/
│ ├── 202501_January/
│ │ ├── 2025-01-15/
│ │ │ ├── photo1.jpg
│ │ │ └── photo2.png
│ │ └── 2025-01-20/
│ │ └── photo3.jpg
│ └── 202412_December/
│ └── 2024-12-25/
│ └── holiday.jpg
└── Unsorted_Files/
├── PDF/
│ └── document.pdf
└── TXT/
└── notes.txt
def build_destination_path(
file_path: Path,
dest_dir: Path,
allow_update: bool = True,
dest_basename: Optional[str] = None
) -> Optional[Path]:def process_file(...) -> bool:
if dest_path.exists():
return False # Early return
# Main logic continues
...
return True# Bad (AI-generated)
dest_path = os.path.join(dir, "Images", year, file)
# Good (human-written)
dest_path = dest_dir / "Images" / year / file_path.name# Bad (AI-generated)
except Exception as e:
print(f"Error: {e}")
# Good (human-written)
except FileNotFoundError:
logger.error(f"File not found: {path}")
return None
except OSError as e:
logger.error(f"File operation failed: {e}")
return Nonelogger.debug("File already exists") # Verbose detail
logger.info("Processing 100 files") # Normal operation
logger.warning("Conversion failed") # Recoverable issue
logger.error("Path not found") # Error condition
logger.exception("Unexpected error") # Error with traceback# tests/test_sorting.py
def test_extract_date_from_filename():
assert extract_date_from_filename("20250120_photo.jpg") == datetime(2025, 1, 20)
assert extract_date_from_filename("photo.jpg") is None
def test_build_destination_path(tmp_path):
result = build_destination_path(
Path("20250120_test.jpg"),
tmp_path
)
assert "202501_January" in str(result)
assert "2025-01-20" in str(result)- Create temp directory with test files
- Run organization
- Verify output structure
- Check file counts
- Validate no data loss
Before: 56-line Tee class reinventing logging module
After: 2-line logger setup using standard library
Before: 156-line image_sorting() doing everything
After: 10 focused functions with clear responsibilities
Before: "Images", "%Y%m_%B" scattered throughout
After: Centralized in config.py
Before: Walking directory tree twice After: Single traversal with count caching
Before: print(f"Error: {e}")
After: logger.error(f"Error: {e}")
Before: Lines 67-71 in integrity.py were debug code After: Removed entirely
Before: except Exception as e:
After: except (FileNotFoundError, OSError) as e:
- Single Directory Walk: Process files in one pass
- Lazy Conversion: HEIC converted only when needed
- Early Skip: Don't process files that already exist
- Progress Bar: Uses
tqdmfor user feedback
- Parallel Processing: Use
concurrent.futuresfor large directories - Database Tracking: SQLite for operation history
- Undo Functionality: Keep operation log for reversal
- Smart Duplicates: Hash-based duplicate detection
- Config File: YAML/TOML for user customization
- Exif Date: Use EXIF metadata for more accurate dates
- Unit Tests: Full test coverage with pytest
- Type Checking: Run mypy in CI
logger = setup_logger("photobook", log_file, level=logging.DEBUG)python3 photobook.py -s source -d dest --dry-runimport logging
logging.basicConfig(level=logging.DEBUG)
from modules.sorting import extract_date_from_filename
print(extract_date_from_filename("20250120_test.jpg"))- Line Length: ≤ 88 characters (Black default)
- Imports: Standard library → Third party → Local modules
- Docstrings: Brief description, parameters, return value
- Naming:
- Functions:
snake_case - Classes:
PascalCase - Constants:
UPPER_CASE - Private:
_leading_underscore
- Functions:
- pillow_heif: HEIC format support
- Pillow: Image processing
- tqdm: Progress bars
All other functionality uses standard library.
Last Updated: 2025-01-20 Refactored From: AI-generated code to human-written patterns Maintained By: Project owner