SmugVision Performance Analysis

Summary

Based on the log output analysis, here's where time is being spent during image processing:

Timing Breakdown (from logs)

Phase	Duration	Share of ~153s total	Assessment
GPS Reverse Geocoding #1	~47s	31%	⚠️ MAJOR BOTTLENECK
User venue selection (interactive)	~28s	18%	Expected (user input)
Face detection	~6s	4%	Reasonable
GPS Reverse Geocoding #2	~47s	31%	⚠️ MAJOR BOTTLENECK
Caption generation (Llama)	~18s	12%	Expected (LLM inference)
Tags generation (Llama)	~6s	4%	Expected (LLM inference)
Face recognizer init	~1s	<1%	Reasonable
Model initialization	<1s	<1%	Reasonable

Total Processing Time: ~153s (about 2.5 minutes)
Actual Processing (excluding user input): ~125s
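As a quick sanity check on the totals, the sketch below sums the approximate phase durations from the table and recomputes each phase's share against the ~153s total. The durations are the rounded values from the logs, and the sub-second model initialization is treated as 0s, which is an assumption made only for this arithmetic.

```python
# Approximate phase durations in seconds, copied from the timing table.
phases = {
    "GPS reverse geocoding #1": 47,
    "User venue selection": 28,
    "Face detection": 6,
    "GPS reverse geocoding #2": 47,
    "Caption generation": 18,
    "Tags generation": 6,
    "Face recognizer init": 1,
    "Model initialization": 0,  # logged as <1s; treated as 0 here (assumption)
}

total = sum(phases.values())
non_interactive = total - phases["User venue selection"]
geocoding = phases["GPS reverse geocoding #1"] + phases["GPS reverse geocoding #2"]

# Per-phase share of the full run, including the interactive wait.
for name, seconds in phases.items():
    print(f"{name:<28} {seconds:>3}s  {seconds / total:6.1%}")

print(f"total={total}s  non-interactive={non_interactive}s")
print(f"geocoding share of non-interactive time: {geocoding / non_interactive:.0%}")
```
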

Root Cause Analysis

🔴 Critical Issue: Reverse Geocoding Taking 47 Seconds

Location: smugvision/utils/exif.py, lines 390-426

Problem: The reverse_geocode() function has a catastrophically inefficient implementation:

It iterates through ~40 different venue types (restaurant, cafe, theater, school, etc.)
For each venue type, it makes a separate API call to the Nominatim geocoding service
Each API call has a 5-second timeout
In the worst case every call runs to its timeout: ~40 calls × 5 seconds = ~200 seconds. Even if only half the venue types are tried, that is 20+ API calls × 5 seconds = 100+ seconds

Code snippet causing the issue:

# Lines 373-385: Comprehensive venue type list (~40 types)
all_venue_types = [
    'restaurant', 'cafe', 'coffee', 'bar', 'pub', 'brewery',
    'theater', 'theatre', 'cinema', 'venue', 'hall', 'auditorium',
    'museum', 'gallery', 'library',
    # ... 40+ types total
]

# Lines 390-426: Loop making API call for EACH type
for search_term in all_venue_types:
    query = f"{search_term} near {latitude},{longitude}"
    search_results = geolocator.geocode(
        query,
        exactly_one=False,
        limit=5,
        timeout=5  # 5 seconds per venue type!
    )

Why This Happens Twice

First call: in test_vision.py, lines 78-86, invoked with interactive=True so the user can select a venue
Second call: inside process_image, during vision model processing, for the same coordinates

Recommendations

Immediate Fix (High Priority)

Option 1: Use Nominatim's nearby search properly. Instead of searching for each venue type individually, use a single reverse() call with better parameters, or use the Overpass API for nearby POI search.

Option 2: Cache results. The function is called twice with the same coordinates, so cache the result from the first call and reuse it.

Option 3: Reduce the timeout. 5 seconds per venue type is excessive; reduce it to 2 seconds. This only lowers the worst case and does not reduce the number of calls.

Option 4: Limit venue types. Don't search all ~40 venue types. Search only the most common ones (top 5-10).

Option 5: Use concurrent requests. If multiple searches are still needed, use ThreadPoolExecutor to parallelize the API calls.
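For Option 5, a minimal sketch of the parallel pattern is shown below. `search_venue_type` is a hypothetical placeholder that sleeps instead of calling `geolocator.geocode()`, so the example runs without network access. Note that the public Nominatim service limits clients to at most one request per second, so this approach is realistic only against a self-hosted instance or a provider that permits concurrent requests.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def search_venue_type(search_term: str) -> list:
    """Hypothetical stand-in for one geocoding request."""
    time.sleep(0.1)  # simulated network latency
    return [f"{search_term} result"]


venue_types = ["restaurant", "cafe", "museum", "theater", "library"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as pool:
    # map() preserves input order, so results line up with venue_types.
    results = list(pool.map(search_venue_type, venue_types))
elapsed = time.perf_counter() - start

# Run sequentially, five 0.1s calls would take about 0.5s.
print(f"{len(results)} searches finished in {elapsed:.2f}s")
```
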

Proposed Optimized Implementation

Replace the sequential venue search with:

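Single reverse geocode call (already done at line 352)
If no building name is found, make a single Overpass API query for all POI types within a radius
Or cover nearby POIs with a single Nominatim request (note that Nominatim's lookup endpoint takes OSM IDs rather than coordinates, so confirm which endpoint supports this before relying on it)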
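The single-query idea can be sketched as one Overpass QL request that matches several POI tags within a radius. Everything named here is an assumption for illustration: the tag lists cover only a handful of the venue terms, the 100 m radius is arbitrary, and `https://overpass-api.de/api/interpreter` is one public Overpass instance among several. `fetch_pois` is defined but not called, so the example runs without network access.

```python
import json
import urllib.parse
import urllib.request

OVERPASS_URL = "https://overpass-api.de/api/interpreter"  # assumed public instance

# Assumed mapping from some venue terms to OpenStreetMap tag values.
AMENITY_VALUES = ["restaurant", "cafe", "bar", "pub", "theatre", "cinema", "library"]
TOURISM_VALUES = ["museum", "gallery"]


def build_poi_query(latitude, longitude, radius_m=100, timeout_s=10):
    """Build one Overpass QL query covering all listed POI types."""
    amenity = "|".join(AMENITY_VALUES)
    tourism = "|".join(TOURISM_VALUES)
    around = f"(around:{radius_m},{latitude},{longitude})"
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        "(\n"
        f'  nwr["amenity"~"^({amenity})$"]{around};\n'
        f'  nwr["tourism"~"^({tourism})$"]{around};\n'
        ");\n"
        "out center;"
    )


def fetch_pois(latitude, longitude, radius_m=100):
    """Send the query in a single HTTP request (not invoked in this sketch)."""
    body = urllib.parse.urlencode(
        {"data": build_poi_query(latitude, longitude, radius_m)}
    ).encode()
    with urllib.request.urlopen(OVERPASS_URL, data=body, timeout=15) as response:
        return json.load(response).get("elements", [])


query = build_poi_query(40.758, -73.9855)
print(query)
```

The design point is that the number of HTTP requests no longer grows with the number of venue types: adding a type extends the regular expression instead of adding a call.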

Expected Performance After Fix

GPS reverse geocoding: 47s → 2-5s (90-95% reduction)
Total processing time: ~125s → ~35s (excluding user input)
Interactive mode: 153s → ~63s (including user input)
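The projected figures follow from the measured phases if the two 47s lookups are replaced by a single lookup of 2-5s. The sketch below reproduces that arithmetic; treating the post-fix cost as one lookup (that is, with caching in place) is an assumption.

```python
user_input_s = 28
other_s = 6 + 18 + 6 + 1  # face detection, caption, tags, recognizer init
baseline_non_interactive_s = other_s + 2 * 47  # two uncached lookups

projections = []
for geocoding_after_s in (2, 5):  # projected range for one lookup
    non_interactive_s = other_s + geocoding_after_s
    interactive_s = non_interactive_s + user_input_s
    projections.append((non_interactive_s, interactive_s))
    speedup = baseline_non_interactive_s / non_interactive_s
    print(
        f"geocoding {geocoding_after_s}s: non-interactive {non_interactive_s}s, "
        f"interactive {interactive_s}s, speedup {speedup:.1f}x"
    )
```
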

Monitoring

Run the updated test_vision.py script, which now includes detailed timing breakdowns:

./test_vision.py <image_path>

The script will output a timing breakdown showing exactly where time is spent in each phase:

⏱️  TIMING BREAKDOWN
============================================================
2. EXIF Location Extraction................... 47.23s (35.2%)
4. Total Image Processing..................... 53.45s (39.8%)
3. Face Recognizer Initialization..............  0.54s ( 0.4%)
1. Model Initialization........................  0.25s ( 0.2%)
------------------------------------------------------------
TOTAL.......................................... 134.2s
============================================================
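For reference, a breakdown of this kind can be collected with a small context manager around each phase. This is a minimal sketch of the general technique, not the code in test_vision.py; the names `timed` and `format_breakdown` are hypothetical, and the sleeps stand in for real work.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(phase):
    """Accumulate wall-clock time for a named phase, even if the body raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[phase] = timings.get(phase, 0.0) + time.perf_counter() - start


def format_breakdown(phase_timings):
    """Render phases sorted by duration, longest first, with a total line."""
    total = sum(phase_timings.values())
    lines = []
    ordered = sorted(phase_timings.items(), key=lambda item: item[1], reverse=True)
    for phase, seconds in ordered:
        share = seconds / total if total else 0.0
        lines.append(f"{phase:.<45} {seconds:6.2f}s ({share:5.1%})")
    lines.append(f"{'TOTAL':.<45} {total:6.2f}s")
    return "\n".join(lines)


with timed("Model Initialization"):
    time.sleep(0.01)  # placeholder for real work
with timed("EXIF Location Extraction"):
    time.sleep(0.03)  # placeholder for real work

print(format_breakdown(timings))
```
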

Additional Notes

Llama vision model inference (18s caption + 6s tags) is reasonable for local inference
Face detection (6s) is acceptable
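The 94 seconds spent on GPS geocoding (2 × 47s) represents about 75% of the ~125s of non-interactive time
Fixing the reverse geocoding should make non-interactive processing roughly 3.5× faster (~125s → ~35s) and interactive runs roughly 2.4× faster (~153s → ~63s)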
