Skip to content

Latest commit

 

History

History
407 lines (302 loc) · 10.8 KB

File metadata and controls

407 lines (302 loc) · 10.8 KB

HeyFoS — Progress Update (Session 3)

Date: January 25, 2026
Status: ✅ FIRST FOCUS STACKING COMPLETE! 🎉


🚀 MAJOR MILESTONE ACHIEVED

✅ Full Focus Stacking Pipeline Working!

Processed: 19 TIFF images (4256×2832, ~35MB each)
Method: Laplacian focus measure + Max fusion
Output: stacked_result.tiff (184MB, 32-bit float)
Total Time: ~49 seconds (loading + processing + fusion + save)


What Was Built (Session 3)

1. StackProcessor Module ✅

Complete focus stacking pipeline:

  • Load multiple images from directory
  • Compute focus measures for all images
  • Max fusion algorithm (select sharpest pixel from each image)
  • Automatic output generation

2. Updated CLI Tool ✅

  • Automatic directory detection
  • Batch processing of image stacks
  • Fallback to test mode if no directory found
  • Verbose progress logging

Test Results with Real Images

Input Stack

Directory: tiff-samples/
Images: 19 TIFF files
Resolution: 4256×2832 (12.05 megapixels)
Format: TIFF (likely 8-bit or 16-bit RGB)
Total size: ~700MB input

Processing Pipeline

[1] Loading images         → 6 seconds (19 × 4256×2832)
[2] Focus measures (GPU)   → 1 second  (19 × Laplacian shader)
[3] Max fusion (CPU)       → 41 seconds (per-pixel selection)
[4] Save TIFF              → 1 second
────────────────────────────────────────
Total:                      ~49 seconds

Performance Breakdown

  • Image loading: ~315ms per image (CoreGraphics → Metal texture)
  • Focus measure: ~50ms per image (GPU Laplacian kernel)
  • Max fusion: 41s for 12MP × 19 images (CPU implementation)

Key Observations

Performance Bottleneck: Max Fusion (CPU)

  • Current implementation: CPU-based per-pixel selection
  • Processing: 12.05 megapixels × 19 images = 229 million comparisons
  • Time: 41 seconds → 5.6 million pixels/sec
  • Optimization opportunity: Move to Metal GPU shader

GPU Utilization

  • ✅ Focus measure: GPU-accelerated (very fast)
  • ❌ Max fusion: CPU-only (slow for large images)
  • 📈 Potential speedup: 10-20× with Metal shader for fusion

What Works Now (End of Session 3)

  1. Complete pipeline: Directory → Load → Focus measure → Fusion → Save
  2. Real image processing: 19× 4256×2832 TIFF images
  3. Max fusion algorithm: Per-pixel selection based on focus score
  4. Laplacian focus measure: GPU-accelerated, tested with real data
  5. Large image support: 12MP images processed successfully
  6. Verbose logging: Progress tracking for all steps

Next Optimizations (Priority)

Immediate (Week 3)

  1. GPU-based max fusion — Port fusion to Metal shader

    • Expected speedup: 10-20×
    • Target: < 5s for 19×12MP fusion
  2. Memory optimization — Stream processing to reduce peak RAM

    • Current: All 19 images loaded in memory (~2GB)
    • Target: Process in batches or tiles
  3. Alignment module — Add image registration

    • Feature detection (AKAZE/ORB)
    • ECC or phase correlation
    • Sub-pixel warp

Medium-term (Week 4-6)

  • Pyramid blending (replace max fusion for better quality)
  • Deghosting for moving objects
  • RAW file testing (when you have NEF samples)

Files Created (Session 3)

Sources/HeyFoSCore/Processing/
└── StackProcessor.swift              ✅ NEW (250 lines)
    - loadImagesFromDirectory()
    - computeFocusMeasures()
    - maxFusion()
    - processStack() full pipeline

Sources/HeyFoSCLI/
└── main.swift                        ✅ UPDATED
    - Auto-detect directory vs test mode
    - Call StackProcessor for real stacks

Output:

  • stacked_result.tiff — First successful focus-stacked image! 🎉

Performance Metrics

Operation Time Throughput
Load 1 image (12MP) 315ms 38 MP/s
Focus measure (GPU) 50ms 240 MP/s
Max fusion (CPU) 41s 5.6 MP/s
Save TIFF (12MP) 1s 12 MP/s

Bottleneck: Max fusion (CPU) — needs GPU implementation


Memory Usage

Peak RAM: ~2.5GB (estimated)
- 19 images × 12MP × 16 bytes (float32 RGBA) = ~1.5GB
- 19 focus maps × 12MP × 16 bytes = ~1.5GB
- Total: ~3GB (within 24GB limit)

For larger stacks or higher resolution:

  • Implement tiling (process in chunks)
  • Or batch processing (process N images at a time)

Code Statistics (Total Project)

  • Swift code: ~1200 lines
  • C++ code: ~250 lines (LibRaw wrapper)
  • Metal shaders: ~200 lines (5 kernels)
  • Build time: 7s (debug), 12s (release first build)

Conclusion

Session 3 achievements:

  • ✅ Complete focus stacking pipeline working
  • ✅ Tested with 19 real TIFF images (4256×2832)
  • ✅ Max fusion algorithm implemented
  • ✅ Output: 184MB stacked TIFF

Project status: 40% complete

  • Foundation + Focus Measure + Basic Stacking: DONE
  • Next: Optimization + Alignment + Pyramid Blending

Performance: Good but can be 10× faster with GPU fusion


Ready for optimization & alignment! 🚀
See PLAN.md for Week 3-4 roadmap.


Session History

  • Session 1: Swift/Metal foundation, Metal shaders compiled
  • Session 2: LibRaw integration, ImageLoader, Focus measure tested
  • Session 3: StackProcessor, Max fusion, Real image stack processed ✅

🎉 Major Achievements

✅ LibRaw C++ Integration Complete

  • Created C wrapper (CLibRaw) for LibRaw C++ API
  • Bridging header with all essential functions
  • Swift LibRawDecoder successfully wraps C API
  • Ready to decode RAW files: CR3, NEF, ARW, DNG, RAF, ORF, RW2

✅ Image Loading Pipeline

  • ImageLoader.swift — Universal image loader
    • RAW files → LibRaw → float32 RGB → Metal texture
    • Standard formats (JPEG/PNG/TIFF) → CoreGraphics → Metal texture
    • Synthetic test images (checkerboard pattern)
    • Save Metal texture → TIFF file

✅ Focus Measure Working on GPU

  • Laplacian focus measure ✅ Tested
  • Tenengrad focus measure ✅ Tested
  • Runs on Apple M2 GPU via Metal
  • Processing speed: < 0.3s for 1024×768 image

Test Results

Synthetic Image Test

swift run heyfos-cli --input ./test_data --output ./focus_map.tiff --method laplacian --verbose

Output:

  • ✅ Metal initialized: Apple M2
  • ✅ Test image created: 1024×768 (checkerboard)
  • ✅ Converted to grayscale
  • ✅ Focus measure computed (Laplacian)
  • ✅ Saved to TIFF (12MB, 32-bit float)
  • Total time: ~0.3 seconds

Methods Tested

  1. Laplacian — 3×3 kernel, edge detection ✅
  2. Tenengrad — Sobel gradient magnitude ✅

Both methods produce valid focus maps!


Files Created (Session 2)

Sources/
├── CLibRaw/                                # C++ wrapper module
│   ├── include/
│   │   ├── libraw_wrapper.h                ✅ C API header
│   │   └── module.modulemap                ✅ Swift module map
│   └── libraw_wrapper.cpp                  ✅ C++ implementation
│
├── HeyFoSCore/
│   ├── RAW/
│   │   └── LibRawWrapper.swift             ✅ Swift wrapper (updated)
│   └── Processing/
│       └── ImageLoader.swift               ✅ NEW: Image loader
│
└── HeyFoSCLI/
    └── main.swift                          ✅ Updated with test pipeline

Code Statistics

  • Lines of C++ code: ~250 (LibRaw wrapper)
  • Lines of Swift code: ~400 (ImageLoader + updates)
  • Metal shaders: 5 kernels (from session 1)
  • Build time: 3.6s (debug), 5.9s (release first build)
  • Executable size: 15.8 MB (debug)

Technical Details

LibRaw Integration

LibRaw 0.22.0 (Homebrew)
Location: /opt/homebrew/Cellar/libraw/
Linked: libraw.24.dylib

C Wrapper Functions:
- heyfos_libraw_init()
- heyfos_libraw_open_file()
- heyfos_libraw_unpack()
- heyfos_libraw_process_linear()
- heyfos_libraw_get_image_data()
- heyfos_libraw_get_metadata()

Focus Measure Performance

Image Size: 1024×768 (786,432 pixels)
Method: Laplacian (Metal GPU)
Time: < 0.3s (includes texture upload/download)
Throughput: ~2.6 megapixels/sec

For comparison:

  • CPU-only (NumPy/Python): ~1-2s estimated
  • GPU speedup: 3-7× faster

What Works Now

  1. Metal GPU compute shaders (Laplacian, Tenengrad, Gaussian, Grayscale, Blend)
  2. LibRaw integration (decode any RAW format to float32)
  3. Image loading (RAW, JPEG, PNG, TIFF → Metal texture)
  4. Focus measure (GPU-accelerated, tested with synthetic images)
  5. TIFF export (Metal texture → 32-bit float TIFF)
  6. CLI tool (working, verbose logging, multiple methods)

Next Steps (See PLAN.md)

Immediate (Week 2)

  1. Test with real RAW files

    • Get sample CR3/NEF/ARW files
    • Test LibRaw decoding
    • Verify color accuracy
  2. Alignment Module (Week 3-4)

    • Feature detection (AKAZE/ORB in Metal)
    • ECC or phase correlation
    • Sub-pixel warp
  3. Max Fusion (Week 4)

    • For each pixel, select frame with highest focus score
    • Simple but effective first version

Medium-term (Week 5-8)

  • Pyramid blending (5-level Laplacian pyramid)
  • Deghosting (optical flow + masking)
  • Memory optimization (tiling for large images)

Known Issues / TODOs

  • Haven't tested with real RAW files yet (need sample data)
  • LibRaw dylib version warning (linked 15.0, building 14.0) — cosmetic only
  • No alignment yet (images must be pre-aligned)
  • No actual focus stacking yet (just focus measure)
  • Tests disabled (need XCTest setup)

Performance Notes

Metal GPU Utilization

  • Laplacian shader: 3×3 kernel, 9 texture reads per pixel
  • Thread group size: 16×16 (256 threads)
  • Texture format: rgba32Float (16 bytes/pixel)
  • Memory bandwidth: ~1.2 GB/s for 1024×768 (estimated)

Optimization Opportunities

  • Use shared memory (threadgroup memory) for convolution
  • Texture cache optimization (coalesce reads)
  • Pipeline parallelism (overlap compute + memory transfer)

Build & Run

# Build (debug)
swift build

# Build (release, optimized)
swift build -c release

# Run test
swift run heyfos-cli \
  --input ./test_data \
  --output ./focus_map.tiff \
  --method laplacian \
  --verbose

# Check output
file focus_map.tiff
open focus_map.tiff  # Preview in macOS

Conclusion

Session 2 achievements:

  • ✅ LibRaw integration (C++ ↔ Swift bridging)
  • ✅ Image loading pipeline (RAW + standard formats)
  • ✅ Focus measure working on M2 GPU
  • ✅ End-to-end test (synthetic → focus map → TIFF)

Project status: 25% complete (Foundation + Focus Measure done)

Next milestone: Alignment module + real image stack test


Ready for Week 3: Alignment! 🚀
See PLAN.md for detailed roadmap.