lifelog-compression

Opinionated sparse visual extraction for lifelog-style video before upload.

The current goal is not to preserve smooth playback-quality video. The goal is to preserve useful visual information for downstream memory, search, coverage, and analysis workflows while making local preprocessing and upload much faster.

This repo is a reusable utility that can be embedded into or called from other programs before upload.

The tool is intentionally opinionated:

one input video file
one output visual bundle
preserve intrinsic media metadata from that video
very few public knobs

Current default recipe

The strongest practical default found so far is:

sample at 1 fps
extract images, not low-fps video
use Apple-native AVAssetImageGenerator on macOS
use requested time tolerance of +/- 0.5s
scale to fit inside 1920x1080
pad with black bars to a fixed 16:9 canvas
write JPEG frames
store a manifest of requested timestamps and actual timestamps

This preserves a regular sparse timeline while staying very fast on Apple Silicon.

Current package shape

The public surface is intentionally tiny:

one input video file
one output bundle folder
one opinionated extraction recipe

Commands:

cargo run -- spec
cargo run -- extract /path/to/video.mp4 /path/to/output-bundle
cargo run -- batch /path/to/video-folder /path/to/output-root

On macOS, the Rust crate compiles and invokes a tiny native Swift helper built on AVAssetImageGenerator. Rust owns the package shape and bundle emission; the native helper owns the actual frame extraction.

Library entry points:

extract_to_dir(input, output_dir)
extract_directory_to_dir(input_root, output_root)
load_bundle_metadata(bundle_dir)
load_manifest(bundle_dir)

Each bundle also keeps raw source metadata artifacts alongside the extracted frames:

metadata/source-metadata.json
metadata/source-probe.json
metadata/source-mdls.txt
metadata/source-xattrs.txt

Main findings

1. Sparse image extraction beats low-fps video transcoding

For Guardian-style use cases, sparse visual checkpoints are more useful than preserving a conventional video container.

That means the canonical derivative should be:

timestamped JPEG frames
plus metadata / manifest

instead of:

HEVC or H.264 low-fps video

2. Apple-native extraction is the current winner

Using a compiled native macOS utility built on AVAssetImageGenerator was dramatically faster than the earlier ffmpeg-based pipelines for this use case.

Important note:

early slow results were distorted by running the benchmark through the swift interpreter
once compiled as a native binary, the same extraction approach became much faster

3. Fixed output shape helps

The current preferred output shape is:

fixed 1920x1080
scale-to-fit
black padding bars instead of cropping

Reasons:

preserves all source pixels
makes downstream backend processing simpler
avoids source-specific layout branching
keeps the representation tool-friendly

4. Proxy media is a major bonus, but not required

Some cameras provide low-resolution companion media such as .lrv or similar preview files.

Those are excellent acceleration paths when available, but the core method should still work well on originals so the system is not dependent on vendor-specific support.

Benchmarks

These are the most important benchmark results so far.

Light original clip

Source:

1920x1440
24 fps
HEVC

Compiled Apple-native sparse extraction:

1080p, +/-0.5s: about 27.0x realtime
720p, +/-0.5s: about 31.0x realtime
540p, +/-0.5s: about 36.0x realtime

Heavy original clip

Source:

2704x2028
120 fps
HEVC

Compiled Apple-native sparse extraction:

1080p, +/-0.5s: about 34.9x realtime
720p, +/-0.5s: about 44.0x realtime
540p, +/-0.5s: about 53.2x realtime

For the heavy clip, timestamp drift remained small:

max absolute error: about 0.112s
mean absolute error: about 0.056s

Proxy result

On a matched proxy / preview file:

Apple-native extraction at 1080p, +/-0.5s: about 20.0x realtime

Proxy-first is still valuable, but the main discovery is that the general-purpose original-file path is already fast enough.

What did not win

Low-fps HEVC transcoding

Re-encoding to low-fps video was not the best representation or the best speed path for this problem.

ffmpeg exact-ish decode/filter pipelines

These were often much slower than the compiled Apple-native sparse extraction path.

Forward-only tolerance as the default

Testing 0 / +1.0s and 0 / +1.5s tolerance windows showed:

faster extraction on some light clips
but too many duplicated or collapsed actual timestamps

So the best current default remains:

symmetric +/- 0.5s

video-sampler as the main fast path

video-sampler was useful as an exploratory reference, especially for keyframe/dedup ideas, but it was not the raw speed winner for the core sparse extraction task.

Recommended product behavior

Current recommendation:

Run a quick local probe on each source file.
If a trusted preview / proxy companion exists, optionally use it.
Otherwise, use the general-purpose Apple-native sparse extraction path on the original.
Produce:
- padded 1920x1080 JPEG frames
- manifest with requested and actual timestamps
Upload the sparse visual bundle rather than a recompressed video derivative.

The embedding app or service can decide how to combine or upload many bundles, but that is outside the core tool itself.

Current implementation checks

Closed-loop validation against source is now in the repo via:

scripts/validate_against_source.py

Current native-package validation:

light clip fidelity:
- PSNR mean: 30.37 dB
- SSIM mean: 0.9803
heavy GoPro fidelity, first 30 frames:
- PSNR mean: 33.83 dB
- SSIM mean: 0.9747

Current release-binary extraction timings:

light clip (15s source): 1.23s wall time, about 12.2x realtime
heavy GoPro clip (112.7s source): 5.44s wall time, about 20.7x realtime

Future work

add a small library API on top of the current CLI path
benchmark folder-scale extraction on larger real datasets
decide whether 1080p should remain the default or whether 720p is the better product choice
optionally add proxy discovery rules for common camera ecosystems

Related notes

Detailed notes live in docs/technical-findings.md. The current real-workload benchmark lives in docs/benchmark-2026-03-30-real-workload.md.

Format spec

The first concrete interchange spec lives in docs/visual-bundle-v1.md.

Package shape

The repo includes a Rust crate plus CLI:

cargo run -- spec

Current public commands are intentionally minimal:

extract
spec
help

The CLI should stay small rather than becoming highly configurable.

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
docs		docs
scripts		scripts
src		src
swift		swift
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

lifelog-compression

Current default recipe

Current package shape

Main findings

1. Sparse image extraction beats low-fps video transcoding

2. Apple-native extraction is the current winner

3. Fixed output shape helps

4. Proxy media is a major bonus, but not required

Benchmarks

Light original clip

Heavy original clip

Proxy result

What did not win

Low-fps HEVC transcoding

ffmpeg exact-ish decode/filter pipelines

Forward-only tolerance as the default

video-sampler as the main fast path

Recommended product behavior

Current implementation checks

Future work

Related notes

Format spec

Package shape

About

Uh oh!

Releases 3

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

lifelog-compression

Current default recipe

Current package shape

Main findings

1. Sparse image extraction beats low-fps video transcoding

2. Apple-native extraction is the current winner

3. Fixed output shape helps

4. Proxy media is a major bonus, but not required

Benchmarks

Light original clip

Heavy original clip

Proxy result

What did not win

Low-fps HEVC transcoding

ffmpeg exact-ish decode/filter pipelines

Forward-only tolerance as the default

video-sampler as the main fast path

Recommended product behavior

Current implementation checks

Future work

Related notes

Format spec

Package shape

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages